IJARP SJIF(2018): 4.908

International Journal of Advanced Research and Publications

Memory-Based Multi-Processing Method For Big Data Computation

Volume 3 - Issue 3, March 2019 Edition

Author(s)
Youssef Bassil
Keywords
Big-Data, Multi-processing, Shared Memory
Abstract
The evolution of the Internet and computer applications has generated a colossal amount of data. These data, referred to as Big Data, consist of huge-volume, high-velocity, and variable datasets that need to be managed at the right speed and within the right time frame to allow real-time processing and analysis. Several Big Data solutions have been developed; however, they are all based on distributed computing, which can sometimes be expensive to build, manage, troubleshoot, and secure. This paper proposes a novel method for processing Big Data using a memory-based, multi-processing, one-server architecture. It is memory-based because data are loaded into memory before processing starts. It is multi-processing because it leverages parallel programming, using shared memory and multiple threads running concurrently over several CPUs. It is one-server because it requires only a single server operating in a non-distributed computing environment. The foremost advantages of the proposed method are high performance, low cost, and ease of management. In the experiments conducted, the proposed method outperformed conventional methods currently on the market. Further research can extend the proposed method to support message passing between its processes, using remote procedure calls among other techniques.
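The three properties named in the abstract (data loaded into memory first, worker threads sharing that memory, a single machine) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `load_into_memory`, `partial_sum`, and `parallel_sum`, the summation workload, and the chunking scheme are all assumptions chosen for illustration. Note also that in CPython the global interpreter lock keeps pure-Python threads from running CPU-bound code in parallel, so the sketch shows the structure of the approach rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def load_into_memory(n):
    """Hypothetical load phase: build the whole dataset in RAM up front."""
    return list(range(n))


def partial_sum(data, start, stop):
    """Worker: reads its slice of the shared in-memory dataset without copying it."""
    total = 0
    for i in range(start, stop):
        total += data[i]
    return total


def parallel_sum(data, workers=None):
    """Split the index range into contiguous chunks and sum them in worker threads."""
    workers = workers or os.cpu_count() or 1
    n = len(data)
    chunk = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every thread receives a reference to the same list (shared memory).
        futures = [pool.submit(partial_sum, data, s, e) for s, e in bounds]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    dataset = load_into_memory(1_000_000)  # assumed dataset size
    print(parallel_sum(dataset, workers=4))
```

Each worker only reads its own index range, so no locking is needed in this sketch; a workload that writes to shared state would need synchronization.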