A Survey Of Semantics-Aware Performance Optimization For Data-Intensive Computing
Keywords
Big Data; Compiler-based; Data-Intensive Computing; Hadoop; MapReduce; Performance Optimization; Semantics-Aware; Spark
Abstract
We are living in the era of Big Data and witnessing an explosion of data. Given the limitations of CPU and I/O in a single computer, the mainstream approach to scalability is to distribute computations among a large number of processing nodes in a cluster or cloud. This paradigm gives rise to the term data-intensive computing, which denotes a data-parallel approach to processing massive volumes of data. Through the efforts of different disciplines, several promising programming models and platforms have been proposed for data-intensive computing, such as MapReduce, Hadoop, Apache Spark, and Dryad. Even though a large body of research has been proposed to improve the overall performance of these platforms, there is still a gap between actual performance demand and the capability of current commodity systems. This paper aims to provide a comprehensive understanding of current semantics-aware approaches to improving the performance of data-intensive computing. We first introduce common characteristics and paradigm shifts in the evolution of data-intensive computing, as well as contemporary programming models and technologies. We then propose four kinds of performance defects and survey state-of-the-art semantics-aware techniques. Finally, we discuss the research challenges and opportunities in the field of semantics-aware performance optimization for data-intensive computing.
Publication Date
3-29-2018
Publication Title
Proceedings - 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing, 2017 IEEE 15th International Conference on Pervasive Intelligence and Computing, 2017 IEEE 3rd International Conference on Big Data Intelligence and Computing and 2017 IEEE Cyber Science and Technology Congress, DASC-PICom-DataCom-CyberSciTec 2017
Volume
2018-January
Page Range
81-88
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/DASC-PICom-DataCom-CyberSciTec.2017.28
Copyright Status
Unknown
Scopus ID
85037993166 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85037993166
STARS Citation
Rao, Bingbing and Wang, Liqiang, "A Survey Of Semantics-Aware Performance Optimization For Data-Intensive Computing" (2018). Scopus Export 2015-2019. 10568.
https://stars.library.ucf.edu/scopus2015/10568