An Improved Replica Placement Policy For Hadoop Distributed File System Running On Cloud Platforms
Keywords
Cloud Computing; Data-Intensive Computing; Hadoop; Hadoop Distributed File System; Load Balance; MapReduce; Replica Placement
Abstract
Load balancing is a crucial issue for data-intensive computing on cloud platforms, because a load-balanced cluster can significantly reduce the completion time of data-intensive jobs. In this paper, we present an improved replica placement policy for the Hadoop Distributed File System (HDFS), designed specifically for heterogeneous clusters. The default HDFS replica placement policy cannot generate a balanced replica assignment, and hence has to rely on a load-balancing utility to even out the load among cluster nodes. In contrast, our proposed policy generates a perfectly even replica assignment and achieves load balance among cluster nodes in any heterogeneous or homogeneous environment without running the load-balancing utility.
Publication Date
7-20-2017
Publication Title
Proceedings - 4th IEEE International Conference on Cyber Security and Cloud Computing, CSCloud 2017 and 3rd IEEE International Conference of Scalable and Smart Cloud, SSC 2017
Pages
270-275
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/CSCloud.2017.65
Copyright Status
Unknown
Scopus ID
85028632451 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85028632451
STARS Citation
Dai, Wei; Ibrahim, Ibrahim; and Bassiouni, Mostafa, "An Improved Replica Placement Policy For Hadoop Distributed File System Running On Cloud Platforms" (2017). Scopus Export 2015-2019. 7114.
https://stars.library.ucf.edu/scopus2015/7114