An Improved Replica Placement Policy For Hadoop Distributed File System Running On Cloud Platforms

Keywords

Cloud Computing; Data-Intensive Computing; Hadoop; Hadoop Distributed File System; Load Balance; MapReduce; Replica Placement

Abstract

Load balancing is a crucial issue for data-intensive computing on cloud platforms, because a load-balanced cluster can significantly improve the completion time of data-intensive jobs. In this paper, we present an improved replica placement policy for the Hadoop Distributed File System (HDFS) that is specifically designed for heterogeneous clusters. The existing HDFS replica placement policy cannot generate a balanced replica assignment, and hence has to rely on a load balance utility to balance the load among cluster nodes. In contrast, our proposed policy generates a perfectly even replica assignment and achieves load balance among cluster nodes in any heterogeneous or homogeneous environment, without running the load balance utility.

Publication Date

7-20-2017

Publication Title

Proceedings - 4th IEEE International Conference on Cyber Security and Cloud Computing, CSCloud 2017 and 3rd IEEE International Conference of Scalable and Smart Cloud, SSC 2017

Number of Pages

270-275

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/CSCloud.2017.65

Scopus ID

85028632451 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85028632451
