Improving MapReduce Performance with Progress and Feedback Based Speculative Execution

Keywords

Cloud Computing; Data-Intensive Computing; MapReduce; Parallel and Distributed Processing; Speculative Execution; Straggler Identification

Abstract

Task stragglers dramatically impede parallel job execution in data-intensive computing within cloud datacenters. They arise from the uneven distribution of input data caused by heterogeneous data nodes, resource contention, and network configurations, and they cause delay failures by violating job completion time. Data-intensive computing frameworks such as MapReduce or Hadoop employ a mechanism called speculative execution to deal with stragglers, but its effectiveness is limited because in many cases straggler identification occurs too late in the job lifecycle. Both identifying a straggler and the timing of that identification are very important for straggler mitigation in data-intensive cloud computing. Speculative execution is widely adopted as a straggler identification and mitigation scheme, but it has certain inherent limitations. In this paper, we strive to make Hadoop more efficient in cloud environments. We present the Progress and Feedback based Speculative Execution algorithm (PFSE), a new straggler identification scheme that identifies straggler MapReduce tasks based on feedback information received from completed tasks in addition to the progress of the currently processing task. Our extensive simulation shows that PFSE can outperform dynamic scheduling techniques such as the Self-Learning MapReduce scheduler (SLM) and LATE. PFSE can assist in enhancing straggler identification and mitigation for tolerating late-timing failures within data-intensive cloud computing.
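
The abstract does not give PFSE's formulas, so the following Python sketch is only a generic illustration of the idea it describes: combining a progress-based estimate for a running task with feedback from completed tasks to flag likely stragglers. It is not the authors' algorithm. The names `RunningTask`, `estimate_remaining`, and `find_stragglers`, as well as the `weight` and `slowdown` parameters and their default values, are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class RunningTask:
    task_id: str
    progress: float  # fraction of work done, in [0, 1]
    elapsed: float   # seconds since the task started


def estimate_remaining(task: RunningTask,
                       completed_durations: List[float],
                       weight: float = 0.5) -> Optional[float]:
    """Blend a progress-rate estimate with a feedback estimate (assumed scheme)."""
    progress_est = None
    if task.progress > 0 and task.elapsed > 0:
        rate = task.progress / task.elapsed          # progress per second
        progress_est = (1.0 - task.progress) / rate  # time left at this rate

    feedback_est = None
    if completed_durations:
        # Completed tasks suggest a typical total duration for this job.
        feedback_est = max(mean(completed_durations) - task.elapsed, 0.0)

    if progress_est is None:
        return feedback_est
    if feedback_est is None:
        return progress_est
    return weight * progress_est + (1.0 - weight) * feedback_est


def find_stragglers(tasks: List[RunningTask],
                    completed_durations: List[float],
                    slowdown: float = 1.5) -> List[str]:
    """Flag tasks whose projected total time exceeds slowdown x the mean
    duration of completed tasks. Without feedback, nothing is flagged."""
    if not completed_durations:
        return []
    threshold = slowdown * mean(completed_durations)
    flagged = []
    for task in tasks:
        remaining = estimate_remaining(task, completed_durations)
        if remaining is not None and task.elapsed + remaining > threshold:
            flagged.append(task.task_id)
    return flagged


tasks = [RunningTask("m1", 0.9, 90.0), RunningTask("m2", 0.2, 90.0)]
print(find_stragglers(tasks, [100.0, 110.0, 90.0]))
```

In this sketch, task `m2` is flagged because its slow progress rate projects a total run time well above the completed-task average, while `m1` is on pace and is left alone.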

Publication Date

11-22-2017

Publication Title

Proceedings - 2nd IEEE International Conference on Smart Cloud, SmartCloud 2017

Pages

120-125

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/SmartCloud.2017.25

Scopus ID

85041663778 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85041663778

