An Improved Straggler Identification Scheme For Data-Intensive Computing On Cloud Platforms

Keywords

Cloud Computing; Data-Intensive Computing; Parallel and Distributed Processing; Speculative Execution; Straggler Identification; Tukey's method

Abstract

One of the challenges faced by data-intensive computing is the problem of stragglers, which can significantly increase job completion time. Various proactive and reactive straggler mitigation techniques have been developed to address the problem. The straggler identification scheme is a crucial part of these techniques, because job completion time improves meaningfully only when stragglers are detected both correctly and early enough. Although the classical standard deviation method is a widely adopted straggler identification scheme, it is not an ideal solution due to certain inherent limitations. In this paper, we present Tukey's method, another statistical method for outlier detection, which is more suitable for the identification of stragglers for two reasons. First, it is robust to extreme observations from stragglers. Second, it can identify stragglers and, more importantly, start speculative execution earlier than the standard deviation method. Our extensive simulation results confirm that Tukey's method can substantially outperform the standard deviation method.
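The robustness argument in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that stragglers are judged on task durations, that Tukey's upper fence uses the conventional multiplier of 1.5 on the interquartile range, and that the standard deviation method flags values above the mean plus 1.5 standard deviations. The function names, the multiplier `k`, and the sample durations are hypothetical and chosen only for illustration.

```python
import statistics


def tukey_stragglers(durations, k=1.5):
    """Flag durations above Tukey's upper fence, Q3 + k * IQR.

    k = 1.5 is the conventional multiplier (an assumption here,
    not a value taken from the paper).
    """
    q1, _, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return [d for d in durations if d > upper_fence]


def stddev_stragglers(durations, k=1.5):
    """Flag durations above mean + k * standard deviation.

    Both the mean and the standard deviation are pulled upward by
    extreme values, which can hide milder stragglers.
    """
    mean = statistics.mean(durations)
    threshold = mean + k * statistics.pstdev(durations)
    return [d for d in durations if d > threshold]


# Hypothetical task durations in seconds: eight normal tasks,
# one moderate straggler (30) and one extreme straggler (120).
durations = [10, 11, 12, 11, 10, 12, 11, 13, 30, 120]

print(tukey_stragglers(durations))   # [30, 120]
print(stddev_stragglers(durations))  # [120]
```

In this made-up sample the extreme value of 120 inflates the mean and standard deviation enough that the moderate straggler at 30 falls below the threshold, while the quartile-based fence is unaffected and flags both.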

Publication Date

7-20-2017

Publication Title

Proceedings - 4th IEEE International Conference on Cyber Security and Cloud Computing, CSCloud 2017 and 3rd IEEE International Conference of Scalable and Smart Cloud, SSC 2017

Pages

211-216

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/CSCloud.2017.64

Scopus ID

85028664817 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85028664817
