Title

Nohaa: A Novel Framework For Hpc Analytics Over Windows Azure

Keywords

Azure; Co-located computation and storage; Data-intensive; Hadoop; HDFS; HPC analytics; MapReduce

Abstract

HPC analytics has become increasingly vital to analyze the large volumes of data produced by sophisticated computing instruments. Meanwhile, with the successful development of cloud computing, more and more scientists are devoted to deploy HPC analytics in the ever-popular clouds, which poses new challenges mainly caused by different storage architectures, resource management mechanisms and programming APIs. Firstly, there exists a "data semantics" gap between the way data are stored by Cloud platform and the way data will be accessed by the HPC Analytics. Secondly, data are mostly distributed across data nodes for in-house data-intensive clusters to achieve colocated computation and storage, however, it is challenging for the public clouds to mimic because their data are stored centrally. In this paper, we develop a new HPC analytics framework called NOHAA, to provide 1) a semantics-aware intelligent data upload interface and 2) a locality-aware hierarchical storage system in support of co-located computation and storage on Windows Azure. Our extensive real world experiments show that NOHAA significantly reduces the average data access time by up to 85% and accelerates the HPC analytics execution time by a factor of 2 to 7. © 2012 IEEE.

Publication Date

12-1-2012

Publication Title

Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS

Number of Pages

448-455

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/ICPADS.2012.68

Socpus ID

84874051909 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/84874051909

This document is currently not available here.

Share

COinS