Unio: A Unified I/O System Framework For Hybrid Scientific Workflow

Keywords

Data migration; Data-intensive; HDFS; HPC; MPI; Scientific workflow

Abstract

Recent years have seen an increasing number of hybrid scientific applications, which often consist of one HPC simulation program along with its corresponding data analytics programs. Unfortunately, current computing platform settings do not accommodate this emerging workflow well, mainly because HPC simulation programs store their output data in a dedicated storage cluster equipped with a parallel file system (PFS). To perform analytics on the data generated by a simulation, the data has to be migrated from the storage cluster to the compute cluster. This migration can introduce severe delays, especially given ever-increasing data sizes. While scale-up supercomputers equipped with a dedicated PFS storage cluster still represent mainstream HPC, a growing number of scale-out small-to-medium-sized HPC clusters have been provisioned to support hybrid scientific workflow applications in fast-growing cloud computing infrastructures such as Amazon cluster compute instances. Unlike in the traditional supercomputer setting, the limited network bandwidth in scale-out HPC clusters makes data migration prohibitively expensive. To attack this problem, we develop a Unified I/O System Framework (UNIO) that avoids such migration overhead for scale-out small-to-medium-sized HPC clusters. Our main idea is to enable both HPC simulation programs and analytics programs to run atop one unified file system, e.g., a data-intensive file system (DIFS for short). In UNIO, an I/O middleware component allows original HPC simulation programs to execute direct I/O operations over DIFS without any porting effort, while an I/O scheduler dynamically smooths out disk write and read traffic for both simulation and analysis programs. By experimenting with a real-world scientific workflow on a 46-node UNIO prototype, we found that UNIO achieves read/write I/O performance comparable to that of small-to-medium-sized HPC clusters equipped with a parallel file system. More importantly, since UNIO completely avoids the most expensive data movement overhead, it achieves up to 3x speedups for hybrid scientific workflow applications compared with current solutions.

Publication Date

1-1-2015

Publication Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

9106

Number of Pages

99-114

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1007/978-3-319-28430-9_8

Scopus ID

84955254098 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/84955254098
