Deister: A Light-Weight Autonomous Block Management In Data-Intensive File Systems Using Deterministic Declustering Distribution
Abstract
During the last few decades, Data-intensive File Systems (DiFS), such as Google File System (GFS) and Hadoop Distributed File System (HDFS), have become the key storage architectures for big data processing. These storage systems usually divide files into fixed-size blocks (or chunks). Each block is replicated (usually three-way) and distributed pseudo-randomly across the cluster. The master node (namenode) uses a huge table to record the locations of each block and its replicas. However, as data size increases, the block location table and its maintenance can occupy more than half of the memory space and 30% of the processing capacity of the master node, which severely limits the master node's scalability and performance. We argue that physical data distribution and maintenance should be separated from metadata management and performed autonomously by each storage node. In this paper, we propose Deister, a novel block management scheme built on an invertible deterministic declustering distribution method called Intersected Shifted Declustering (ISD). Deister is amenable to current research on scaling namespace management in the master node. In Deister, the huge table that maintains block locations in the master node is eliminated, and the block-node mapping is maintained autonomously on each data node. Results show that, compared with the default HDFS configuration, Deister achieves identical performance while saving about half of the RAM space in the master node, and it is expected to scale to double the size of the current single-namenode HDFS cluster, pushing the scalability bottleneck of the master node back to namespace management.
Publication Date
1-1-2015
Publication Title
Proceedings - 2015 IEEE International Conference on Smart City, SmartCity 2015, Held Jointly with 8th IEEE International Conference on Social Computing and Networking, SocialCom 2015, 5th IEEE International Conference on Sustainable Computing and Communications, SustainCom 2015, 2015 International Conference on Big Data Intelligence and Computing, DataCom 2015, 5th International Symposium on Cloud and Service Computing, SC2 2015
Pages
598-604
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/SmartCity.2015.135
Copyright Status
Unknown
Scopus ID
84973883037 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84973883037
STARS Citation
Zhang, Xuhong; Yin, Jiangling; Wang, Jun; Wang, Ruijun; and Huang, Dan, "Deister: A Light-Weight Autonomous Block Management In Data-Intensive File Systems Using Deterministic Declustering Distribution" (2015). Scopus Export 2015-2019. 2019.
https://stars.library.ucf.edu/scopus2015/2019