Title

HBA: Distributed Metadata Management for Large Cluster-Based Storage Systems

Keywords

Distributed file systems; Distributed systems; File systems management; Parallel systems; Storage management

Abstract

An efficient and distributed scheme for file mapping or file lookup is critical in decentralizing metadata management within a group of metadata servers. This paper presents a novel technique called HBA (Hierarchical Bloom filter Arrays) to map filenames to the metadata servers holding their metadata. Two levels of probabilistic arrays, namely Bloom filter arrays, with different levels of accuracy, are used on each metadata server. One array, with lower accuracy and representing the distribution of the entire metadata, trades accuracy for significantly reduced memory overhead, while the other array, with higher accuracy, caches partial distribution information and exploits the temporal locality of file access patterns. Both arrays are replicated to all metadata servers to support fast local lookups. We evaluate HBA through extensive trace-driven simulations and an implementation in Linux. Simulation results show our HBA design to be highly effective and efficient in improving the performance and scalability of file systems in clusters with 1,000 to 10,000 nodes (or super-clusters) and with the amount of data at the petabyte scale or higher. Our implementation indicates that HBA can reduce the metadata operation time of a single-metadata-server architecture by a factor of up to 43.9 when the system is configured with 16 metadata servers. © 2008 IEEE.
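
The two-level lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the `lookup` function, the filter sizes, the SHA-256 double-hashing scheme, and the rule "accept a level only when exactly one server's filter matches, otherwise fall through" are all assumptions made for illustration. The sketch models the local copy of both arrays held by one metadata server.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; positions come from double hashing (an assumption)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class BloomFilterArray:
    """One Bloom filter per metadata server."""

    def __init__(self, num_servers, num_bits, num_hashes):
        self.filters = [BloomFilter(num_bits, num_hashes) for _ in range(num_servers)]

    def record(self, filename, server_id):
        self.filters[server_id].add(filename)

    def hits(self, filename):
        # Indices of all servers whose filter claims to hold the filename.
        return [i for i, f in enumerate(self.filters) if filename in f]


def lookup(filename, hot_array, full_array):
    """Try the higher-accuracy array first, then the lower-accuracy one.

    Returns a server index on a unique hit, or None when neither level
    gives a unique answer (a real system would then ask the servers).
    """
    for array in (hot_array, full_array):
        hits = array.hits(filename)
        if len(hits) == 1:
            return hits[0]
    return None


# Hypothetical sizing: more bits per filter for the small "hot" array,
# fewer bits for the array that covers the entire metadata distribution.
hot = BloomFilterArray(num_servers=4, num_bits=1 << 16, num_hashes=6)
full = BloomFilterArray(num_servers=4, num_bits=1 << 12, num_hashes=3)

full.record("/home/a/report.txt", 2)
full.record("/home/b/data.bin", 0)
hot.record("/home/a/report.txt", 2)  # recently accessed, so also cached

home_server = lookup("/home/a/report.txt", hot, full)
```

Because a Bloom filter can report false positives but never false negatives, a level that yields zero or several matching servers is treated here as inconclusive rather than wrong; how such cases are resolved in HBA itself is not stated in the abstract.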

Publication Date

6-1-2008

Publication Title

IEEE Transactions on Parallel and Distributed Systems

Volume

19

Issue

6

Pages

750-763

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/TPDS.2007.70788

Scopus ID

44049089212 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/44049089212

