Title
Integrated Memory Controllers With Parallel Coherence Streams
Keywords
Coherence bandwidth; Directory protocol; Distributed shared memory multiprocessor; Integrated memory controller; Multiple coherence controllers
Abstract
Previous work in scalable hardware distributed shared memory (DSM) multiprocessors has established the critical and dominant role that protocol processing bandwidth (or its inverse, occupancy) plays in determining overall performance in architectures with standalone memory/coherence controllers. However, with recent architectural trends toward integrated (on-chip) memory controllers and the well-known fact that processor frequency is increasing more rapidly than memory systems', we must ask whether parallel coherence processing engines (either multiple integrated protocol processors/cores or multiple protocol threads) are needed in DSM machines constructed from modern processor architectures, and if so, when. We construct a useful analytical model to give the designer insight into when parallel coherence streams will improve performance and verify our model via detailed simulation on 64-threaded microbenchmarks and parallel applications, and single-node multiprogrammed workloads. Surprisingly, and contrary to related work, we find that in these architectures adding a second coherence engine has almost no impact on performance. Further, for less-tuned applications that suffer from hot-spots (contentious requests to the same memory line), additional engines offer no benefit whatever. Even with double the memory bandwidth (or channels) an additional coherence processing stream yields only slight performance improvement. Only for a special class of DSM machines employing directory-less broadcast protocols over unordered interconnects does parallel "snoop" processing offer reasonable performance improvement for communication-intensive applications. Overall, given the architectural trends, this is good news for DSM designers that want to minimize the resources necessary (protocol threads or integrated protocol processor cores for maintaining inter-node coherence, respectively) to create SMTp-based or multi-CMP-based scalable directory-based DSM machines. © 2007 IEEE.
Publication Date
8-1-2007
Publication Title
IEEE Transactions on Parallel and Distributed Systems
Volume
18
Issue
8
Pages
1159-1173
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/TPDS.2007.1044
Copyright Status
Unknown
Scopus ID
34548275937 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/34548275937
STARS Citation
Chaudhuri, Mainak and Heinrich, Mark, "Integrated Memory Controllers With Parallel Coherence Streams" (2007). Scopus Export 2000s. 6444.
https://stars.library.ucf.edu/scopus2000/6444