Title
Performance Of Scalable Shared-Memory Architectures
Abstract
Analytical models were developed and simulations of memory latency were performed for Uniform Memory Access (UMA), Non-Uniform Memory Access (NUMA), Local-Remote-Global (LRG), and RCR architectures for hit rates from 0.1 to 0.9 in steps of 0.1, memory access times of 10 to 100 ns, proportions of read/write access from 0.01 to 0.1, and block sizes of 8 to 64 words. The RCR architecture provides favorable performance over UMA and NUMA architectures for all ranges of application and system parameters. RCR outperforms LRG architectures when the hit rates of the processor cache exceed 80% and replicated memory exceeds 25%. Thus, inclusion of a small replicated memory at each processor significantly reduces expected access time, since all replicated memory hits become independent of global traffic. For configurations of up to 32 processors, results show that latency is further reduced by distinguishing burst-mode transfers between isolated memory accesses and those which are incrementally outside the working set.
Publication Date
1-1-2000
Publication Title
Journal of Circuits, Systems and Computers
Volume
10
Issue
1-2
Number of Pages
1-22
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1016/s0218-1266(00)00006-8
Copyright Status
Unknown
Scopus ID
0346738937 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/0346738937
STARS Citation
Motlagh, Bahman S. and DeMara, Ronald F., "Performance Of Scalable Shared-Memory Architectures" (2000). Scopus Export 2000s. 1017.
https://stars.library.ucf.edu/scopus2000/1017