Title
Memory latency in distributed shared-memory multiprocessors
Abstract
Analytical models were developed and memory-latency simulations were performed for Uniform Memory Access (UMA), Non-Uniform Memory Access (NUMA), Local-Remote-Global (LRG), and Replicated Concurrent-Read (RCR) architectures, covering hit rates from 0.1 to 0.9 in steps of 0.1, memory access times of 10 ns to 100 ns, read/write access proportions from 0.01 to 0.1, and block sizes of 8 to 64 words. The RCR architecture, based on redundant inexpensive DRAM, is shown to perform favorably relative to UMA and NUMA architectures for application and system parameters in the range evaluated. RCR outperforms LRG architectures when the processor cache hit rate exceeds 80% and the replicated memory hit rate exceeds 25%. Including a small replicated memory at each processor significantly reduces expected access time, since all replicated-memory READ hits become independent of global traffic. For configurations of up to 32 processors, results show that latency is further reduced by distinguishing, in burst-mode transfers, between isolated memory accesses and those that fall incrementally outside the working set.
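The abstract does not reproduce the paper's analytical models. As a rough illustration of how a replicated local memory can lower expected access time, the sketch below uses a generic three-level expected-latency formula (processor cache, then local replicated memory, then global memory). This is an assumption and not the authors' model: the function names and the 10/50/100 ns timings are hypothetical, and the sketch ignores read/write proportions, block size, and bus contention.

```python
def expected_access_time(h_cache, h_repl, t_cache, t_repl, t_global):
    """Expected latency (ns) for a lookup that tries the processor cache,
    then local replicated memory, then global memory.

    Generic textbook-style formula, assumed for illustration only.
    """
    for h in (h_cache, h_repl):
        if not 0.0 <= h <= 1.0:
            raise ValueError("hit rates must lie in [0, 1]")
    # Cost paid on a cache miss: served locally with probability h_repl,
    # otherwise from global memory.
    miss_cost = h_repl * t_repl + (1.0 - h_repl) * t_global
    return h_cache * t_cache + (1.0 - h_cache) * miss_cost


def sweep(t_cache=10.0, t_repl=50.0, t_global=100.0, h_repl=0.25):
    """Sweep the cache hit rate from 0.1 to 0.9 in steps of 0.1.

    The default timings are hypothetical values chosen from the
    10 ns to 100 ns range mentioned in the abstract.
    """
    rates = [round(0.1 * k, 1) for k in range(1, 10)]
    return [
        (h, expected_access_time(h, h_repl, t_cache, t_repl, t_global))
        for h in rates
    ]


if __name__ == "__main__":
    for h, t in sweep():
        print(f"cache hit rate {h:.1f}: expected access time {t:.2f} ns")
```

Under this simplified formula, every cache miss that hits in replicated memory avoids the global memory cost, which mirrors the abstract's qualitative point that replicated-memory READ hits are independent of global traffic.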
Publication Date
1-1-1998
Publication Title
Conference Proceedings - IEEE SOUTHEASTCON
Number of Pages
134-137
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
Copyright Status
Unknown
Scopus ID
0031698735 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/0031698735
STARS Citation
Motlagh, Bahman S. and DeMara, Ronald F., "Memory latency in distributed shared-memory multiprocessors" (1998). Scopus Export 1990s. 3401.
https://stars.library.ucf.edu/scopus1990/3401