Title
Active Memory Techniques for ccNUMA Multiprocessors
Abstract
Our recent work on uniprocessor and single-node multiprocessor (SMP) active memory systems uses address remapping techniques in conjunction with extended cache coherence protocols to improve access locality in processor caches. In this paper, we extend that work and introduce the novel concept of multi-node active memory systems. We present the design of multi-node active memory cache coherence protocols that help reduce remote memory latency and improve the scalability of matrix transpose and parallel reduction on distributed shared memory (DSM) multiprocessors. We evaluate our design on seven applications through execution-driven simulation on small- and medium-scale multiprocessors. On a 32-processor system, an active-memory-optimized matrix transpose attains speedups ranging from 1.53 to 2.01, while parallel reduction achieves speedups ranging from 1.19 to 2.81, over normal parallel executions.
Publication Date
1-1-2003
Publication Title
Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2003
Number of Pages
10-
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/IPDPS.2003.1213085
Copyright Status
Unknown
Scopus ID
56749173130 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/56749173130
STARS Citation
Kim, Daehyun; Chaudhuri, Mainak; and Heinrich, Mark, "Active Memory Techniques for ccNUMA Multiprocessors" (2003). Scopus Export 2000s. 1993.
https://stars.library.ucf.edu/scopus2000/1993