Title

Dual-Core Execution: Building A Highly Scalable Single-Thread Instruction Window

Abstract

Current integration trends embrace the prosperity of single-chip multi-core processors. Although multi-core processors deliver significantly improved system throughput, single-thread performance is not addressed. In this paper, we propose a new execution paradigm that utilizes multi-cores on a single chip collaboratively to achieve high performance for single-thread memory-intensive workloads while maintaining the flexibility to support multithreaded applications. The proposed execution paradigm, dual-core execution, consists of two superscalar cores (a front and back processor) coupled with a queue. The front processor fetches and preprocesses instruction streams and retires processed instructions into the queue for the back processor to consume. The front processor executes instructions as usual except for cache-missing loads, which produce an invalid value instead of blocking the pipeline. As a result, the front processor runs far ahead to warm up the data caches and fix branch mispredictions for the back processor. In-flight instructions are distributed in the front processor, the queue, and the back processor, forming a very large instruction window for single-thread out-of-order execution. The proposed architecture incurs only minor hardware changes and does not require any large centralized structures such as large register files, issue queues, load/store queues, or reorder buffers. Experimental results show remarkable latency hiding capabilities of the proposed architecture, even outperforming more complex single-thread processors with much larger instruction windows than the front or back processor. © 2005 IEEE.
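The front/back division of labor described in the abstract can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the paper's design or simulator: the three-field instruction format, the names `front_run`, `back_run`, and `INV`, and the single dictionary standing in for the data caches are all hypothetical. Timing, branch handling, and the real cache hierarchy are not modeled.

```python
from collections import deque

INV = None  # hypothetical marker for the "invalid value" of a cache-missing load


def front_run(program, cache, memory):
    """Front core: never blocks on a miss; it marks the result invalid,
    lets the miss warm the cache, and retires every instruction into the queue."""
    regs = {}
    queue = deque()
    for op, dst, src in program:
        if op == "load":
            if src in cache:
                regs[dst] = cache[src]
            else:
                regs[dst] = INV            # produce an invalid value instead of stalling
                cache[src] = memory[src]   # assumption: the miss still fills the cache
        elif op == "addi":
            base, imm = src
            value = regs.get(base, 0)
            regs[dst] = INV if value is INV else value + imm  # invalidity propagates
        queue.append((op, dst, src))       # retire into the queue
    return queue, regs


def back_run(queue, cache, memory):
    """Back core: consumes the queue and computes the architecturally correct values."""
    regs = {}
    misses = 0
    while queue:
        op, dst, src = queue.popleft()
        if op == "load":
            if src not in cache:
                misses += 1
                cache[src] = memory[src]
            regs[dst] = cache[src]
        elif op == "addi":
            base, imm = src
            regs[dst] = regs.get(base, 0) + imm
    return regs, misses


memory = {100: 7, 200: 9}
cache = {}  # assumption: one shared dictionary stands in for the data caches
program = [("load", "r1", 100), ("addi", "r2", ("r1", 5)), ("load", "r3", 200)]

queue, front_regs = front_run(program, cache, memory)
back_regs, misses = back_run(queue, cache, memory)
print(front_regs, back_regs, misses)
```

In this toy run the front core sees only invalid values for the missing loads and their dependents, while the back core finds both addresses already cached and computes the correct results without a miss. That is the latency-hiding effect the abstract describes, reduced to its simplest form.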

Publication Date

12-1-2005

Publication Title

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

Volume

2005

Pages

231-242

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

Scopus ID

33644919336 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/33644919336
