Scalable instruction-level parallelism
- In Proc. Computer Systems: Architectures, Modeling and Simulation, 3rd and 4th Int. Workshops, SAMOS 2004, Samos
, 2004
Cited by 11 (5 self)
Abstract. This paper presents a model for instruction-level distributed computing that allows the implementation of scalable chip multiprocessors. Based on explicit microthreading, the model serves as a replacement for out-of-order instruction issue; the paper defines the model and explores implementation issues. The model results in a fully distributed implementation in which data is distributed to one register file per processor, which is scalable because the number of ports in each register file is constant. The only component with less than ideal scaling properties is the switching network between processors. 1 Some Issues in Current Microprocessor Design Over the last twelve years, Moore's law predicts a 256-fold increase in packing density on a silicon die, with a corresponding 16-fold increase in speed. Whereas we see speed increases better than predicted, the same is not true of system-level concurrency. The history of the PPC processor (see
Multi-threaded microprocessors evolution or revolution
- In Sedukhin Omondo, editor, ACSAC 2003: Advances in Computer Systems Architecture
, 2003
Cited by 8 (7 self)
Abstract. Threading in microprocessors is not new; the earliest threaded processor design was implemented in the late 1970s, and yet only now is it being used in mainstream microprocessor architecture. This paper reviews threaded microprocessors and explains why the more popular option of out-of-order execution has a poor future and is not likely to provide a pathway for future microprocessor scalability. The first mainstream threaded architectures are beginning to emerge, but unfortunately they are based on out-of-order execution. This paper reviews the relevant trends in multi-threaded microprocessor design and looks at one approach in detail, showing how wide instruction issue can be achieved and how it can provide excellent performance, latency tolerance and, above all, scalability with issue width. This model exploits ILP and loop-level parallelism using a vector-like instruction set in a chip multiprocessor. 1 The Forces at Play in ISA Design There are two forces that determine the form and function of microprocessor architecture
Microthreading: a model for distributed instruction-level concurrency
- Parallel Processing Letters, World Scientific Publishing Company
, 2004
Communicated by Kemal Ebcioğlu. This paper analyses the micro-threaded model of concurrency, making comparisons with both data-level and instruction-level concurrency. The model is fine-grained and provides synchronisation in a distributed register file, making it a promising candidate for scalable chip multiprocessors. The micro-threaded model was first proposed in 1996 as a means to tolerate high latencies in data-parallel, distributed-memory multi-processors. This paper explores the model's opportunity to provide the simultaneous issue of instructions required for chip multiprocessors, and discusses scalability with regard to the support structures that implement the model and the communication that supports it. The model supports deterministic distribution of code fragments and dynamic scheduling of instructions from within those fragments. The hardware also recognises different classes of variables from the register specifiers, which allows it to manage locality and optimise communication so that it is both efficient and scalable.

