Vector Microprocessors
- In Hot Chips VII
, 1998
- Cited by 62 (4 self)
Vector Microprocessors, by Krste Asanovic. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor John Wawrzynek, Chair. Most previous research into vector architectures has concentrated on supercomputing applications and small enhancements to existing vector supercomputer implementations. This thesis expands the body of vector research by examining designs appropriate for single-chip full-custom vector microprocessor implementations targeting a much broader range of applications. I present the design, implementation, and evaluation of T0 (Torrent-0): the first single-chip vector microprocessor. T0 is a compact but highly parallel processor that can sustain over 24 operations per cycle while issuing only a single 32-bit instruction per cycle. T0 demonstrates that vector architectures are well suited to full-custom VLSI implementation and that they perform well on many multimedia and human-machine interface tasks. The remainder of the thesis contains ...
Compiler-Controlled Memory
- In Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems
, 1998
- Cited by 46 (0 self)
Optimizations aimed at reducing the impact of memory operations on execution speed have long concentrated on improving cache performance. These efforts achieve a reasonable level of success. The primary limit on the compiler’s ability to improve memory behavior is its imperfect knowledge about the run-time behavior of the program. The compiler cannot completely predict run-time access patterns. There is an exception to this rule. During the register allocation phase, the compiler often must insert substantial amounts of spill code; that is, instructions that move values from registers to memory and back again. Because the compiler itself inserts these memory instructions, it has more knowledge about them than other memory operations in the program. Spill-code operations are disjoint from the memory manipulations required by the semantics of the program being compiled, and, indeed, the two can interfere in the cache. This paper proposes a hardware solution to the problem of increased spill costs: a small compiler-controlled memory (CCM) to hold spilled values. This small random-access memory can (and should) be placed in a distinct address space from the main memory hierarchy. The compiler can target spill instructions to use the CCM, moving most compiler-inserted memory traffic out of the pathway to main memory and eliminating any impact that those spill instructions would have on the state of the main memory hierarchy. Such memories already exist on some DSP microprocessors. Our techniques can be applied directly on those chips. This paper presents two compiler-based methods to exploit such a memory, along with experimental results showing that speedups from using CCM may be sizable. It shows that using the register allocator’s coloring paradigm to assign spilled values to memory can greatly reduce the amount of memory required by a program.
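The idea of coloring spilled values into a small memory can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes each spilled value has a single contiguous live range given as a half-open interval, and it swaps in a simple greedy interval coloring in place of the allocator's interference-graph coloring. The function name `assign_spill_slots` and the example values are hypothetical.

```python
def assign_spill_slots(live_ranges):
    """Assign spilled values to memory slots so that values whose live
    ranges do not overlap may share a slot.

    live_ranges maps a value name to a half-open interval (start, end).
    This is an illustrative simplification, not the method from the paper.
    """
    slot_free_at = []  # slot_free_at[s] = end of the last range placed in slot s
    assignment = {}
    # Visit values in order of increasing start point (ties broken by end).
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1]):
        for slot, free_at in enumerate(slot_free_at):
            if free_at <= start:  # the slot's previous occupant is already dead
                slot_free_at[slot] = end
                assignment[name] = slot
                break
        else:
            # No existing slot is free, so open a new one.
            slot_free_at.append(end)
            assignment[name] = len(slot_free_at) - 1
    return assignment


# Hypothetical spilled values and live ranges.
spills = {"a": (0, 4), "b": (2, 6), "c": (4, 8), "d": (6, 10)}
slots = assign_spill_slots(spills)
print(slots, "slots used:", len(set(slots.values())))
```

In this example four spilled values fit in two slots, because `a` and `c` (and likewise `b` and `d`) are never live at the same time. That sharing is what lets the memory that holds spills stay small.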
Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors
, 1996
- Cited by 35 (2 self)
The memory bandwidth demands of modern microprocessors require the use of a multi-ported cache to achieve peak performance. However, multi-ported caches are costly to implement. In this paper we propose techniques for improving the bandwidth of a single cache port by using additional buffering in the processor, and by taking maximum advantage of a wider cache port. We evaluate these techniques using realistic applications that include the operating system. Our techniques using a single-ported cache achieve 91% of the performance of a dual-ported cache.
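A toy model can show why a wider port may reduce pressure on a single cache port. It is an assumption for illustration only, not the mechanism evaluated in the paper: references that fall within the same aligned block of `port_bytes` bytes are treated as served by one port access. The function name `port_accesses` and the addresses are hypothetical.

```python
def port_accesses(addresses, port_bytes):
    """Count single-port accesses needed for a group of byte addresses,
    assuming references within the same aligned port_bytes-wide block
    can be satisfied by one access (an illustrative assumption)."""
    if port_bytes <= 0:
        raise ValueError("port_bytes must be positive")
    # Each distinct aligned block costs one access on the port.
    return len({addr // port_bytes for addr in addresses})


# Hypothetical pending loads.
pending = [0, 8, 16, 64]
print(port_accesses(pending, 8))   # narrow port: every load is a separate access
print(port_accesses(pending, 32))  # wider port: the first three loads share a block
```

Under this model the four loads need four accesses through an 8-byte port but only two through a 32-byte port.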
Strategic Directions in Computer Architecture
- ACM Computing Surveys
, 1996
- Cited by 4 (0 self)
Looking back on the last 30 years, we have seen the remarkable developments in semiconductor technology enabling the implementation of ideas that were previously
Reducing the Impact of Spill Code
- Cited by 1 (0 self)
This memory would be as fast as cache memory, but it would be under the control of the compiler rather than the hardware. We use the results from the memory allocation study to show that this memory space could be quite small, and we present an algorithm that the compiler could employ to utilize this space. We also present experimental results that suggest that this method would have a significant impact on a program's runtime
Complementary GaAs Technology for a GHz Microprocessor
- Proceedings of the 1996 GaAs IC Symposium
, 1996
- Cited by 1 (1 self)
A DARPA-funded project at the University of Michigan has as a goal the development of technologies and tools needed to implement microprocessors that can be clocked at GHz speeds. A Complementary GaAs HIGFET technology from the Motorola CS-1 facility (CGaAs) is the target semiconductor process. While this technology is immature, it is years ahead of CMOS in terms of fast gate delay at low power supply voltages. A major focus of this work is advanced packaging, which supports partitioning of the design into multiple integrated circuits, each having an integration level that should be achievable in CGaAs. This paper touches on the major aspects of the project: process technology, circuit design, packaging, architecture, CAD tools and software, with an emphasis on application of the CGaAs technology. I. INTRODUCTION As the 25th anniversary of the development of the microprocessor is observed in 1996, there is no denying the impact microprocessors have had on both the scientific world ...
Exploring Design Alternatives for a Highly-Integrated, Wide-Issue, Microprocessor-Based System
We present a methodology for comprehensively evaluating architectural and technological alternatives of the processor, cache hierarchy, system interconnect, and main memory subsystems. We use the methodology to explore the design of an 8-way superscalar microprocessor implemented in 0.3 micron technology and employed in a high-performance workstation. We discuss the cache hierarchy design challenges encountered with a highly-integrated wide-issue processor, and evaluate new approaches to multi-porting first level data caches and pipelining large on-chip second level caches.
Loop Optimization Techniques On Multi-Issue Architectures
, 1994
CONTENTS: ACKNOWLEDGMENTS; LIST OF TABLES; LIST OF FIGURES; CHAPTER I INTRODUCTION (1 Scheduling, 2 Methodology, 3 Research Contributions, 4 Thesis Organization); CHAPTER II INSTRUCTION

