Boolean Matching for Full-Custom ECL Gates (1993)

by Robert N Mayo, Herve Touati

Limits of instruction-level parallelism

by David W. Wall, 1991
Cited by 339 (7 self)
research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products. There are two other research laboratories located in Palo Alto, the Network Systems

Shared memory consistency models: A tutorial

by Sarita V. Adve, Kourosh Gharachorloo - IEEE Computer, 1996
Cited by 297 (8 self)
Parallel systems that support the shared memory abstraction are becoming widely accepted in many areas of computing. Writing correct and efficient programs for such systems requires a formal specification of memory semantics, called a memory consistency model. The most intuitive model, sequential consistency, greatly restricts the use of many performance optimizations commonly used by uniprocessor hardware and compiler designers, thereby reducing the benefit of using a multiprocessor. To alleviate this problem, many current multiprocessors support more relaxed consistency models. Unfortunately, the models supported by various systems differ from each other in subtle yet important ways. Furthermore, precisely defining the semantics of each model often leads to complex specifications that are difficult to understand for typical users and builders of computer systems. The purpose of this tutorial paper is to describe issues related to memory consistency models in a way that would be understandable to most computer professionals. We focus on consistency models proposed for hardware-based shared-memory systems. Many of these models are originally specified with an emphasis on the system optimizations they allow. We retain the system-centric emphasis, but use uniform and simple terminology to describe the different models. We also briefly discuss an alternate programmer-centric view that describes the models in terms of program behavior rather than specific system optimizations.
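As an illustration of what sequential consistency permits, the sketch below enumerates every interleaving of a two-thread "store buffering" test and collects the values each thread can read. It is not taken from the tutorial; the program encoding and the names `T0`, `T1`, `interleavings`, `run`, and `sc_outcomes` are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical two-thread test: each thread stores to one location,
# then loads the other. All memory locations start at 0.
T0 = [("store", "x", 1), ("load", "r0", "y")]
T1 = [("store", "y", 1), ("load", "r1", "x")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's program order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, target, src in schedule:
        if op == "store":
            mem[target] = src        # target is a memory location
        else:
            regs[target] = mem[src]  # target is a register
    return regs["r0"], regs["r1"]


def sc_outcomes():
    """All (r0, r1) results reachable under sequential consistency."""
    return {run(s) for s in interleavings(T0, T1)}


if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

Under this model the result (0, 0) never appears, because some store always precedes both loads in any single global order. A system that lets a load complete before an earlier store becomes visible can produce it, which is the kind of optimization relaxed models allow.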

An enhanced access and cycle time model for on-chip caches

by Steven J. E. Wilton, 1994
Cited by 230 (5 self)
research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products. There is a second research laboratory located in Palo Alto, the Systems Research Center (SRC). Other Digital research groups are located in Paris (PRL) and in Cambridge,

Tradeoffs in Two-Level On-Chip Caching

by Norman P. Jouppi, Steven J. E. Wilton - In Proceedings of the 21st Annual International Symposium on Computer Architecture, 1993
Cited by 94 (4 self)
The performance of two-level on-chip caching is investigated for a range of technology and architecture assumptions. The area and access time of each level of cache is modeled in detail. The results indicate that for most workloads, two-level cache configurations (with a set-associative second level) perform marginally better than single-level cache configurations that require the same chip area once the first-level cache sizes are 64KB or larger. Two-level configurations become even more important in systems with no off-chip cache and in systems in which the memory cells in the first-level caches are multiported and hence larger than those in the second-level cache. Finally, a new replacement policy called two-level exclusive caching is introduced. Two-level exclusive caching improves the performance of two-level caching organizations by increasing the effective associativity and capacity.

Systems for Late Code Modification

by David W. Wall - WRL Research Report 91/5, 1991
Cited by 88 (5 self)
Modifying code after the compiler has generated it can be useful for both optimization and instrumentation. This paper compares the code modification systems of Mahler and pixie, and describes two new systems we have built that are hybrids of the two. This paper covers material presented at the CODE '91 International Workshop on Code Generation, Schloss Dagstuhl, Germany, May 20-24, 1991. 1. Introduction Late code modification is the process of modifying the output of a compiler after the compiler has generated it. The reasons one might want to do this fall into two categories, optimization and instrumentation. Some forms of optimization must be performed on assembly-level or machine-level code. The oldest is peephole optimization [11], which acts to tidy up code that a compiler has generated; it has since been generalized to include transformations on more machine-independent code [2,3]. Reordering of code to avoid pipeline stalls [4,7,18] is most often done after the code is gene...

Experience with a Software-Defined Machine Architecture

by David W. Wall - WRL Research Report 91/10, 1991
Cited by 53 (7 self)
We built a system in which the compiler back end and the linker work together to present an abstract machine at a considerably higher level than the actual machine. The intermediate language translated by the back end is the target language of all high-level compilers and is also the only assembly language generally available. This lets us do intermodule register allocation, which would be harder if some of the code in the program had come from a traditional assembler, out of sight of the optimizer. We do intermodule register allocation and pipeline instruction scheduling at link time, using information gathered by the compiler back end. The mechanism for analyzing and modifying the program at link time was also useful in a wide array of instrumentation tools. 1. Introduction When our lab built its experimental RISC workstation, the Titan, we defined a high-level assembly language as the official interface to the machine. This high-level assembly language, called Mahler,...

Operating system support for busy internet servers

by Jeffrey C. Mogul - In Proceedings of the Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island, 1995
Cited by 50 (2 self)
The Internet has experienced exponential growth in the use of the World-Wide Web, and rapid growth in the use of other Internet services such as USENET news and electronic mail. These applications qualitatively differ from other network applications in the stresses they impose on busy server systems. Unlike traditional distributed systems, Internet servers must cope with huge user communities, short interactions, and long network latencies. Such servers require different kinds of operating system features to manage their resources effectively.

Fluoroelastomer Pressure Pad Design for Microelectronic Applications

by Alberto Makino, William R. Hamburgen, John S. Fitch, 1993
Cited by 26 (1 self)
The elastic properties of gum rubber and fluoroelastomers were studied by a variety of numerical and experimental methods. Results were applied to the design of flat pressure pads for microelectronic applications. The goal was to develop an understanding sufficient that designers could quickly develop acceptable fluoroelastomer pressure pads without further detailed studies. The effort centered on optimizing the performance of a 14 mm square by 0.8 mm thick pad under a fixed normal force. The primary optimization criterion was minimization of the maximum normal contact stresses applied by the pad to a rigid surface. Judicious perforation of flat pads greatly reduced adverse contact stress gradients. The preferred design used four 1.2 mm holes symmetrically arrayed in a 4 mm square grid centered on the pad. Compared to an unperforated pad, this arrangement yielded a 28% reduction in maximum contact stresses.

Unreachable Procedures in Object-oriented Programming

by Amitabh Srivastava - ACM Letters on Programming Languages and Systems, 1993
Cited by 25 (4 self)
Unreachable procedures are procedures that can never be invoked. Their existence may adversely affect the performance of a program. Unfortunately, their detection requires the entire program to be present. Using a link-time code modification system, we analyze large linked program modules of C++, C and Fortran. We find that C++ programs using object-oriented programming style contain a large fraction of unreachable procedure code. In contrast, C and Fortran programs have a low and essentially constant fraction of unreachable code. In this paper, we present our analysis of C++, C and Fortran programs, and discuss how object-oriented programming style generates unreachable procedures. This paper will appear in the ACM LOPLAS Vol 1, #4. It replaces Technical Note TN-21, an earlier version of the same material. 1 Introduction Unreachable procedures unnecessarily bloat an executable, making it require more disk space and decreasing its locality, which may affect its cache and paging be...
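The idea of detecting procedures that can never be invoked can be sketched as a reachability search over a whole-program call graph. This is a minimal illustration under stated assumptions, not the paper's link-time system: the call graph is assumed to be already available as a dictionary, the names `unreachable_procedures` and `CALLS` are hypothetical, and indirect or virtual calls (which a real analysis must treat conservatively) are ignored.

```python
def unreachable_procedures(call_graph, roots):
    """Return the procedures that no chain of calls from `roots` can reach.

    call_graph maps each procedure name to the names it may call directly.
    """
    seen = set()
    stack = list(roots)
    while stack:
        proc = stack.pop()
        if proc in seen:
            continue
        seen.add(proc)
        # Visit every direct callee of this procedure.
        stack.extend(call_graph.get(proc, ()))
    return set(call_graph) - seen


# Hypothetical program: `debug_dump` and its helper are never called.
CALLS = {
    "main": ["parse", "emit"],
    "parse": ["emit"],
    "emit": [],
    "debug_dump": ["format_hex"],
    "format_hex": [],
}

if __name__ == "__main__":
    print(sorted(unreachable_procedures(CALLS, ["main"])))
```

The search needs every module of the program at once, which matches the abstract's point that detection requires the entire program to be present.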

Link-Time Code Modification

by David W. Wall - DEC Western Research Lab, 1989
Cited by 22 (4 self)
Many existing or potential programming tools require the program to be completely recompiled with a special compiler option. This is usually inconvenient for the program developer, and may reduce the usefulness of the tool or the frequency with which the tool is employed. It may also require the maintenance of different versions of standard libraries, each compiled with the appropriate options for a different tool. The difference between modules compiled with and without the special option is often simple and regular. If so, we can effect this difference by modifying the normally-compiled object code at link time, instead of recompiling. This reduces the overhead of using the tool by an order of magnitude, making it much more convenient. 1. Introduction Recompiling an entire multi-module program from scratch is usually so expensive that one does it only reluctantly. In spite of this, many useful tools for program optimization or performance analysis require the recompilation of...