Results 1 - 10
of
77
MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems
"... Over the last decade, significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been conducted in the context of generalpurpose computing, and more specifically the SPEC benchmark suite. At ..."
Abstract
-
Cited by 748 (19 self)
- Add to MetaCart
Over the last decade, significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been conducted in the context of generalpurpose computing, and more specifically the SPEC benchmark suite. At the same time, a number of microprocessor architectures have emerged which have VLIW and SIMD structures that are well matched to the needs of the ILP compilers. Most of these processors are targeted at embedded applications such as multimedia and communications, rather than general-purpose systems. Conventional wisdom, and a history of hand optimization of inner-loops, suggests that ILP compilation techniques are well suited to these applications. Unfortunately, there currently exists a gap between the compiler community and embedded applications developers. This paper presents MediaBench, a benchmark suite that has been designed to fill this gap. This suite has been constructed through a three-step process: intuition and market driven initial selection, experimental measurement to establish uniqueness, and integration with system synthesis algorithms to establish usefulness.
Lottery Scheduling: Flexible Proportional-Share Resource Management
, 1994
"... This paper presents lottery scheduling, a novel randomized resource allocation mechanism. Lottery scheduling provides efficient, responsive control over the relative execution rates of computations. Such control is beyond the capabilities of conventional schedulers, and is desirable in systems that ..."
Abstract
-
Cited by 374 (4 self)
- Add to MetaCart
This paper presents lottery scheduling, a novel randomized resource allocation mechanism. Lottery scheduling provides efficient, responsive control over the relative execution rates of computations. Such control is beyond the capabilities of conventional schedulers, and is desirable in systems that service requests of varying importance, such as databases, media-based applications, and networks. Lottery scheduling also supports modular resource management by enabling concurrent modules to insulate their resource allocation policies from one another. A currency abstraction is introduced to flexibly name, share, and protect resource rights. We also show that lottery scheduling can be generalized to manage many diverse resources, such as I/O bandwidth, memory, and access to locks. We have implemented a prototype lottery scheduler for the Mach 3.0 microkernel, and found that it provides flexible and responsive control over the relative execution rates of a wide range of applications. The overhead imposed by our unoptimized prototype is comparable to that of the standard Mach timesharing policy.
Lottery and Stride Scheduling: Flexible Proportional-Share Resource Management
-
, 1995
"... This thesis presents flexible abstractions for specifying resource management policies, together with efficient mechanisms for implementing those abstractions. Several novel scheduling techniques are introduced, including both randomized and deterministic algorithms that provide proportional-share c ..."
Abstract
-
Cited by 129 (4 self)
- Add to MetaCart
This thesis presents flexible abstractions for specifying resource management policies, together with efficient mechanisms for implementing those abstractions. Several novel scheduling techniques are introduced, including both randomized and deterministic algorithms that provide proportional-share control over resource consumption rates. Such control is beyond the capabilities of conventional schedulers, and is desirable across a broad spectrum of systems that service clients of varying importance. Proportional-share scheduling is examined for several diverse resources, including processor time, memory, access to locks, and disk bandwidth. Resource rights are encapsulated by abstract, first-class objects called tickets. An active client consumes resources at a rate proportional to the number of tickets that it holds. Tickets can be issued in different amounts and may be transferred between clients. A modular currency abstraction is also introduced to flexibly name, share, and protect ...
CommBench - A Telecommunications Benchmark for Network Processors
- IN PROC. OF IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS
, 2000
"... This paper presents a benchmark, CommBench, for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. The benchmark ..."
Abstract
-
Cited by 102 (17 self)
- Add to MetaCart
This paper presents a benchmark, CommBench, for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. The benchmark
Quantifying behavioral differences between C and C++ programs
- JOURNAL OF PROGRAMMING LANGUAGES
, 1994
"... Improving the performance of C programs has been a topic of great interest for many years. Both hardware technology and compiler optimization research has been applied in an effort to make C programs execute faster. In many application domains, the C++ language is replacing C as the programming lang ..."
Abstract
-
Cited by 83 (15 self)
- Add to MetaCart
Improving the performance of C programs has been a topic of great interest for many years. Both hardware technology and compiler optimization research has been applied in an effort to make C programs execute faster. In many application domains, the C++ language is replacing C as the programming language of choice. In this paper, we measure the empirical behavior of a group of significant C and C++ programs and attempt to identify and quantify behavioral differences between them. Our goal is to determine whether optimization technology that has been successful for C programs will also be successful in C++ programs. We furthermore identify behavioral characteristics of C++ programs that suggest optimizations that should be applied in those programs. Our results show that C++ programs exhibit behavior that is significantly different than C programs. These results should be of interest to compiler writers and architecture designers who are designing systems to execute object-oriented programs.
HLS: Combining Statistical and Symbolic Simulation to Guide Microprocessor Designs
- IN PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE
, 2000
"... As microprocessors continue to evolve, many optimizations reach a point of diminishing returns. We introduce HLS, a hybrid processor simulator which uses statistical models and symbolic execution to evaluate design alternatives. This simulation methodology allows for quick and accurate contour maps ..."
Abstract
-
Cited by 73 (0 self)
- Add to MetaCart
As microprocessors continue to evolve, many optimizations reach a point of diminishing returns. We introduce HLS, a hybrid processor simulator which uses statistical models and symbolic execution to evaluate design alternatives. This simulation methodology allows for quick and accurate contour maps to be generated of the performance space spanned by design parameters. We validate the accuracy of HLS through correlation with existing cycle-by-cycle simulation techniques and current generation hardware. We demonstrate the power of HLS by exploring design spaces de ned by two parameters: code properties and value prediction. These examples motivate how HLS can be used to set design goals and individual component performance targets.
Efficient Memory Simulation in SimICS
, 1995
"... We describe novel techniques used for efficient simulation of memory in SimICS, an instruction level simulator developed at SICS. The design has focused on efficiently supporting the simulation of multiprocessors, analyzing complex memory hierarchies and running large binaries with a mixture of syst ..."
Abstract
-
Cited by 39 (6 self)
- Add to MetaCart
We describe novel techniques used for efficient simulation of memory in SimICS, an instruction level simulator developed at SICS. The design has focused on efficiently supporting the simulation of multiprocessors, analyzing complex memory hierarchies and running large binaries with a mixture of system-level and user-level code. A software caching
The Computational and Storage Potential of Volunteer Computing
, 2006
"... Abstract – “Volunteer computing ” uses Internetconnected computers, volunteered by their owners, as a source of computing power and storage. This paper studies the potential capacity of volunteer computing. We analyzed measurements of about 200,000 hosts participating in a volunteer computing projec ..."
Abstract
-
Cited by 39 (6 self)
- Add to MetaCart
Abstract – “Volunteer computing ” uses Internetconnected computers, volunteered by their owners, as a source of computing power and storage. This paper studies the potential capacity of volunteer computing. We analyzed measurements of about 200,000 hosts participating in a volunteer computing project. These measurements include processing power, memory, disk space, network throughput, host availability, userspecified limits on resource usage, and host churn. We show that volunteer computing can support applications that are significantly more data-intensive, or have high memory and storage requirements, than those in current projects. 1
Isolation with Flexibility: A Resource Management Framework for Central Servers
- In Proceedings of the 2000 USENIX Annual Technical Conference
, 2000
"... Proportional-share resource management is becoming increasingly important in today's computing environments. In particular, the growing use of the computational resources of central service providers argues for a proportional-share approach that allows resource principals to obtain allocations that ..."
Abstract
-
Cited by 38 (1 self)
- Add to MetaCart
Proportional-share resource management is becoming increasingly important in today's computing environments. In particular, the growing use of the computational resources of central service providers argues for a proportional-share approach that allows resource principals to obtain allocations that reflect their relative importance. In such environments, resource principals must be isolated from one another to prevent the activities of one principal from impinging on the resource rights of others. However, such isolation limits the flexibility with which resource allocations can be modified to reflect the actual needs of applications. We present extensions to the lottery-scheduling resource management framework that increase its flexibility while preserving its ability to provide secure isolation. To demonstrate how this extended framework safely overcomes the limits imposed by existing proportional-share schemes, we have implemented a prototype system that uses the framework to manage...
Filaments: Efficient Support for Fine-Grain Parallelism
, 1993
"... . It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization. On the other hand, there are several advantages to fine-grain parallelism: architecture independence, ..."
Abstract
-
Cited by 34 (5 self)
- Add to MetaCart
. It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization. On the other hand, there are several advantages to fine-grain parallelism: architecture independence, ease of programming, ease of use as a target for code generation, and load-balancing potential. This paper describes a portable threads package, Filaments, that supports efficient execution of fine-grain parallel programs on shared-memory multiprocessors. Filaments supports three kinds of threads---run-to-completion, barrier (iterative), and fork/join--- which appear to be sufficient for scientific computations. Filaments employs a unique combination of techniques to achieve efficiency: stateless threads, very small thread descriptors, optimized barrier synchronization, scheduling that enhances data locality, and automatic pruning of fork/join threads. The gains in performance are such that ...

