Results 1 - 10
of
4,572
MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems
"... Over the last decade, significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been conducted in the context of generalpurpose computing, and more specifically the SPEC benchmark suite. At ..."
Abstract
-
Cited by 966 (22 self)
- Add to MetaCart
. Conventional wisdom, and a history of hand optimization of inner-loops, suggests that ILP compilation techniques are well suited to these applications. Unfortunately, there currently exists a gap between the compiler community and embedded applications developers. This paper presents MediaBench, a benchmark
The design and implementation of FFTW3
- PROCEEDINGS OF THE IEEE
, 2005
"... FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our cu ..."
Abstract
-
Cited by 726 (3 self)
- Add to MetaCart
current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations
Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing
- IEEE TRANSACTIONS ON COMPUTERS
, 1987
"... Large grain data flow (LGDF) programming is natural and convenient for describing digital signal processing (DSP) systems, but its runtime overhead is costly in real time or cost-sensitive applications. In some situations, designers are not willing to squander computing resources for the sake of pro ..."
Abstract
-
Cited by 598 (37 self)
- Add to MetaCart
not be done at runtime, but can be done at compile time (statically), so the runtime overhead evaporates. The sample rates can all be different, which is not true of most current data-driven digital signal processing programming methodologies. Synchronous data flow is closely related to computation graphs, a
CIL: Intermediate language and tools for analysis and transformation of C programs
- In International Conference on Compiler Construction
, 2002
"... Abstract. This paper describes the CIntermediate Language: a highlevel representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs. Compared to C, CIL has fewer constructs. It breaks down certain complicated constructs of C into simpler ones, ..."
Abstract
-
Cited by 533 (11 self)
- Add to MetaCart
, and thus it works at a lower level than abstract-syntax trees. But CIL is also more high-level than typical intermediate languages (e.g., three-address code) designed for compilation. As a result, what we have is a representation that makes it easy to analyze and manipulate C programs, and emit them in a
MULTILISP: a language for concurrent symbolic computation
- ACM Transactions on Programming Languages and Systems
, 1985
"... Multilisp is a version of the Lisp dialect Scheme extended with constructs for parallel execution. Like Scheme, Multilisp is oriented toward symbolic computation. Unlike some parallel programming languages, Multilisp incorporates constructs for causing side effects and for explicitly introducing par ..."
Abstract
-
Cited by 529 (1 self)
- Add to MetaCart
understandable programs. Multilisp is being implemented on the 32-processor Concert multiprocessor; however, it is ulti-mately intended for use on larger multiprocessors. The current implementation, called Concert Multilisp, is complete enough to run the Multilisp compiler itself and has been run on Concert
Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
- ACM Transactions on Computer Systems
, 1992
"... Threads are the vehicle,for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expr ..."
Abstract
-
Cited by 475 (21 self)
- Add to MetaCart
the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has
Shared memory consistency models: A tutorial
- IEEE Computer
, 1996
"... Parallel systems that support the shared memory abstraction are becoming widely accepted in many areas of computing. Writing correct and efficient programs for such systems requires a formal specification of memory semantics, called a memory consistency model. The most intuitive model—sequential con ..."
Abstract
-
Cited by 441 (10 self)
- Add to MetaCart
—sequential consistency—greatly restricts the use of many performance optimizations commonly used by uniprocessor hardware and compiler designers, thereby reducing the benefit of using a multiprocessor. To alleviate this problem, many current multiprocessors support more relaxed consistency models. Unfortunately
The causes of corruption: a cross-national study
, 2000
"... Why is corruption — the misuse of public office for private gain — perceived to be more widespread in some countries than others? Different theories associate this with particular historical and cultural traditions, levels of economic development, political institutions, and government policies. Thi ..."
Abstract
-
Cited by 397 (2 self)
- Add to MetaCart
. This article analyzes several indexes of ‘perceived corruption’ compiled from business risk surveys for the 1980s and 1990s. Six arguments find support. Countries with Protestant traditions, histories of British rule, more developed economies, and (probably) higher imports were less ‘corrupt’. Federal states
Fixing the Java memory model
- In ACM Java Grande Conference
, 1999
"... This paper describes the new Java memory model, which has been revised as part of Java 5.0. The model specifies the legal behaviors for a multithreaded program; it defines the semantics of multithreaded Java programs and partially determines legal implementations of Java virtual machines and compile ..."
Abstract
-
Cited by 385 (10 self)
- Add to MetaCart
they rely on a strong notion of data and control dependences that precludes some standard compiler transformations. Although the majority of what is currently done in compilers is legal, the new model introduces significant differences, and clearly defines the boundaries of legal transformations
Shade: A Fast Instruction-Set Simulator for Execution Profiling
, 1994
"... Tracing tools are used widely to help analyze, design, and tune both hardware and software systems. This paper describes a tool called Shade which combines efficient instruction-set simulation with a flexible, extensible trace generation capability. Efficiency is achieved by dynamically compiling an ..."
Abstract
-
Cited by 383 (2 self)
- Add to MetaCart
Tracing tools are used widely to help analyze, design, and tune both hardware and software systems. This paper describes a tool called Shade which combines efficient instruction-set simulation with a flexible, extensible trace generation capability. Efficiency is achieved by dynamically compiling
Results 1 - 10
of
4,572