Results 1 - 10
of
11,960
Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors
- In Proceedings of the 17th Annual International Symposium on Computer Architecture
, 1990
"... Scalable shared-memory multiprocessors distribute memory among the processors and use scalable interconnection networks to provide high bandwidth and low latency communication. In addition, memory accesses are cached, buffered, and pipelined to bridge the gap between the slow shared memory and the f ..."
Abstract
-
Cited by 730 (17 self)
- Add to MetaCart
and the fast processors. Unless carefully controlled, such architectural optimizations can cause memory accesses to be executed in an order different from what the programmer expects. The set of allowable memory access orderings forms the memory consistency model or event ordering model for an architecture.
Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing
- IEEE TRANSACTIONS ON COMPUTERS
, 1987
"... Large grain data flow (LGDF) programming is natural and convenient for describing digital signal processing (DSP) systems, but its runtime overhead is costly in real time or cost-sensitive applications. In some situations, designers are not willing to squander computing resources for the sake of pro ..."
Abstract
-
Cited by 598 (37 self)
- Add to MetaCart
of program-mer convenience. This is particularly true when the target machine is a programmable DSP chip. However, the runtime overhead inherent in most LGDF implementations is not required for most signal processing systems because such systems are mostly synchronous (in the DSP sense). Synchronous data
SIS: A System for Sequential Circuit Synthesis
, 1992
"... SIS is an interactive tool for synthesis and optimization of sequential circuits. Given a state transition table, a signal transition graph, or a logic-level description of a sequential circuit, it produces an optimized net-list in the target technology while preserving the sequential input-output b ..."
Abstract
-
Cited by 527 (44 self)
- Add to MetaCart
-output behavior. Many different programs and algorithms have been integrated into SIS, allowing the user to choose among a variety of techniques at each stage of the process. It is built on top of MISII [5] and includes all (combinational) optimization techniques therein as well as many enhancements. SIS serves
Model checking and abstraction
- ACM Transactions on Programming Languages and Systems
, 1994
"... software developers are using the Java language as the language of choice on many applications. This is due to the effective use of the object-oriented (OO) paradigm to develop large software projects and the ability of the Java language to support the increasing use of web technologies in business ..."
Abstract
-
Cited by 742 (55 self)
- Add to MetaCart
applications. The recent release of the Java version 5.0 has further increased its popularity due to the inclusion of new features that exist in other OO languages. The transition from Java 1.4.x to Java 1.5.x has provided the programmer with more flexibility when implementing programs in Java. In this paper
Shared memory consistency models: A tutorial
- IEEE Computer
, 1996
"... Parallel systems that support the shared memory abstraction are becoming widely accepted in many areas of computing. Writing correct and efficient programs for such systems requires a formal specification of memory semantics, called a memory consistency model. The most intuitive model—sequential con ..."
Abstract
-
Cited by 441 (10 self)
- Add to MetaCart
optimizations they allow. We retain the system-centric emphasis, but use uniform and simple terminology to describe the different models. We also briefly discuss an alternate programmer-centric view that describes the models in terms of program behavior rather than specific system optimizations. 1
Architectural Support for Quality of Service for CORBA Objects
, 1997
"... this paper we discuss four major problems we have observed in our developing and deploying wide-area distributed object applications and middleware. First, most programs are developed ignoring the variable wide area conditions. Second, when application programmers do try to handle these conditions, ..."
Abstract
-
Cited by 370 (35 self)
- Add to MetaCart
this paper we discuss four major problems we have observed in our developing and deploying wide-area distributed object applications and middleware. First, most programs are developed ignoring the variable wide area conditions. Second, when application programmers do try to handle these conditions
The Stanford FLASH multiprocessor
- In Proceedings of the 21st International Symposium on Computer Architecture
, 1994
"... The FLASH multiprocessor efficiently integrates support for cache-coherent shared memory and high-performance message passing, while minimizing both hardware and software overhead. Each node in FLASH contains a microprocessor, a portion of the machine’s global memory, a port to the interconnection n ..."
Abstract
-
Cited by 349 (20 self)
- Add to MetaCart
network, an I/O interface, and a custom node controller called MAGIC. The MAGIC chip handles all communication both within the node and among nodes, using hardwired data paths for efficient data movement and a programmable processor optimized for executing protocol operations. The use of the protocol
Orca: A language for parallel programming of distributed systems
- IEEE Transactions on Software Engineering
, 1992
"... Orca is a language for implementing parallel applications on loosely coupled distributed systems. Unlike most languages for distributed programming, it allows processes on different machines to share data. Such data are encapsulated in data-objects, which are instances of user-defined abstract data ..."
Abstract
-
Cited by 332 (46 self)
- Add to MetaCart
Orca is a language for implementing parallel applications on loosely coupled distributed systems. Unlike most languages for distributed programming, it allows processes on different machines to share data. Such data are encapsulated in data-objects, which are instances of user-defined abstract data
Linear Algebra Operators for GPU Implementation of Numerical Algorithms
- ACM Transactions on Graphics
, 2003
"... In this work, the emphasis is on the development of strategies to realize techniques of numerical computing on the graphics chip. In particular, the focus is on the acceleration of techniques for solving sets of algebraic equations as they occur in numerical simulation. We introduce a framework for ..."
Abstract
-
Cited by 324 (9 self)
- Add to MetaCart
for the implementation of linear algebra operators on programmable graphics processors (GPUs), thus providing the building blocks for the design of more complex numerical algorithms. In particular, we propose a stream model for arithmetic operations on vectors and matrices that exploits the intrinsic parallelism
Performance Debugging for Distributed Systems of Black Boxes
, 2003
"... Many interesting large-scale systems are distributed systems of multiple communicating components. Such systems can be very hard to debug, especially when they exhibit poor performance. The problem becomes much harder when systems are composed of "black-box" components: software from many ..."
Abstract
-
Cited by 316 (3 self)
- Add to MetaCart
different (perhaps competing) vendors, usually without source code available. Typical solutions-provider employees are not always skilled or experienced enough to debug these systems efficiently. Our goal is to design tools that enable modestly-skilled programmers (and experts, too) to isolate performance
Results 1 - 10
of
11,960