Asynchronous Design Methodologies: An Overview
Proceedings of the IEEE, 1995
Cited by 163 (0 self)
Asynchronous design has been an active area of research since at least the mid-1950s, but has yet to achieve widespread use. We examine the benefits and problems inherent in asynchronous computations, and in some of the more notable design methodologies. These include Huffman asynchronous circuits, burst-mode circuits, micropipelines, template-based and trace-theory-based delay-insensitive circuits, signal transition graphs, change diagrams, and compilation-based quasi-delay-insensitive circuits.
An Algorithm for Exact Bounds on the Time Separation of Events in Concurrent Systems
IEEE Transactions on Computers, 1993
Cited by 44 (7 self)
Determining the time separation of events is a fundamental problem in the analysis, synthesis, and optimization of concurrent systems. Applications range from logic optimization of asynchronous digital circuits to evaluation of execution times of programs for real-time systems. We present an efficient algorithm to find exact (tight) bounds on the separation time of events in an arbitrary process graph without conditional behavior. This result is more general than the methods presented in several previously published papers as it handles cyclic graphs and yields the tightest possible bounds on event separations. The algorithm is based on a functional decomposition technique that permits the implicit evaluation of an infinitely unfolded process graph. Examples are presented that demonstrate the utility and efficiency of the solution. The algorithm will form a basis for exploration of timing-constrained synthesis techniques. Index terms: Abstract algebra, asynchronous systems, concurrent ...
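To make the notion of "time separation of events" concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes a tiny acyclic graph, assumes that an event fires once all of its predecessors have fired and their edge delays have elapsed, and simply enumerates the extreme (corner) delay assignments by brute force. The graph, the delay intervals, and the function names are all hypothetical.

```python
from itertools import product

# Hypothetical acyclic process graph: edge (u, v) carries a delay in [lo, hi].
EDGES = {
    ("a", "b"): (1, 2),
    ("a", "c"): (2, 5),
    ("b", "d"): (1, 1),
    ("c", "d"): (1, 3),
}
ORDER = ["a", "b", "c", "d"]  # a topological order of the events


def firing_times(delays):
    """Occurrence time of each event for one concrete delay assignment.

    Assumption: an event fires when the last of its predecessors has
    fired and the delay on the connecting edge has elapsed.
    """
    t = {}
    for v in ORDER:
        arrivals = [t[u] + d for (u, w), d in delays.items() if w == v]
        t[v] = max(arrivals, default=0)  # source events fire at time 0
    return t


def separation_bounds(src, dst):
    """Min and max of t[dst] - t[src] over all corner delay assignments."""
    edges = list(EDGES)
    seps = []
    for corner in product(*(EDGES[e] for e in edges)):
        t = firing_times(dict(zip(edges, corner)))
        seps.append(t[dst] - t[src])
    return min(seps), max(seps)


print(separation_bounds("b", "d"))
print(separation_bounds("b", "c"))
```

The enumeration grows exponentially with the number of edges and cannot handle the cyclic graphs the abstract mentions, which is the kind of limitation an efficient algorithm has to overcome.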
Verification of timed systems using POSETS
In International Conference on Computer Aided Verification, 1998
Cited by 35 (11 self)
This paper presents a new algorithm for efficiently verifying timed systems. The new algorithm represents timing information using geometric regions and explores the timed state space by considering partially ordered sets of events rather than linear sequences. This approach avoids the explosion of timed states typical of highly concurrent systems by dramatically reducing the ratio of timed states to untimed states in a system. A general class of timed systems that includes both event and level causality can be specified and verified. This algorithm is applied to several recent timed benchmarks, showing orders of magnitude improvement in runtime and memory usage.
Implementing a STARI Chip
Cited by 24 (2 self)
STARI is a high-speed signaling technique that uses both synchronous and self-timed circuits. To demonstrate STARI, a chip has been fabricated using the MOSIS 2µ CMOS process. In a simple test fixture, it operates at data rates of 120 Mbits/sec over a pair of wires. Because STARI uses both synchronous and self-timed circuits, it provides an opportunity to compare these two design methods. The synchronous circuits of the STARI chip achieve rates of operation two to three times those of the self-timed circuits. However, the self-timed FIFO in the receiver provides robust compensation for clock skew that could not be achieved with synchronous circuitry alone. Thus, the STARI chip demonstrates advantages of combining these two design techniques.
Efficient Self-Timed Interfaces for Crossing Clock Domains
In Proceedings of the 9th International Symposium on Asynchronous Circuits and Systems, 2003
Cited by 23 (2 self)
With increasing integration densities, large chip designs are commonly partitioned into multiple clock domains. While the computation within each individual domain may be synchronous, the interfaces between these domains often use asynchronous methods. One such approach is the STARI technique [12, 13], where a self-timed FIFO compensates for clock skew between the sender and receiver. We present implementations of STARI where the FIFO consists of a single, handshaking stage. We start with the simplest case where the sender and receiver operate at exactly the same frequency with an unknown skew. We then generalize this design for links with clocks whose frequencies are rational multiples of each other, clocks whose frequencies are closely matched, and arbitrary clocks. We show that in each of these cases, the STARI interface can exploit the stability of typical clocks to achieve low latencies and negligible probabilities of synchronization failure using very simple hardware.
A Minimal Source-Synchronous Interface
In Proceedings of the 15th IEEE ASIC/SOC Conference, 2002
Cited by 16 (2 self)
We present a novel implementation of source synchronous communication. Our design appears to the designer as a latch with two clock inputs, one from the transmitter and the other from the receiver. Our circuit is simple and provides a skew tolerance of nearly two clock periods. The analog dynamics of our circuit provide a simple initialization mechanism that maximizes the robustness of the interface to skew variations.
Timed Circuit Verification Using TEL Structures
IEEE Transactions on Computer-Aided Design of Integrated Circuits, 2001
Cited by 16 (6 self)
Recent design examples have shown that significant performance gains are realized when circuit designers are allowed to make aggressive timing assumptions. Circuit correctness in these aggressive styles is highly timing dependent and, in industry, they are typically designed by hand. In order to automate the process of designing and verifying timed circuits, algorithms for their synthesis and verification are necessary. This paper presents timed event/level (TEL) structures, a specification formalism for timed circuits that corresponds directly to gate-level circuits. It also presents an algorithm based on partially ordered sets to make the state-space exploration of TEL structures more tractable. The combination of the new specification method and algorithm significantly improves efficiency for gate-level timing verification. Results on a number of circuits, including many from the recently published gigahertz unit Test Site (guTS) processor from IBM, indicate that modules of significant size can be verified using a level of abstraction that preserves the interesting timing properties of the circuit. Accurate circuit-level verification allows the designer to include less margin in the design, which can lead to increased performance.
STARI: A Case Study in Compositional and Hierarchical Timing Verification
In O. Grumberg (Ed.), Proc. CAV'97, 191-201, LNCS 1254, 1997
Cited by 15 (1 self)
In [TAKB96], we investigated techniques for checking if one real-time system correctly implements another and developed theory for hierarchical proofs and assume-guarantee style reasoning. In this study, using the techniques of [TAKB96], we verify the correctness of the timing of the communication chip STARI. 1 Introduction. We describe the application of the techniques and tools described in [TAKB96] to the verification of the high-bandwidth communication chip, STARI [G93]. STARI (by Greenstreet, [G93, G96]) is a self-timed FIFO that interfaces a transmitter and a receiver that operate at the same clock frequency but may have some skew between their clock signals (Figure 1). STARI can compensate for large, time-varying skews and makes high-bandwidth synchronous operation possible by eliminating the need for handshakes between the transmitter and the receiver. However, because there are no handshakes, certain timing properties need to be verified to show that the interface functions...
Simulation of PRAM Models on Meshes
Nordic Journal on Computing, 2(1):51, 1994
Cited by 14 (9 self)
We analyze the complexity of simulating a PRAM (parallel random access machine) on a mesh structured distributed memory machine. By utilizing suitable algorithms for randomized hashing, routing in a mesh, and sorting in a mesh, we prove that simulation of a PRAM on a $\sqrt{N} \times \sqrt{N}$ (or $\sqrt[3]{N} \times \sqrt[3]{N} \times \sqrt[3]{N}$) mesh is possible with $O(\sqrt{N})$ (respectively $O(\sqrt[3]{N})$) delay with high probability and a relatively small constant. Furthermore, with more sophisticated simulations further speedups are achieved; experiments show delays as low as $\sqrt{N} + o(\sqrt{N})$ (respectively $\sqrt[3]{N} + o(\sqrt[3]{N})$) per N PRAM processors. These simulations compare quite favorably with PRAM simulations on butterfly and hypercube. 1 Introduction. PRAM (Parallel Random Access Machine) is an abstract model of computation. It consists of N processors, each of which may have some local memory and registers, and a global shared memory of size m. A step of a PRAM is often seen to consist of...
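One way to see why a delay on the order of the mesh side length is the natural scale: in a mesh without wraparound links, a message between opposite corners must cross (side - 1) links per dimension. The Python sketch below checks this by brute force for small meshes. The function name and the no-wraparound assumption are illustrative and not taken from the paper.

```python
from itertools import product


def mesh_diameter(side, dims=2):
    """Largest hop count between any two nodes of a side^dims mesh.

    Assumption: links connect only nearest neighbours and there is no
    wraparound, so the hop count equals the Manhattan distance.
    """
    nodes = list(product(range(side), repeat=dims))
    return max(
        sum(abs(a - b) for a, b in zip(u, v))
        for u in nodes
        for v in nodes
    )


# N = 16 processors as a 4 x 4 mesh: 2 * (4 - 1) hops corner to corner.
print(mesh_diameter(4))
# N = 27 processors as a 3 x 3 x 3 mesh: 3 * (3 - 1) hops.
print(mesh_diameter(3, dims=3))
```

Because some memory requests must travel this far, the diameter gives a rough distance-based floor on the worst-case delay of a simulated PRAM step under these assumptions.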
Practical Applications of an Efficient Time Separation of Events Algorithm
In Proc. International Conf. Computer-Aided Design (ICCAD)
Cited by 12 (4 self)
Determining the time separation of events is a fundamental problem in the analysis, synthesis, and optimization of concurrent systems. We present results of applying an efficient algorithm to solve this problem to three different application domains. These are: analysis of instruction execution times of an asynchronous microprocessor, analysis of a high-performance mixed asynchronous/synchronous communication interface, and isochronic fork analysis in asynchronous circuit synthesis. The algorithm we use yields exact (tight) bounds on the separation time of events in an arbitrary process graph without conditional behavior. This class of graphs is quite large and includes graphs that are not strongly connected. The algorithm is based on a functional decomposition technique that permits the implicit evaluation of an infinitely unfolded process graph. 1 Introduction. Event-based specifications naturally model communication and concurrency and are thus a popular model for a wide range of co...