An event spacing experiment
Proceedings of the Eighth International Symposium on Asynchronous Circuits and Systems, ASYNC’02, 2002
Cited by 23 (5 self)
Events in self-timed rings can propagate evenly spaced or as bursts. By studying these phenomena, we obtain a better understanding of the underlying dynamics of self-timed pipelines, which is a necessary precursor to utilizing these dynamics to obtain higher performance (see, e.g., [18]). We show that standard bounded delay models are inadequate to discriminate between bursting and evenly spaced behaviours and show that an extension of the Charlie Diagrams of [5] provides a framework for understanding these phenomena. This paper describes our novel analytical approaches and the design and fabrication of a chip to test our theoretical models.
The Role of Back-Pressure in Implementing Latency-Insensitive Systems
Electronic Notes in Theoretical Computer Science, 2006
Cited by 11 (1 self)
Back-pressure is a logical mechanism to control the flow of information on a communication channel of a latency-insensitive system (LIS) while guaranteeing that no packet is lost. Back-pressure is necessary for building open LISs, and it is also an interesting design alternative for closed LISs because it makes it possible to realize highly modular implementations with more predictable features in terms of design overhead (area, power). In discussing the role of back-pressure, we revisit the logic of the necessary building blocks and explain the impact of the system topology on the system performance.
Performance Analysis of Asynchronous Circuits and Systems using Stochastic Timed Petri Nets
Hardware Design and Petri Nets, 1999
Cited by 7 (0 self)
This paper describes and extends a recently developed approach for performance analysis of asynchronous circuits modeled with stochastic timed Petri nets (STPNs) with unique and free-choice places and arbitrary delay distributions. The approach analyzes finite STPN executions to derive closed-form expressions for lower and upper bounds on the performance estimates that can be efficiently evaluated using standard statistical methods. The mean of the derived upper and lower bounds thus provides an estimate of the performance metric which has a well-defined error interval. Moreover, we can often make the error interval arbitrarily small by analyzing longer STPN executions at the cost of additional runtime. Experiments on several asynchronous systems demonstrate the high quality of our estimates and the efficiency of the technique. The experiments include the performance analysis of a full-scale Petri net model of Intel's asynchronous instruction length decoding and steering unit RAPPID...
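The midpoint estimate mentioned in this abstract can be illustrated with a minimal sketch: if a metric is known to lie between a lower and an upper bound, the mean of the two bounds is the estimate and half their gap is the largest possible error. The function name and the sample bounds below are hypothetical and are not taken from the paper; the sketch does not derive the bounds themselves.

```python
def midpoint_estimate(lower, upper):
    """Return (estimate, max_error) for a metric known to lie in [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    estimate = (lower + upper) / 2.0   # mean of the two bounds
    max_error = (upper - lower) / 2.0  # half-width of the error interval
    return estimate, max_error


# Hypothetical bounds: a longer analyzed execution would typically tighten them.
print(midpoint_estimate(8.0, 12.0))   # wide interval
print(midpoint_estimate(9.5, 10.5))   # tighter interval, same midpoint
```

Tighter bounds leave the estimate in place here but shrink the error half-width, which is the trade against runtime that the abstract describes.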
Temporal Properties of Self-Timed Rings
In Proceedings of CHARME 2001, Lecture Notes in Computer Science 2144, 2001
Cited by 7 (1 self)
Various researchers have proposed using self-timed networks to generate and distribute clocks and other timing signals. We consider one of the simplest self-timed networks, a ring, and note that for timing applications, self-timed rings should maintain uniform spacing of events.
FLYSIG: Dataflow Oriented Delay-Insensitive Processor for Rapid Prototyping of Signal Processing
In Proceedings of the Ninth International Workshop on Rapid System Prototyping, 1998
Cited by 6 (3 self)
As the one-chip integration of HW-modules designed by different companies becomes more and more popular, reliability of a HW-design and evaluation of the timing behavior during the prototype stage are absolutely necessary. One way to guarantee reliability is the use of robust design styles, e.g., delay-insensitivity. For early timing evaluation two aspects must be considered: a) the timing needs to be proportional to technology variations and b) the implemented architecture should be identical for prototype and target. The first can also be met by a delay-insensitive implementation. The latter is the key point: a unified architecture is needed for prototyping as well as implementation. Our new approach to rapid prototyping of signal processing tasks is based on a configurable, delay-insensitive implemented processor called FLYSIG. In essence, the FLYSIG processor can be understood as a complex FPGA where the CLBs are substituted by bit-serial operators. In this paper the genera...
Accelerating Markovian Analysis of Asynchronous Systems using String-based State Compression
IEEE Transactions on Computer-Aided Design, 1998
Cited by 6 (4 self)
This paper presents a methodology to speed up the stationary analysis of large Markov chains that model asynchronous systems. Instead of directly working on the original Markov chain, we propose to analyze a smaller Markov chain obtained via a novel technique called string-based state compression. Once the smaller chain is solved, the solution to the original chain is obtained via a process called expansion. The method is especially powerful when the Markov chain has a small feedback vertex set, which happens often in asynchronous systems. Experimental results show that the method can yield reductions of more than an order of magnitude in run time and facilitate the analysis of larger systems than possible using traditional techniques. 1 Introduction Driven by market demands for low-power and high-performance, tools to estimate power and performance of a system have become particularly important. In an asynchronous system, the randomness caused by varying input data rate and data proce...
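For readers unfamiliar with the term, the following is a minimal sketch of plain stationary analysis of a small Markov chain by power iteration, that is, the direct computation that the abstract says becomes expensive for large chains. It is not the string-based compression or expansion technique of the paper; the function name, the tolerance, and the two-state transition matrix are hypothetical choices for illustration.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Power iteration for a row-stochastic matrix P given as a list of lists.

    Assumes the chain is irreducible and aperiodic so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of pi <- pi * P
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical two-state chain (not from the paper).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print(pi)  # close to [5/6, 1/6], from the balance equation 0.1*pi0 = 0.5*pi1
```

Each iteration costs time proportional to the number of transitions, which is why reducing the chain before solving it can pay off on large models.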
Comparison of tree and straight-line clocking for long systolic arrays
Journal of VLSI Signal Processing, 1991
Cited by 5 (0 self)
A critical problem in building long systolic arrays lies in efficient and reliable synchronization. We address this problem in the context of synchronous systems by introducing probabilistic models for two alternative clock distribution schemes: tree and straight-line clocking. We present analytic bounds for the Probability of Failure and the Mean Time to Failure, and examine the tradeoffs between reliability and throughput in both schemes. Our basic conclusion is that as the one-dimensional systolic array gets very long, tree clocking becomes more reliable than straight-line clocking.
Symbolic Time Separation of Events
In Proc. International Symposium on Advanced Research in Asynchronous Circuits and Systems, 1999
Cited by 4 (0 self)
We extend the TSE [14] timing analysis algorithm into the symbolic domain, that is, we allow symbolic variables to be used to specify unknown parameters of the model (essentially, unknown delays) and verification algorithms which are capable of identifying not just failure or success, but also the constraints on these symbolic variables which will ensure successful verification. The two main contributions are 1) an iterative algorithm which continuously narrows down the domain of interest and 2) a practical method for reducing the representation of symbolic expressions containing minimizations and maximizations defined for a given domain. We report experimental results for several asynchronous circuits to demonstrate that symbolic analysis is feasible and that the output provided is what a designer (or perhaps a synthesis tool) would often want to know. 1. Introduction This paper presents a novel approach to timing analysis based on a new paradigm we refer to as "symbolic timing verif...
Performance Estimation and Slack Matching for Pipelined Asynchronous Architectures with Choice
Cited by 3 (2 self)
This paper presents a fast analytical method for estimating the throughput of pipelined asynchronous systems, and then applies that method to develop a fast solution to the problem of pipelining “slack matching.” The approach targets systems with hierarchical topologies, which typically result when high-level (block-structured) language specifications are compiled into data-driven circuit implementations. A significant contribution is that our approach is the first to efficiently handle architectures with choice (i.e., the presence of conditional computation constructs such as if-then-else and conditional loops). The key idea behind the fast speed of our analysis method is to exploit information about the hierarchy of a given block-structured system, thereby yielding a runtime that is linear in the number of pipeline stages. In contrast, existing approaches typically represent an entire system as a single Petri net or marked graph, and then apply Markov chain analysis or other state enumeration methods with costly runtimes. Building upon our analysis approach, we introduce a novel solution to the problem of slack matching, i.e., determining optimal insertion of FIFO stages into a pipelined design to improve performance. We present both an optimal solution using an MILP formulation, and a fast heuristic algorithm that yielded optimal results for all of our examples.
A fast, asP*, RGD arbiter
Proceedings of the Fifth International Symposium on Advanced Research on Asynchronous Circuits and Systems, 1999
Cited by 3 (0 self)
This paper presents the design of a high-throughput, low-latency, asP*, RGD arbiter. Spice simulations for an implementation in a 0.8 µm CMOS process show a request-to-grant delay of 0.74 ns and a done-to-grant delay of 0.4 ns. Maximum throughput of requests from a single client is one grant per 1.8 ns; if both clients make requests aggressively, the arbiter can produce one grant per 1.2 ns. In addition to presenting a high-performance design, this paper examines tradeoffs in performance-driven design. In particular, logic delay seems to dominate metastability concerns when optimizing performance.
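As a quick worked conversion of the cycle times quoted in this abstract (one grant per 1.8 ns for a single client, one grant per 1.2 ns with both clients requesting), the sketch below turns a grant period into grants per second. The helper name is hypothetical; only the two period values come from the abstract.

```python
def grants_per_second(period_ns):
    """Convert a grant period in nanoseconds to a rate in grants per second."""
    return 1.0 / (period_ns * 1e-9)


single = grants_per_second(1.8)  # single client, period from the abstract
both = grants_per_second(1.2)    # both clients requesting, period from the abstract

print(f"single client: {single / 1e6:.0f} M grants/s")
print(f"both clients:  {both / 1e6:.0f} M grants/s")
print(f"speed-up with two clients: {both / single:.2f}x")
```

The two-client rate is 1.8 / 1.2 = 1.5 times the single-client rate.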