Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors
, 1981
Abstract

Cited by 88 (2 self)
In this paper we implement several basic operating system primitives by using a "replace-add" operation, which can supersede the standard "test and set", and which appears to be a universal primitive for efficiently coordinating large numbers of independently acting sequential processors. We also present a hardware implementation of replace-add that permits multiple replace-adds to be processed nearly as efficiently as loads and stores. Moreover, the crucial special case of concurrent replace-adds updating the same variable is handled particularly well: If every PE simultaneously addresses a replace-add at the same variable, all these requests are satisfied in the time required to process just one request.
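To make the primitive in this abstract concrete, here is a minimal software sketch. It assumes that replace-add atomically adds an increment to a shared variable and returns the updated value; that return convention, the `SharedCell` class, and the `hand_out_tickets` helper are illustrative assumptions, and the lock is only a stand-in for the hardware support the abstract describes.

```python
import threading


class SharedCell:
    """A shared integer with an atomic replace-add (hypothetical helper)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()  # software stand-in for hardware atomicity

    def replace_add(self, increment):
        # Assumption: the operation returns the value after the addition.
        with self._lock:
            self.value += increment
            return self.value


def hand_out_tickets(num_workers):
    """Each worker issues one replace-add; every worker gets a distinct ticket."""
    cell = SharedCell(0)
    tickets = [None] * num_workers

    def worker(index):
        tickets[index] = cell.replace_add(1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value, tickets


if __name__ == "__main__":
    total, tickets = hand_out_tickets(16)
    print(total, sorted(tickets))
```

Because every call both updates the variable and reports a unique result, workers can claim queue slots or loop iterations without a separate test-and-set lock around the counter.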
Routers with a Single Stage of Buffering
, 2002
Abstract

Cited by 32 (2 self)
Most high-performance routers today use combined input and output queueing (CIOQ). The CIOQ router is also frequently used as an abstract model for routers: at one extreme is input queueing, at the other extreme is output queueing, and in between there is a continuum of performance as the speedup is increased from 1 to N (where N is the number of linecards). The model includes architectures in which a switch fabric is sandwiched between two stages of buffering. There is a rich and growing theory for CIOQ routers, including algorithms, throughput results and conditions under which delays can be guaranteed. But there is a broad class of architectures that are not captured by the CIOQ model, including routers with centralized shared memory, and load-balanced routers. In this paper we propose an abstract model called Single-Buffered (SB) routers that includes these architectures. We describe a method called Constraint Sets to analyze a number of SB router architectures. The model helped identify previously unstudied architectures, in particular the Distributed Shared Memory router. Although commercially deployed, its performance is not widely known. We find conditions under which it can emulate an ideal shared memory router, and believe it to be a promising architecture. Questions remain about its complexity, but we find that the memory bandwidth, and potentially the power consumption, of the router is lower than for a CIOQ router.
Routing Architecture and Layout Synthesis for Multi-FPGA Systems
 UNIVERSITY OF TORONTO
, 1999
Abstract

Cited by 10 (1 self)
Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture, which is the manner in which wires, FPGAs and Field-Programmable Interconnect Devices (FPIDs) are connected. This
Analysis of a Recurrence Arising from a Construction for Non-Blocking Networks
, 1993
Abstract

Cited by 1 (0 self)
Define $f$ on the integers $n > 1$ by the recurrence $f(n) = \min\{n, \min_{m \mid n} 2f(m) + 3f(n/m)\}$. The function $f$ has $f(n) = n$ as its upper envelope, attained for all prime $n$. Our goal in this paper is to determine the corresponding lower envelope. We shall show that this has the form $f(n) \sim C(\log n)^{1+1/\gamma}$ for certain constants $\gamma$ and $C$, in the sense that for any $\varepsilon > 0$, the inequality $f(n) \le (C + \varepsilon)(\log n)^{1+1/\gamma}$ holds for infinitely many $n$, while $f(n) \le (C - \varepsilon)(\log n)^{1+1/\gamma}$ holds for only finitely many. In fact, $\gamma = 0.7878\ldots$ is the unique real solution of the equation $2^{-\gamma} + 3^{-\gamma} = 1$, and $C = 1.5595\ldots$ is given by the expression
$$C = \frac{\gamma \left(2^{-\gamma} \log 2^{\gamma} + 3^{-\gamma} \log 3^{\gamma}\right)^{1/\gamma}}{(\gamma + 1)\left(15^{-\gamma} \log^{\gamma+1} \frac{5}{2} + 3^{-\gamma} \sum_{5 \le k \le 7} \log^{\gamma+1} \frac{k+1}{k} + \sum_{8 \le k \le 15} \log^{\gamma+1} \frac{k+1}{k}\right)^{1/\gamma}}.$$
We also consider the function $f_0$ defined by replacing the integers $n > 1$ with the reals $x > 1$ in the above recurrence: $f_0(x) = \min\{x, \inf_{1 \ldots}$...
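The recurrence and the two constants in this abstract can be checked numerically. The sketch below is illustrative only and rests on three assumptions not stated in the listing: the inner minimum ranges over divisors m with 1 < m < n, the logarithms are natural, and the expression for C is a quotient whose numerator is γ(2^{-γ} log 2^γ + 3^{-γ} log 3^γ)^{1/γ} and whose denominator is (γ + 1) times the bracketed sum raised to 1/γ. The function names are hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n: int) -> int:
    """f(n) = min(n, min over divisors m of n with 1 < m < n of 2 f(m) + 3 f(n/m))."""
    best = n
    for m in range(2, math.isqrt(n) + 1):
        if n % m == 0:
            k = n // m
            # m and k are both nontrivial divisors, so try each in either role.
            best = min(best, 2 * f(m) + 3 * f(k), 2 * f(k) + 3 * f(m))
    return best


def solve_gamma(tol: float = 1e-12) -> float:
    """Bisection for the root of 2**-g + 3**-g = 1 (left side decreases in g)."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 2 ** -mid + 3 ** -mid > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def constant_c(g: float) -> float:
    """Evaluate the expression for C under the assumptions stated above."""
    def log_pow(x: float) -> float:
        return math.log(x) ** (g + 1)

    numerator = g * (2 ** -g * math.log(2 ** g) + 3 ** -g * math.log(3 ** g)) ** (1 / g)
    bracket = (
        15 ** -g * log_pow(5 / 2)
        + 3 ** -g * sum(log_pow((k + 1) / k) for k in range(5, 8))
        + sum(log_pow((k + 1) / k) for k in range(8, 16))
    )
    return numerator / ((g + 1) * bracket ** (1 / g))


if __name__ == "__main__":
    gamma = solve_gamma()
    print([f(n) for n in (7, 30, 64)], gamma, constant_c(gamma))
```

For small inputs the upper envelope is visible directly: primes return themselves, while highly composite values such as 30 and 64 fall below n.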
A MATRIX DECOMPOSITION APPROACH TO THE CONTROL OF CLOS REARRANGEABLE SWITCHING NETWORKS
Abstract
Abstract: Communication issues remain the key to the development of Distributed Memory MIMD computers. Two main approaches prevail in the search for adequate communication paradigms: the use of static or of reconfigurable interconnection networks. In this paper, we are interested in the second approach. Reconfigurable Distributed Memory MIMD computers with a large number of processors need multistage switching networks to interconnect the processors. The Clos rearrangeable switching network is among the most widely used in industry. For this family of switching networks the literature proposes several kinds of control algorithms. The decomposition of the interconnection matrix of the switches, induced by a given configuration, into permutation matrices constitutes an interesting approach. For each permutation matrix the algorithms of this class proceed in two phases, of which the second, the more costly, needs at most m/2 iteration steps for an m×m interconnection matrix. This paper discusses two modifications of these algorithms. The first results in an algorithm whose second phase needs fewer than m/3 iteration steps instead of m/2. When a second phase is needed after all, the second modification shows how to carry it out according to the divide-and-conquer strategy.
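The idea of splitting an interconnection matrix into permutation matrices can be sketched generically. The code below is not the two-phase algorithm the abstract discusses; it swaps in a textbook approach (repeatedly extracting a perfect matching with augmenting paths) and assumes a square non-negative integer matrix whose row sums and column sums are all equal. The function names are hypothetical.

```python
def find_perfect_matching(h):
    """Return perm with h[i][perm[i]] > 0 for every row i, or None if none exists."""
    size = len(h)
    row_of_col = [-1] * size  # row_of_col[j] is the row currently matched to column j

    def try_row(i, seen):
        # Augmenting-path search: place row i, displacing earlier rows if needed.
        for j in range(size):
            if h[i][j] > 0 and not seen[j]:
                seen[j] = True
                if row_of_col[j] == -1 or try_row(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    for i in range(size):
        if not try_row(i, [False] * size):
            return None
    perm = [0] * size
    for j, i in enumerate(row_of_col):
        perm[i] = j
    return perm


def decompose(h):
    """Split h into permutations; each one could be assigned to one middle switch."""
    remaining = [row[:] for row in h]
    perms = []
    while any(any(row) for row in remaining):
        perm = find_perfect_matching(remaining)
        if perm is None:
            raise ValueError("row sums and column sums must all be equal")
        for i, j in enumerate(perm):
            remaining[i][j] -= 1
        perms.append(perm)
    return perms


if __name__ == "__main__":
    demand = [[2, 1, 0], [0, 1, 2], [1, 1, 1]]
    for perm in decompose(demand):
        print(perm)
```

Each extracted permutation lowers every row and column sum by one, so a matrix with common line sum n yields exactly n permutations.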
The Main Questions to be Asked in order to Characterize an ATM Switch Performance Model
, 1997
Abstract
Performance of ATM networks will depend on switch performance and architecture. In this paper a set of eleven questions is presented to support the study of that performance. It is necessary to clearly identify the main characteristics of the switch architecture and the traffic that are considered. They concern the level of study, the placement of the memory, the monopath or multipath architecture, the uniformity of input traffic, the destination hypotheses, the routing rules inside the switch, the possible resequencing... In the case of cell-level study and output buffers, the answers to the six main questions lead to a tree including 14 realistic switch and traffic types. A survey of the modelling methods for these cases is then presented. A few results are given in some of the realistic cases and some general conclusions appear to be pertinent: importance of traffic dissymmetry, best performance for multipath switches, use of path reservations inside the switch. Keywords : A.T.M, swi...
unknown title
Abstract
The design of embedded VLSI systems poses many challenges in areas such as design complexity, low power