Results 1–10 of 32
Synchronization and linearity: an algebra for discrete event systems
, 2001
Abstract

Cited by 309 (10 self)
The first edition of this book was published in 1992 by Wiley (ISBN 0 471 93609 X). Since the book is now out of print, and in response to requests from several colleagues, the authors have decided to make it freely available on the Web, while retaining the copyright, for the benefit of the scientific community. Copyright Statement: This electronic document is in PDF format. One needs Acrobat Reader (available freely for most platforms from the Adobe web site) to benefit from the full interactive machinery: using the hyperref package by Sebastian Rahtz, the table of contents and all LaTeX cross-references are automatically converted into clickable hyperlinks, bookmarks are generated automatically, etc. So, do not hesitate to click on references to equation or section numbers, on items of the table of contents and of the index, and so on. One may freely use and print this document for one's own purposes, or even distribute it freely, but not commercially, provided it is distributed in its entirety and without modification, including this preface and copyright statement. Any use of the contents should be acknowledged according to standard scientific practice.
Models of Machines and Computation for Mapping in Multicomputers
, 1993
Abstract

Cited by 80 (1 self)
It is now more than a quarter of a century since researchers started publishing papers on mapping strategies for distributing computation across the computational resources of multiprocessor systems. There exists a large body of literature on the subject, but there is no commonly accepted framework whereby results in the field can be compared, nor is it always easy to assess the relevance of a new result to a particular problem. Furthermore, changes in parallel computing technology have made some of the earlier work less relevant to current multiprocessor systems. Versions of the mapping problem are classified, and research in the field is considered in terms of its relevance to the problem of programming currently available hardware in the form of a distributed-memory, multiple-instruction-stream, multiple-data-stream computer: a multicomputer.
The Influence of Random Delays on Parallel Execution Times
 IN PROC. 1993 ACM SIGMETRICS CONF. ON MEASUREMENT AND MODELLING OF COMPUTER SYSTEMS
, 1993
Abstract

Cited by 35 (2 self)
Stochastic models are widely used for the performance evaluation of parallel programs and systems. The stochastic assumptions in such models are intended to represent nondeterministic processing requirements as well as random delays due to interprocess communication and resource contention. In this paper, we provide compelling analytical and experimental evidence that in current and foreseeable shared-memory programs, communication delays introduce negligible variance into the execution time between synchronization points. Furthermore, we show using direct measurements of variance that other sources of randomness, particularly nondeterministic computational requirements, also do not introduce significant variance in many programs. We then use two examples to demonstrate the implications of these results for parallel program performance prediction models, as well as for general stochastic models of parallel systems.
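The negligible-variance claim can be illustrated with a small Monte Carlo sketch (not from the paper; the workload parameters below are invented): the total execution time of a barrier-synchronized program is a sum over phases of the maximum of per-processor phase times, and small random per-phase delays average out across many phases.

```python
import random
import statistics

def run_program(P, iters, work=1.0, jitter=0.05):
    """Barrier-synchronized execution: each phase ends when the
    slowest of P processors finishes; a per-processor phase time is
    a fixed amount of work plus a small uniform random delay."""
    total = 0.0
    for _ in range(iters):
        total += max(work + random.uniform(0, jitter) for _ in range(P))
    return total

random.seed(0)
samples = [run_program(P=16, iters=100) for _ in range(200)]
mean = statistics.mean(samples)
cv = statistics.stdev(samples) / mean   # coefficient of variation
```

Even though every phase waits on the slowest of 16 processors, the coefficient of variation of the total execution time comes out well below 1% in this toy setting, which is the kind of effect the abstract describes.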
Mean Value Technique for Closed ForkJoin Networks
 PROCEEDINGS OF ACM SIGMETRICS CONFERENCE ON MEASUREMENT AND MODELING OF COMPUTER SYSTEMS
, 1999
Abstract

Cited by 17 (3 self)
A simple technique for computing mean performance measures of closed single-class fork-join networks with exponential service time distributions is given here. This technique is similar to the mean value analysis technique for closed product-form networks and iterates on the number of customers in the network. Mean performance measures such as the mean response times, queue lengths, and throughput of closed fork-join networks can be computed recursively without calculating the steady-state distribution of the network. The technique is based on the mean value equation for fork-join networks, which relates the response time of a network to the mean service times at the service centers and the mean queue length of the system with one customer fewer. Unlike product-form networks, the mean value equation for fork-join networks is an approximation, and the technique computes lower performance bounds for the fork-join network. However, it is a good approximation, since the mean value equation is derived from an equation that exactly relates the response time of parallel systems to the degree of parallelism and the mean arrival queue length. Using simulation, it is shown that the relative error in the approximation is less than 5% in most cases. The error does not increase with each iteration.
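The classical MVA recursion for closed product-form networks, which the abstract says this technique resembles, can be sketched as follows (this is the standard textbook algorithm, not the paper's fork-join variant; the numeric parameters are invented):

```python
def mva(service_times, visits, N):
    """Exact Mean Value Analysis for a closed, single-class,
    product-form queueing network with N customers.
    service_times[k] -- mean service time at center k
    visits[k]        -- visit ratio of center k"""
    K = len(service_times)
    Q = [0.0] * K                                    # queue lengths with 0 customers
    for n in range(1, N + 1):
        # mean response time at each center (arrival theorem)
        R = [service_times[k] * (1 + Q[k]) for k in range(K)]
        X = n / sum(visits[k] * R[k] for k in range(K))    # throughput
        Q = [X * visits[k] * R[k] for k in range(K)]       # Little's law
    return X, R, Q

X, R, Q = mva([0.05, 0.02], [1.0, 2.0], N=10)
```

The fork-join version described in the abstract replaces the response-time equation with an approximate fork-join mean value equation but iterates on the customer population in the same way.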
Polling Systems with Synchronization Constraints
, 1992
Abstract

Cited by 11 (8 self)
We introduce a new service discipline, called the synchronized gated discipline, for polling systems. It arises when there are precedence (or synchronization) constraints on the order in which jobs in different queues should be served. These constraints are described as follows: there are N stations which are "fathers" of (zero or more) synchronized stations ("children"). Jobs that arrive at synchronized stations may be processed only after jobs that arrived prior to them at their corresponding "father" station have been processed. We analyze the performance of the synchronized gated discipline and obtain expressions for the first two moments and the Laplace-Stieltjes transform (LST) of the waiting times at different stations, and expressions for the moments and LST of other quantities of interest, such as cycle duration and generalized station times. We also obtain a "pseudo" conservation law for the synchronized gated discipline, and determine the optimal network topology that minimizes the weighted sum of the mean waiting times, as defined in the "pseudo" conservation law. Numerical examples are given to illustrate the dependence of the performance of the synchronized gated discipline on different parameters of the network.
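As a point of reference, the classical (unsynchronized) gated discipline that the paper generalizes can be simulated directly. The sketch below uses invented parameters, ignores the father/child precedence constraints, and measures only mean waiting times, not the transforms derived in the paper:

```python
import random

def gated_polling(arrival_rates, service_mean, switchover, T):
    """Single server polling N queues with the classical gated
    discipline: at each visit the server serves exactly the jobs
    present at the polling instant (the 'gate'), then walks on.
    Returns per-queue mean waiting times until service start."""
    N = len(arrival_rates)
    # pre-generate Poisson arrival times for each queue up to time T
    queues = [[] for _ in range(N)]
    for q, lam in enumerate(arrival_rates):
        t = random.expovariate(lam)
        while t < T:
            queues[q].append(t)
            t += random.expovariate(lam)
    waits = [[] for _ in range(N)]
    ptr = [0] * N                       # next unserved arrival per queue
    clock, i = 0.0, 0
    while clock < T:
        # gate: jobs that arrived at queue i before the polling instant
        gated = []
        while ptr[i] < len(queues[i]) and queues[i][ptr[i]] <= clock:
            gated.append(queues[i][ptr[i]])
            ptr[i] += 1
        for a in gated:
            waits[i].append(clock - a)  # wait from arrival to service start
            clock += random.expovariate(1.0 / service_mean)
        clock += switchover             # walking time to the next queue
        i = (i + 1) % N
    return [sum(w) / len(w) if w else 0.0 for w in waits]

random.seed(1)
mw = gated_polling([0.2, 0.2, 0.2], service_mean=0.5, switchover=0.1, T=5000.0)
```

The synchronized gated discipline adds the precedence rule that a child station's gate is tied to what its father station has already served; that coupling is what the paper's analysis handles.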
Equivalence, Reversibility, Symmetry and Concavity Properties in ForkJoin Queueing Networks with Blocking
, 1993
Abstract

Cited by 8 (1 self)
In this paper we study quantitative as well as qualitative properties of Fork-Join Queueing Networks with Blocking (FJQN/Bs). Specifically, we prove results regarding the equivalence of the behavior of an FJQN/B and that of its duals and a strongly connected marked graph. In addition, we obtain general conditions that must be satisfied by the service times to guarantee the existence of a long-term throughput and its independence of the initial configuration. We also establish conditions under which the reverse of an FJQN/B has the same throughput as the original network. By combining the equivalence result for duals and the reversibility result, we establish a symmetry property for the throughput of an FJQN/B. Last, we establish that the throughput is a concave function of the buffer sizes and the initial marking, provided that the service times are mutually independent random variables belonging to the class of PERT distributions that includes the Erlang distributions. This la...
Integrated performance models for SPMD applications and MIMD architectures
 IEEE Trans. on Parallel and Distributed Systems
, 2002
Abstract

Cited by 8 (1 self)
Abstract—This paper introduces queuing network models for the performance analysis of SPMD applications executed on general-purpose parallel architectures such as MIMD machines and clusters of workstations. The models are based on the pattern of computation, communication, and I/O operations of typical parallel applications. Analysis of the models leads to the definition of speedup surfaces which capture the relative influence of processor and I/O parallelism and show the effects of different hardware and software components on performance. Since the parameters of the models correspond to measurable program and hardware characteristics, the models can be used to anticipate the performance behavior of a parallel application as a function of the target architecture (i.e., number of processors, number of disks, I/O topology, etc.). Index Terms—Single program multiple data (SPMD), multiple instruction multiple data (MIMD), performance model, queuing network model, fork-join queues, mean value analysis (MVA), parallel I/O, synchronization overhead, speedup surface.
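A toy version of the speedup-surface idea (all numbers and the functional form below are invented for illustration, not taken from the paper's queuing models): total run time splits into a compute part that scales with the number of processors, an I/O part that scales with the number of disks, and a fixed synchronization overhead.

```python
def speedup(p, d, t_comp=100.0, t_io=20.0, t_sync=0.5):
    """Toy speedup model: compute time divides across p processors,
    I/O time divides across d disks, plus a fixed synchronization
    overhead per run. All parameters are illustrative."""
    t1 = t_comp + t_io                       # sequential run time
    tp = t_comp / p + t_io / d + t_sync      # parallel run time
    return t1 / tp

# a small "speedup surface" over processors x disks
surface = {(p, d): round(speedup(p, d), 1)
           for p in (1, 4, 16, 64) for d in (1, 2, 4, 8)}
```

Scanning the surface shows the effect the abstract describes: adding processors alone eventually stalls against the I/O term, and adding disks alone stalls against the compute term, so the surface rises fastest along the diagonal.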
On Submodular Value Functions of Dynamic Programming
, 1995
Abstract

Cited by 7 (2 self)
We investigate in this paper submodular properties of the value functions arising in complex dynamic programs (DPs). We consider in particular DPs that include concatenation and linear combinations of standard DP operators, as well as combinations of maximizations and minimizations. These DPs have many applications and interpretations, both in stochastic control (and stochastic zero-sum games) and in the analysis of (non-controlled) discrete-event dynamic systems. The submodularity implies the monotonicity of the selectors appearing in the DP equations, which translates, in the context of stochastic control and stochastic games, into monotone optimal policies. Our work is based on the score-space approach of Glasserman and Yao.
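The submodularity property at the heart of the abstract is the lattice inequality f(x ∧ y) + f(x ∨ y) ≤ f(x) + f(y), where ∧ and ∨ are componentwise min and max. A small brute-force checker (illustrative only, unrelated to the paper's DP operators) makes the definition concrete:

```python
def is_submodular(f, grid):
    """Check the lattice submodularity inequality
    f(x meet y) + f(x join y) <= f(x) + f(y)
    (componentwise min/max) for all pairs of points on a finite grid."""
    for x in grid:
        for y in grid:
            meet = tuple(min(a, b) for a, b in zip(x, y))
            join = tuple(max(a, b) for a, b in zip(x, y))
            if f(meet) + f(join) > f(x) + f(y) + 1e-12:
                return False
    return True

grid = [(i, j) for i in range(4) for j in range(4)]
# f(x, y) = -x*y is submodular; f(x, y) = x*y is supermodular, not submodular
```

In the control setting the abstract describes, establishing this inequality for the value function is what yields monotone selectors and hence monotone optimal policies.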
Bounds on the Speedup and Efficiency of Partial Synchronization in Parallel Processing Systems
 Journal of the ACM
, 1993
Abstract

Cited by 7 (1 self)
In this paper, we derive bounds on the speedup and efficiency of applications that schedule tasks on a set of parallel processors. We assume that the application runs an algorithm that consists of N iterations, and before starting its (i+1)-st iteration, a processor must wait for data (i.e., synchronize) calculated in the i-th iteration by a subset of the other processors of the system. Processing times and interconnections between iterations are modeled by random variables with possibly deterministic distributions. Scientific applications consisting of iterations of recursive equations are examples of applications that can be modeled within this formulation. We consider the efficiency of such applications and show that, although efficiency decreases with an increase in the number of processors, it has a nonzero limit as the number of processors increases to infinity. We obtain a lower bound for the efficiency by solving an equation which depends on the distribution of task ...
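The synchronization structure described above, where a processor waits only on a subset of the processors from the previous iteration, can be written as a completion-time recursion. A small simulation sketch (assuming i.i.d. exponential task times with mean 1; this is an illustration, not the paper's analysis) contrasts full-barrier synchronization with nearest-neighbor synchronization:

```python
import random

def makespan(P, N, sync, seed=0):
    """Completion-time recursion for N iterations on P processors.
    With "barrier" sync every processor waits for all others each
    iteration; with "neighbor" sync processor i waits only for
    processors i-1, i, i+1 (a partial-synchronization pattern).
    Task times are i.i.d. exponentials with mean 1."""
    rng = random.Random(seed)
    T = [0.0] * P                        # completion times of current iteration
    for _ in range(N):
        X = [rng.expovariate(1.0) for _ in range(P)]
        if sync == "barrier":
            start = [max(T)] * P
        else:
            start = [max(T[max(i - 1, 0):i + 2]) for i in range(P)]
        T = [start[i] + X[i] for i in range(P)]
    return max(T)

def efficiency(P, N, sync):
    # ideal per-processor work: N iterations at mean task time 1
    return N * 1.0 / makespan(P, N, sync)
```

Because neighbor synchronization waits on a subset of what the barrier waits on, its makespan is never larger for the same task times, which is the intuition behind the nonzero efficiency limit the abstract establishes.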
Introduction to Probabilistic Performance Modelling of Parallel Applications
 in Proc. Parallel Computing '93
, 1993
Abstract

Cited by 7 (2 self)
This report describes the results of preliminary research in the field of probabilistic performance modelling of parallel applications. The work was carried out as part of the ProcMod (Processor Modelling) subproject of the parTool project. The ProcMod subproject aims at the development of a performance modelling technique and associated tool support which, based on a generic machine modelling paradigm, predicts parallel application performance at different hierarchical levels. In this way, performance feedback is available at all modelling levels, enabling the use of performance information during all stages of the application development process. The application of this technique is twofold: on the one hand, performance can be optimised by means of feedback to the user (or parallelising compiler) for a given target machine (or machine class); on the other hand, a comparative analysis of target machine performance can be made for a given parallel application (or application class). In an earlier report the emphasis was on the description of parallel computer architectures [18]. This report, on the other hand, concentrates on the performance modelling and prediction techniques.