Results 11-20 of 33
The M/M/1 fork-join queue with variable subtasks
Cited by 10 (0 self)
The fork-join queue models parallel resources where arriving jobs divide into a variable number of subtasks that are assigned to unique devices within the parallel resource. Each device in the parallel resource is modeled by M/M/1 queueing servers. A job completes execution and departs the parallel resource after all its subtasks complete execution. This paper analyzes K-server fork-join queues where arriving jobs divide into [...] subtasks that are assigned to unique servers of the fork-join queue. There is no known closed-form solution for [...] fork-join queues. The paper presents an O(log K) algorithm for computing pessimistic and optimistic bounds on the mean response time, and for computing an approximation of the mean response time of the fork-join queue. The error bounds for the response time bounds and approximation are presented. Index Terms: fork-join synchronization, performance evaluation, parallel computer and storage systems.
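The model in this abstract can be explored numerically with a small simulation. The sketch below is not the paper's O(log K) algorithm. It assumes Poisson arrivals, exponential service at every server, and a subtask count drawn uniformly from 1..K and placed on distinct randomly chosen servers. The function name and the uniform distribution are illustrative assumptions.

```python
import random


def simulate_fork_join(K, lam, mu, num_jobs, seed=0):
    """Estimate the mean job response time of a K-server fork-join queue.

    Assumptions (not taken from the paper): Poisson arrivals with rate lam,
    exponential service with rate mu at every server, and a subtask count n
    drawn uniformly from 1..K, placed on n distinct randomly chosen servers.
    """
    rng = random.Random(seed)
    free_at = [0.0] * K  # time at which each server clears its current backlog
    now = 0.0
    total_response = 0.0
    for _ in range(num_jobs):
        now += rng.expovariate(lam)  # arrival time of the next job
        n = rng.randint(1, K)        # number of subtasks for this job
        job_done = now
        for s in rng.sample(range(K), n):
            # FCFS recursion: a subtask starts once it has arrived
            # and the server has finished all earlier subtasks.
            free_at[s] = max(now, free_at[s]) + rng.expovariate(mu)
            job_done = max(job_done, free_at[s])  # join: wait for the slowest subtask
        total_response += job_done - now
    return total_response / num_jobs


if __name__ == "__main__":
    for K in (1, 2, 4, 8):
        print(K, round(simulate_fork_join(K, lam=0.5, mu=1.0, num_jobs=100_000), 3))
```

With K = 1 the model reduces to a single M/M/1 queue, so the estimate should sit close to 1/(mu - lam), which gives a quick sanity check on the simulator.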
Event-based performance perturbation: a case study
 In PPOPP ’91: Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
, 1991
Cited by 10 (0 self)
Determining the performance behavior of parallel computations requires some form of intrusive tracing measurement. The greater the need for detailed performance data, the more intrusion the measurement will cause. Recovering actual execution performance from perturbed performance measurements using event-based perturbation analysis is the topic of this paper. We show that the measurement and subsequent analysis of synchronization operations (particularly, advance and await) can produce, in practice, accurate approximations to actual performance behavior. We use as test cases three Lawrence Livermore loops that execute as parallel DOACROSS loops on an Alliant FX/80. The results of our experiments suggest that a systematic application of performance perturbation analysis techniques will allow more detailed, accurate instrumentation than traditionally believed possible.
PerPreT - A Performance Prediction Tool for Massively Parallel Systems
 PROCEEDINGS OF THE JOINT CONFERENCE ON PERFORMANCE TOOLS / MMB 1995
, 1995
Cited by 8 (0 self)
Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task. The Performance Prediction Tool (PerPreT) presented in this paper is useful for system designers and application developers. System designers can use the tool to examine the effects of changes in architectural parameters on parallel applications (e.g., reduction of setup time, increase of link bandwidth, faster execution units). Application developers are interested in a fast evaluation of different parallelization strategies for their codes. PerPreT uses a relatively simple analytical model to predict speedup, execution time, computation time, and communication time for a parametrized application. Especially for large numbers of processors, PerPreT's analytical model is preferable to traditional models (e.g., Markov-based approaches such as queueing and Petri net models)...
Queueing-theoretic solution methods for models of parallel and distributed systems
 Performance Evaluation of Parallel and Distributed Systems Solution Methods. CWI Tract 105 & 106
, 1994
Cited by 8 (3 self)
This paper aims to give an overview of solution methods for the performance analysis of parallel and distributed systems. After a brief review of some important general solution methods, we discuss key models of parallel and distributed systems, and optimization issues, from the viewpoint of solution methodology.
Two-Moment Approximations for Throughput and Mean Queue Length of a Fork/Join Station with General Inputs from Finite Populations
 To appear in Stochastic Modeling and Optimization of Manufacturing Systems and Supply Chains, J.G. Shanthikumar, D.D. Yao and W.H.M. Zijm (Eds.), Kluwer International Series in Operations Research and Management Science
, 2002
Computing Performance Bounds for Fork-Join Queueing Models
, 1994
Cited by 4 (1 self)
We study a computer system that accepts parallel programs which can be modeled using the fork-join computational paradigm. The system under study has $K$ homogeneous servers, each having an infinite capacity queue. Jobs arrive to the system according to a general interarrival process with mean arrival rate $\lambda$. Upon arrival, the job is split into $K$ independent tasks $t_i$, $1 \le i \le K$, and task $t_i$ is assigned to the $i$-th server. Each task requires a mean service time of $1/\mu$. Each server uses the First-Come-First-Serve (FCFS) scheduling discipline to service its tasks. A job is complete upon the completion of its last task. This kind of queueing model has no known closed-form solution in the general ($K > 2$) case. Rather than completely modifying the arrival and service distributions to obtain bounds on job response time in a fork-join queueing model, as reported in previous literature, we show that by modifying the arrival and service distributions at some imbedded points in time, we can obtain better performance bounds. We also provide an efficient algorithm that can compute upper and lower bounds on the expected response time of jobs in a fork-join queueing model. The methodology presented allows one to trade off the tightness of the bounds against computational cost. Examples are presented which show the excellent relative accuracy achievable with modest computational cost.
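The sketch below computes two elementary bounds on the mean job response time, to show what bounding the response time looks like in the simplest setting. They are not the bounds produced by the paper's algorithm. They assume Poisson arrivals with rate lam and exponential service with rate mu, whereas the abstract allows a general interarrival process. The lower bound holds because a job cannot finish before any single one of its tasks, and each task sees an M/M/1 queue. The upper bound treats the K queues as independent, a classical argument in the fork-join literature. The function names are illustrative.

```python
def harmonic(k):
    """Return the k-th harmonic number H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def fork_join_mm1_bounds(K, lam, mu):
    """Elementary bounds on mean job response time for K parallel M/M/1 queues.

    Assumptions (illustrative, not the paper's method): Poisson arrivals with
    rate lam, exponential service with rate mu, every job forks into K tasks.
    """
    if K < 1 or not 0.0 < lam < mu:
        raise ValueError("need K >= 1 and 0 < lam < mu for a stable system")
    mm1_response = 1.0 / (mu - lam)      # mean response time of one M/M/1 queue
    lower = mm1_response                 # a job is at least as slow as one of its tasks
    upper = harmonic(K) * mm1_response   # mean of the max of K independent exponentials
    return lower, upper


if __name__ == "__main__":
    for K in (1, 2, 4, 8, 16):
        lo, hi = fork_join_mm1_bounds(K, lam=0.5, mu=1.0)
        print(f"K={K:2d}  lower={lo:.3f}  upper={hi:.3f}")
```

The gap between the two values grows like the harmonic number H_K, which is one way to see why tighter bounds at a controllable computational cost are of interest.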
A Customized MVA Model for ILP Multiprocessors
, 1998
Cited by 2 (1 self)
This paper provides the customized MVA equations for an analytical model for evaluating architectural alternatives for shared-memory multiprocessors with processors that aggressively exploit instruction-level parallelism (ILP). Compared to simulation, the analytical model is many orders of magnitude faster to solve, yielding highly accurate system performance estimates in seconds.
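For readers unfamiliar with mean value analysis (MVA), the sketch below shows the textbook exact MVA recursion for a single-class closed queueing network. It is the generic starting point that customized models extend, not the paper's customized equations. The function name, parameter names, and the optional think time are illustrative assumptions.

```python
def exact_mva(demands, num_customers, think_time=0.0):
    """Textbook exact MVA for a single-class closed queueing network.

    demands[k] is the total service demand of one customer at queueing
    station k. Returns (throughput, response_time, mean_queue_lengths).
    This is the generic recursion, not the customized model from the paper.
    """
    queue = [0.0] * len(demands)  # mean queue lengths with zero customers
    throughput = 0.0
    response = 0.0
    for n in range(1, num_customers + 1):
        # Arrival theorem: an arriving customer sees the network as it would
        # be in steady state with one customer fewer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)          # Little's law, whole network
        queue = [throughput * r for r in residence]       # Little's law, per station
    return throughput, response, queue


if __name__ == "__main__":
    x, r, q = exact_mva([1.0, 1.0], num_customers=2)
    print(x, r, q)
```

The recursion costs time proportional to the number of customers times the number of stations, which fits the abstract's point that analytical models solve far faster than simulation.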
Performance Based Design of High-Level Language-Directed Computer Architectures
This paper is concerned with the analytical modeling of computer architectures to aid in the design of high-level language-directed computer architectures. High-level language-directed computers are computers that execute programs in a high-level language directly. The design procedure for these computers is at best described as ad hoc. In order to systematize the design procedure, we introduce analytical models of computers that predict the performance of parallel computations on concurrent computers. We model computers as queueing networks and parallel computations as precedence graphs. The models that we propose are simple and lead to computationally efficient procedures for predicting the performance of parallel computations on concurrent computers. We demonstrate the use of these models in the design of high-level language-directed computer architectures.
BY
, 2004
Approval of the Graduate School of Natural and Applied Sciences Prof. Dr. Canan Özgen