RAID: High-Performance, Reliable Secondary Storage
ACM Computing Surveys, 1994
Cited by 298 (6 self)
Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This paper gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. The paper first introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It then discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the paper describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0-6, and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the paper describes six disk array prototypes or products and discusses future opportunities for research. The paper includes an annotated bibliography of disk array-related literature.
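The two techniques named in this abstract, striping and redundancy, can be illustrated with a small sketch. It assumes one XOR parity unit per stripe and does not model how individual RAID levels place or rotate parity; the function names (`xor_blocks`, `stripe_with_parity`, `rebuild_unit`) are hypothetical and are not taken from the paper.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data, num_data_disks, unit):
    """Split data into stripes of `num_data_disks` units plus one parity unit.

    Assumption: the last stripe is zero-padded, and parity is always the
    final unit of each stripe (no parity rotation).
    """
    stripes = []
    stripe_bytes = num_data_disks * unit
    for start in range(0, len(data), stripe_bytes):
        chunk = data[start:start + stripe_bytes].ljust(stripe_bytes, b"\0")
        units = [chunk[i * unit:(i + 1) * unit] for i in range(num_data_disks)]
        stripes.append(units + [xor_blocks(units)])
    return stripes


def rebuild_unit(stripe, failed_index):
    """Reconstruct the unit at `failed_index` from the surviving units."""
    survivors = [u for i, u in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


stripes = stripe_with_parity(b"disk arrays stripe data", num_data_disks=3, unit=4)
```

Because XOR is its own inverse, any single lost unit in a stripe, data or parity, equals the XOR of the remaining units, which is what `rebuild_unit` relies on.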
LoPC: Modeling Contention in Parallel Algorithms
1997
Cited by 45 (9 self)
Parallel algorithm designers need computational models that take first-order system costs into account but are also simple enough to use in practice. This paper introduces the LoPC model, which is inspired by the LogP model but accounts for contention for message-processing resources in parallel algorithms on a multiprocessor or network of workstations. LoPC takes the L, o, and P parameters directly from the LogP model and uses them to predict the cost of contention, C.
Analytic Evaluation of Shared-Memory Systems with ILP Processors
In ISCA ’98: Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998
Cited by 38 (2 self)
This paper develops and validates an analytical model for evaluating various types of architectural alternatives for shared-memory systems with processors that aggressively exploit instruction-level parallelism. Compared to simulation, the analytical model is many orders of magnitude faster to solve, yielding highly accurate system performance estimates in seconds. The model input parameters characterize the ability of an application to exploit instruction-level parallelism as well as the interaction between the application and the memory system architecture. A trace-driven simulation methodology is developed that allows these parameters to be generated over 100 times faster than with a detailed execution-driven simulator. Finally, this paper shows that the analytical model can be used to gain insights into application performance and to evaluate architectural design tradeoffs.
Using the exact state space of a Markov model to compute approximate stationary measures
Proc. 2000 ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, 2000
Cited by 19 (9 self)
We present a new approximation algorithm based on an exact representation of the state space S, using decision diagrams, and of the transition rate matrix R, using Kronecker algebra, for a Markov model with K submodels. Our algorithm builds and solves K Markov chains, each corresponding to a different aggregation of the exact process, guided by the structure of the decision diagram, and iterates on their solution until their entries are stable. We prove that exact results are obtained if the overall model has a product-form solution. Advantages of our method include good accuracy, low memory requirements, fast execution times, and a high degree of automation, since the only additional information required to apply it is a partition of the model into the K submodels. As far as we know, this is the first time an approximation algorithm has been proposed where knowledge of the exact state space is explicitly used.
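As a rough illustration of the Kronecker idea mentioned in this abstract, the sketch below covers only the simplest special case: two independent two-state submodels, whose joint generator is the Kronecker sum of the submodel generators and whose stationary vector is the product of the submodel stationary vectors. It does not reproduce the paper's algorithm (decision diagrams, the K aggregated chains, or the iteration); all names and rate values are assumptions chosen for the example.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kron_sum(q1, q2):
    """Generator of two independent chains: Q1 (x) I + I (x) Q2."""
    return mat_add(kron(q1, identity(len(q2))), kron(identity(len(q1)), q2))


def two_state_generator(rate_01, rate_10):
    """Generator of a two-state chain with rates 0->1 and 1->0."""
    return [[-rate_01, rate_01], [rate_10, -rate_10]]


def two_state_stationary(rate_01, rate_10):
    """Closed-form stationary vector of the two-state chain."""
    total = rate_01 + rate_10
    return [rate_10 / total, rate_01 / total]


def vec_mat(pi, q):
    """Row vector times matrix."""
    return [sum(pi[i] * q[i][j] for i in range(len(pi))) for j in range(len(q[0]))]


# Hypothetical rates for the two submodels.
q = kron_sum(two_state_generator(2.0, 3.0), two_state_generator(1.0, 4.0))
pi1 = two_state_stationary(2.0, 3.0)
pi2 = two_state_stationary(1.0, 4.0)
pi = [x * y for x in pi1 for y in pi2]  # product of the marginals
residual = vec_mat(pi, q)               # should be (numerically) zero
```

The joint generator has 4 x 4 entries but is fully described by two 2 x 2 matrices, which hints at why a Kronecker representation keeps memory requirements low.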
NMAP: A Virtual Processor Discrete Event Simulation Tool for Performance Prediction in CAPSE
In 28th Annual Hawaii International Conference on Systems Sciences, 1995
Cited by 17 (6 self)
The CAPSE (Computer Aided Parallel Software Engineering) environment aims to assist a performance-oriented approach to parallel program development by integrating tools for performance prediction in the design phase, analytical or simulation-based performance analysis in the detailed specification and coding phase, and finally monitoring in the testing and correction phase. In this work, the NMAP tool is presented as part of the CAPSE environment. NMAP covers the crucial aspect of performance prediction to support a performance-oriented, incremental development process for parallel applications, such that implementation design choices can be investigated far ahead of the full coding of the application. Methodologically, NMAP uses an automatic parse-and-translate step to generate a simulation program from a skeletal SPMD program, in which the programmer expresses only the constituent and performance-critical program parts, subject to incremental refinement. The simulated execution of the SPMD skeleton supports a variety of performance studies. We demonstrate the use and performance of the NMAP tool by developing a linear system solver for the CM-5.
Analyzing Concurrent and Fault-Tolerant Software Using Stochastic Reward Nets
Journal of Parallel and Distributed Computing, 1992
Cited by 17 (6 self)
We present two software applications and develop models for them. The first application considers a producer-consumer tasking system with an intermediate buffer task and studies how the performance is affected by different selection policies when multiple tasks are ready to synchronize. The second application studies the reliability of a fault-tolerant software system using the recovery block scheme. The model is incrementally augmented by considering clustered failures or the effective arrival rate of inputs to the system. We use stochastic reward nets, a variant of stochastic Petri nets, to model the two software applications. In both models, each quantity to be computed is defined in terms of either the expected value of a reward rate in steady state or at a given time, or as the expected value of the accumulated reward until absorption or until a given time. This allows extreme flexibility while maintaining a rigorous formalization of these quantities.
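The four kinds of measures listed in this abstract can be written out in the usual Markov reward model form. The notation is an assumption made for this sketch and is not taken from the paper: r_i is the reward rate assigned to state (marking) i, \pi_i is its steady-state probability, \pi_i(t) is its probability at time t, and T is the set of transient states.

```latex
\begin{align*}
\text{reward rate in steady state:} \quad
  & E[\rho] = \sum_{i} r_i \, \pi_i \\
\text{reward rate at time } t\text{:} \quad
  & E[\rho(t)] = \sum_{i} r_i \, \pi_i(t) \\
\text{accumulated reward until time } t\text{:} \quad
  & E[Y(t)] = \sum_{i} r_i \int_0^{t} \pi_i(u) \, du \\
\text{accumulated reward until absorption:} \quad
  & E[Y(\infty)] = \sum_{i \in T} r_i \int_0^{\infty} \pi_i(u) \, du
\end{align*}
```

For example, setting r_i = 1 in every operational state and 0 elsewhere turns the second expression into the probability that the system is operational at time t.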
On Performance Prediction of Parallel Computations with Precedent Constraints
1994
Cited by 13 (0 self)
Performance analysis of concurrent executions in parallel systems has been recognized as a challenging problem. The aim of this research is to study approximate but efficient solution techniques for this problem. We model the structure of a parallel machine and the structure of the jobs executing on such a system. We investigate rich classes of jobs, which can be expressed by series, parallel-and, parallel-or, and probabilistic-fork. We propose an efficient performance prediction method for these classes of jobs running on a parallel environment which is modeled by a standard queueing network model. The proposed prediction method is computationally efficient: it has polynomial complexity in both time and space. The time complexity is O(C²N²K) and the space complexity is O(C²N²K), where C is the number of job classes in the system, the number of tasks in each job class is O(N), and K is the number of service centers in the queueing model. The accuracy of the approxi...
Computing Performance Bounds of Fork-Join Parallel Programs Under a Multiprocessing Environment
IEEE Trans. on Parallel and Distributed Systems, 1998
Cited by 13 (1 self)
We study a multiprocessing computer system which accepts parallel programs that have a fork-join computational paradigm. The multiprocessing computer system under study is modeled as K homogeneous servers, each with an infinite-capacity queue. Parallel programs arrive at the multiprocessing system according to a series-parallel phase-type interarrival process with mean arrival rate λ. Upon arrival, a program forks into K independent tasks and each task is assigned to a unique server. Each task's service time has a k-stage Erlang distribution with mean service time 1/μ. A parallel program is completed upon the completion of its last task. This kind of queueing model has no known closed-form solution in the general (K ≥ 2) case. In this paper, we show that by carefully modifying the arrival and service distributions at some imbedded points in time, we can obtain tight performance bounds. We also provide a computationally efficient algorithm for obtaining upper and lower bounds o...
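The queueing system described in this abstract can be explored numerically with a short simulation. The sketch below is not the paper's bounding technique; it is a plain Monte Carlo estimate under a simplifying assumption, namely Poisson arrivals in place of the series-parallel phase-type arrival process. The function name and parameter names are hypothetical.

```python
import random


def simulate_fork_join(num_servers, arrival_rate, service_rate, stages,
                       num_jobs, seed=0):
    """Estimate the mean response time of a fork-join queue.

    Each job forks into one task per server, every server is a FIFO
    queue, and the job completes when its last task completes.
    Assumption: exponential interarrival times (Poisson arrivals).
    """
    rng = random.Random(seed)
    clock = 0.0
    server_free_at = [0.0] * num_servers
    total_response = 0.0
    for _ in range(num_jobs):
        clock += rng.expovariate(arrival_rate)
        last_finish = 0.0
        for s in range(num_servers):
            # k-stage Erlang with mean 1/service_rate: the sum of k
            # exponential stages, each with rate k * service_rate.
            service = sum(rng.expovariate(stages * service_rate)
                          for _ in range(stages))
            start = max(clock, server_free_at[s])
            server_free_at[s] = start + service
            last_finish = max(last_finish, server_free_at[s])
        total_response += last_finish - clock
    return total_response / num_jobs


mean_k4 = simulate_fork_join(num_servers=4, arrival_rate=0.5,
                             service_rate=1.0, stages=2, num_jobs=20000)
```

With a single server and one Erlang stage the model reduces to an M/M/1 queue, whose mean response time 1/(μ - λ) gives a quick sanity check on the simulator.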
An Analytic Performance Model of Disk Arrays and its Applications
1991
Cited by 10 (6 self)
As disk arrays become widely used, tools for understanding and analyzing their performance become increasingly important. In particular, performance models can be invaluable in both configuring and designing disk arrays. Accurate analytic performance models are desirable over other types of models because they can be quickly evaluated, are applicable under a wide range of system and workload parameters, and can be manipulated by a range of mathematical techniques. Unfortunately, analytic performance models of disk arrays are difficult to formulate due to the presence of queuing and fork-join synchronization; a disk array request is broken up into independent disk requests which must all complete to satisfy the original request. In this paper, we develop, validate and apply an analytic performance model for disk arrays. We derive simple equations for approximating their utilization, response time and throughput. We then validate the analytic model via simulation and investigate the accuracy of each approximation used in deriving the analytic model. Finally, we apply the analytic model to derive an equation for the optimal unit of data striping in disk arrays.
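The fork-join behavior mentioned in this abstract, where one array request becomes several independent disk requests, can be sketched as an address mapping. The sketch assumes simple round-robin striping with no redundancy; `split_request` and `request_response_time` are hypothetical names, and the paper's analytic model is not reproduced here.

```python
def split_request(offset, length, stripe_unit, num_disks):
    """Split a logical byte range into (disk, disk_offset, nbytes) pieces.

    Assumption: stripe units are assigned to disks round-robin and
    there is no parity or mirroring.
    """
    pieces = []
    end = offset + length
    while offset < end:
        unit_index = offset // stripe_unit
        within = offset % stripe_unit
        nbytes = min(stripe_unit - within, end - offset)
        disk = unit_index % num_disks
        disk_offset = (unit_index // num_disks) * stripe_unit + within
        pieces.append((disk, disk_offset, nbytes))
        offset += nbytes
    return pieces


def request_response_time(per_disk_times):
    """The join step: the request finishes when its slowest piece does."""
    return max(per_disk_times)


pieces = split_request(offset=6, length=10, stripe_unit=4, num_disks=3)
# pieces == [(1, 2, 2), (2, 0, 4), (0, 4, 4)]
```

A smaller stripe unit spreads a request over more disks, which increases parallelism within the request but also makes more disks busy per request; the abstract's "optimal unit of data striping" concerns this tradeoff.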