Distributed hierarchical control for parallel processing
- Computer, 1990
Cited by 70 (12 self)
The development of operating systems for parallel computers has closely followed that for serial computers. At first, the most advanced parallel computers ran in batch mode or single-user mode. At best, they allowed a static partitioning among a number of users. They were typically designed with a specific computational task in mind or for a certain class of computations. Like serial computers, they are currently evolving towards general-purpose, interactive, multiuser parallel systems. To explain the underlying motivation for our work, we note that a general-purpose, interactive, multiuser, multiprogramming parallel environment has the following advantages (in addition to the traditional advantages in uniprocessor environments, such as cost effectiveness):
• This environment provides users with a spectrum of computational powers, covering the range from personal computers to supercomputers. A user requiring more computational power can simply use more processors. Thus, a short response time for both simple and computationally intensive tasks is possible.
• The spectrum of powers also aids program development and evaluation. Initially, only one processor is needed. Additional processors can be added later with-
Non-blocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors
- Journal of Parallel and Distributed Computing, 1998
Cited by 65 (1 self)
Most multiprocessors are multiprogrammed in order to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) non-blocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Non-blocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional, and are widely regarded as inefficient. We evaluate the performance of preemption-safe lock-based and non-blocking implementations of important data structures (queues, stacks, heaps, and counters), including non-blocking and lock-based queue algorithms of our own, in micro-benchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our non-blocking queue consistently outperforms the best known alternatives, and that data-structure-specific non-blocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose non-blocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform...
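The compare-and-swap retry loop that non-blocking algorithms rely on can be sketched as follows. This is a minimal illustration, not the queue algorithm from the paper: it shows a Treiber-style stack, and because Python exposes no hardware compare-and-swap, the hypothetical `AtomicRef` class simulates one with an internal lock. All class and method names here are assumptions made for the example.

```python
import threading


class AtomicRef:
    """Stand-in for a word updated by hardware compare-and-swap.

    Assumption: the lock only simulates the atomicity of a single CAS
    instruction; a real implementation would use the hardware primitive.
    """

    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Install `new` only if the current value is still `expected`.
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    __slots__ = ("item", "next")

    def __init__(self, item, next_node):
        self.item = item
        self.next = next_node


class NonBlockingStack:
    """Treiber-style stack: every update is a read followed by one CAS."""

    def __init__(self):
        self._top = AtomicRef(None)

    def push(self, item):
        while True:
            old_top = self._top.load()
            # Retry if another thread changed the top in the meantime.
            if self._top.compare_and_swap(old_top, Node(item, old_top)):
                return

    def pop(self):
        while True:
            old_top = self._top.load()
            if old_top is None:
                return None  # empty stack
            if self._top.compare_and_swap(old_top, old_top.next):
                return old_top.item
```

No thread ever holds a lock across the whole operation, so a thread preempted between `load` and `compare_and_swap` cannot stall the others; it simply retries when it resumes.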
Information and Computation: Classical and Quantum Aspects
- Reviews of Modern Physics, 2001
Cited by 21 (2 self)
Quantum theory has found a new field of applications in the realm of information and computation in recent years. This paper reviews how quantum physics allows information coding in classically unexpected and subtle nonlocal ways, as well as information processing with an efficiency largely surpassing that of the present and foreseeable classical computers. Some outstanding aspects of classical and quantum information theory will be addressed here. Quantum teleportation, dense coding, and quantum cryptography are discussed as a few samples of the impact of quanta in the transmission of information. Quantum logic gates and quantum algorithms are also discussed as instances of the improvement in information processing by a quantum computer. We provide finally some examples of current experimental...
Least Common Ancestor Networks
- 1993
Cited by 12 (2 self)
Least Common Ancestor Networks (LCANs) are introduced and shown to be a class of networks that include fat-trees, baseline networks, SW-banyans and the router networks of the TRAC 1.1 and 2.0, and the CM-5. Some LCAN properties are stated and the permutation routing capabilities of an important subclass are analyzed. Simulation results for three permutation classes verify the accuracy of an iterative solution for a randomized routing strategy.
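The idea of routing through a least common ancestor can be illustrated on the simplest tree-shaped case. The sketch below assumes a complete binary tree whose leaves are numbered 0 to N-1, which is a simplification of the networks the abstract names (fat-trees and the others may have several parents per node and wider switches). The function names `lca_height` and `route` are hypothetical.

```python
def lca_height(src: int, dst: int) -> int:
    """Height above the leaves of the least common ancestor of two leaves.

    In a complete binary tree, two leaves share an ancestor at height h
    exactly when their indices agree after dropping the low h bits, so the
    lowest such h is the bit length of src XOR dst.
    """
    return (src ^ dst).bit_length()


def route(src: int, dst: int):
    """List the (height, index) switches visited from leaf src to leaf dst.

    The message climbs to the least common ancestor, then descends.
    """
    h = lca_height(src, dst)
    up = [(level, src >> level) for level in range(1, h + 1)]
    down = [(level, dst >> level) for level in range(h - 1, 0, -1)]
    return up + down
```

For example, `route(2, 3)` visits only the shared parent at height 1, while `route(0, 7)` in an 8-leaf tree climbs to the root at height 3 before descending.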
Project Mars: Scalable, High Performance, Web Based Multimedia-On-Demand (MOD) Services And Servers
- 1998
Cited by 12 (0 self)
This dissertation describes cost-effective design and implementation of scalable web based high performance multimedia-on-demand (MOD) servers and services. An important aspect of this dissertation has been prototyping, deploying MOD applications, services and servers and learning from this experience. The three main components of this dissertation are (1) Web based interactive MOD services, (2) innovative enhancements to a server node operating system (OS) to support such MOD services, and (3) design and prototyping of a scalable server architecture and associated data layout and scheduling schemes to support a large number of independent, concurrent clients. We first describe design and prototyping of two example multimedia-on-demand services, namely interactive recording service for content crea...
Compile-Time Scheduling of Dataflow Program graphs with Dynamic Constructs
- University of California, Berkeley, 1992
The N-Body Problem: Distributed System Load Balancing And Performance Evaluation
- In Proceedings of the 6th International Conference on Parallel and Distributed Computing Systems, 1993
Cited by 6 (2 self)
In this paper, the N-body simulation problem is considered, its parallel implementation is described, its execution time performance is modeled and compared with measured results, and two alternative load balancing algorithms for enhancing performance are investigated. Parallel N-body techniques are widely applied in a number of fields [9] ranging from astrophysics, to fluid dynamics, to computational geometry [7]. They require dynamically changing, non-uniform, intensive computation and irregular, unstructured communication. They are therefore good candidates for use as parallel computing benchmarks [8] and the results presented in this paper are part of an ongoing effort at developing such parallel benchmarks. In addition, the N-body simulation algorithm presented here is prototypical of a wide class of algorithms referred to as synchronous iterative algorithms. The models and results given in this paper apply to much of this class. The N-body simulation algorithm has been implemented on a network of SUN workstations connected by standard Ethernet, running under the PVM [13] environment. Performance has been measured and models verified in this manner. An important step in the design of high performance computing systems is to study possible enhancements to the existing system and estimate the performance implications of these enhancements. Similarly, certain modifications to the algorithms may also lead to significant performance improvements. To evaluate these modifications and their performance implications, a general performance model is needed that can predict the performance of an algorithm on a particular system. This paper presents a general framework for developing simple mean-value oriented performance models for a class of synchronous iterative algorithms. The per...
A System to Read Names and Addresses on Tax Forms
- 1996
Cited by 4 (0 self)
The reading of names and addresses is one of the most complex tasks in automated forms processing. This paper describes an integrated real-time system to read names and addresses on tax forms of the Internal Revenue Service of the United States. The Name and Address Block Reader (NABR) system accepts both machine-printed and handprinted address block images as input. The application software has two major steps: document analysis (connected component analysis, address block extraction, label detection, hand-print/machine-print discrimination) and document recognition. Document recognition has two non-identical streams for machine-print and hand-print; key steps are: address parsing, character recognition, word recognition and postal database lookup (ZIP+4 and City-State-ZIP files). System output is a packet containing the results of recognition together with database access status file. Real-time throughput (8,500 forms per hour) is achieved by employing a loosely-coupled mult...
A Study of an Evaluation Methodology for Unbuffered Multistage Interconnection Networks
- In Proceedings of 17th International Parallel and Distributed Processing Symposium, IPDPS’03, 2003
Cited by 3 (1 self)
Interconnection network performance is a key factor when constructing parallel computers. The choice of an interconnection network used in a parallel computer depends on a large number of performance factors which are very often application-dependent. We try in this paper to give the outlines of a performance evaluation and comparison methodology using what we think of as the most important parameters to be considered when solving such a problem. This methodology is applied to a new interconnection network called the MCRB network and to the Omega network.
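For readers unfamiliar with the Omega network, the sketch below shows standard destination-tag routing through it and a simple check for link conflicts, which in an unbuffered network force one of the contending messages to be dropped or retried. It is an illustration only, not the evaluation methodology of the paper, and it does not model the MCRB network. The function names `omega_path` and `conflicts` are hypothetical, and the network is assumed to have 2**n_bits inputs with n_bits stages of 2x2 switches.

```python
def omega_path(src: int, dst: int, n_bits: int):
    """Return the link index occupied after each stage, starting at src.

    Each stage applies a perfect shuffle (rotate the address left by one
    bit), then the 2x2 switch picks its upper or lower output according
    to the next destination bit, most significant bit first.
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = [pos]
    for stage in range(n_bits):
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask  # perfect shuffle
        bit = (dst >> (n_bits - 1 - stage)) & 1            # routing tag bit
        pos = (pos & ~1) | bit                             # switch output
        path.append(pos)
    return path


def conflicts(pairs, n_bits: int):
    """Return the (stage, link) slots claimed by more than one message."""
    owner = {}
    clashes = set()
    for src, dst in pairs:
        for stage, link in enumerate(omega_path(src, dst, n_bits)[1:]):
            key = (stage, link)
            if key in owner and owner[key] != (src, dst):
                clashes.add(key)
            owner.setdefault(key, (src, dst))
    return clashes
```

With `n_bits=3`, the identity permutation passes with no conflicts, whereas sending 0 to 0 and 4 to 1 at the same time makes both messages request the same output of the first-stage switch.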

