Results 1–10 of 97
Complexity Measures and Decision Tree Complexity: A Survey
 Theoretical Computer Science
, 2000
Abstract

Cited by 123 (14 self)
We discuss several complexity measures for Boolean functions: certificate complexity, sensitivity, block sensitivity, and the degree of a representing or approximating polynomial. We survey the relations and biggest gaps known between these measures, and show how they give bounds for the decision tree complexity of Boolean functions on deterministic, randomized, and quantum computers.

1 Introduction

Computational Complexity is the subfield of Theoretical Computer Science that aims to understand "how much" computation is necessary and sufficient to perform certain computational tasks. For example, given a computational problem, it tries to establish tight upper and lower bounds on the length of the computation (or on other resources, like space). Unfortunately, for many practically relevant computational problems no tight bounds are known. An illustrative example is the well-known P versus NP problem: for all NP-complete problems the current upper and lower bounds lie exponentially ...
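For small functions, the measures named above can be computed by brute force. A minimal sketch in Python (our own illustration, not code from the survey; OR on 3 bits is a hypothetical running example):

```python
from itertools import combinations, product

n = 3
inputs = list(product((0, 1), repeat=n))

def f(x):
    # OR of the input bits
    return int(any(x))

def flip(x, i):
    # x with bit i flipped
    return tuple(b ^ (j == i) for j, b in enumerate(x))

def sensitivity(f):
    # s(f): max over inputs x of the number of bits whose flip changes f(x)
    return max(sum(f(x) != f(flip(x, i)) for i in range(n)) for x in inputs)

def certificate_complexity(f):
    # C(f): max over inputs x of the smallest set of bits that, once fixed to
    # their values in x, forces the value f(x) on every consistent input
    def cert(x):
        for k in range(n + 1):
            for S in combinations(range(n), k):
                if all(f(y) == f(x) for y in inputs
                       if all(y[i] == x[i] for i in S)):
                    return k
    return max(cert(x) for x in inputs)

print(sensitivity(f), certificate_complexity(f))  # OR on 3 bits: both equal 3
```

On the all-zeros input, flipping any bit changes OR, and no proper subset of bits certifies the value 0, which is why both measures equal n here.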
Communication-Efficient Parallel Sorting
, 1996
Abstract

Cited by 64 (2 self)
We study the problem of sorting n numbers on a p-processor bulk-synchronous parallel (BSP) computer, which is a parallel multicomputer that allows for general processor-to-processor communication rounds provided each processor sends and receives at most h items in any round. We provide parallel sorting methods that use internal computation time that is O((n log n)/p) and a number of communication rounds that is O(log n / log(h+1)) for h = Θ(n/p). The internal computation bound is optimal for any comparison-based sorting algorithm. Moreover, the number of communication rounds is bounded by a constant for the (practical) situations when p ≤ n^(1−1/c) for a constant c ≥ 1. In fact, we show that our bound on the number of communication rounds is asymptotically optimal for the full range of values for p, for we show that just computing the "or" of n bits distributed evenly to the first O(n/h) of an arbitrary number of processors in a BSP computer requires Ω(log n / log(h...
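As a quick numeric check of the round bound, the following sketch (our own illustration with made-up parameter choices, not the paper's code) evaluates log n / log(h+1) with h = n/p and constants dropped:

```python
import math

def bsp_sorting_rounds(n, p):
    # Round bound O(log n / log(h+1)) with h = Theta(n/p);
    # we take h = n // p and ignore constant factors.
    h = n // p
    return math.ceil(math.log(n) / math.log(h + 1))

# When p <= n^(1 - 1/c), h is a fractional power of n,
# so the number of rounds is a small constant:
n = 10**6
for c in (2, 3):
    p = int(n ** (1 - 1 / c))
    print(c, bsp_sorting_rounds(n, p))
```

With n = 10^6 and c = 2 (p ≈ 1000, h ≈ 1000) the bound evaluates to 2 rounds, matching the claim that the round count stays constant in that regime.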
Contention in Shared Memory Algorithms
, 1993
Abstract

Cited by 62 (1 self)
Most complexity measures for concurrent algorithms for asynchronous shared-memory architectures focus on process steps and memory consumption. In practice, however, performance of multiprocessor algorithms is heavily influenced by contention, the extent to which processes access the same location at the same time. Nevertheless, even though contention is one of the principal considerations affecting the performance of real algorithms on real multiprocessors, there are no formal tools for analyzing the contention of asynchronous shared-memory algorithms. This paper introduces the first formal complexity model for contention in multiprocessors. We focus on the standard multiprocessor architecture in which n asynchronous processes communicate by applying read, write, and read-modify-write operations to a shared memory. We use our model to derive two kinds of results: (1) lower bounds on contention for well-known basic problems such as agreement and mutual exclusion, and (2) tradeoffs betwe...
Hundreds of Impossibility Results for Distributed Computing
 Distributed Computing
, 2003
Abstract

Cited by 41 (5 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space, and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
On the Number of Rounds Necessary to Disseminate Information (Extended Abstract)
 In First ACM Symposium on Parallel Algorithms and Architectures (SPAA)
, 1989
Abstract

Cited by 34 (2 self)
S. Even, B. Monien

We study how efficiently information can be spread in a communication network and ask how many rounds it takes until all processors know all pieces of information. This problem has a well-known solution in the "telephone communication mode", where in each round each processor can send or receive only via one of its links and the communication is two-way. For the "telegraph communication mode", where in each round each processor is likewise active only via one of its links but the communication is one-way, i.e. each processor can either send or receive, up to now only an upper bound was known. We prove a lower bound which differs from the upper bound by at most an additive constant of 1. Our lower bound technique uses elements from matrix theory, especially matrix norms. This result shows for the first time that in the two-way mode information can be distributed faster than in the one-way mode. We also apply our upper and lower bound techniques for chara...
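For intuition about the two-way mode: when n is a power of two, pairing processors along hypercube dimensions spreads all information in log2 n rounds. A toy simulation (our own illustration, not the paper's construction):

```python
def telephone_gossip(log_n):
    """Simulate two-way (telephone-mode) gossiping among n = 2**log_n processors.

    In round k every processor exchanges everything it knows with the partner
    whose id differs in bit k (a perfect matching in each round). Returns the
    number of rounds after which everyone knows all n items.
    """
    n = 1 << log_n
    knows = [{i} for i in range(n)]
    for k in range(log_n):
        knows = [knows[i] | knows[i ^ (1 << k)] for i in range(n)]
    assert all(len(s) == n for s in knows)  # every processor knows everything
    return log_n

print(telephone_gossip(3))  # 8 processors are done after 3 rounds
```

Each round doubles what every processor knows, which is why log2 n rounds suffice; in the one-way (telegraph) mode the exchange cannot be simultaneous, which is the gap the paper's lower bound pins down.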
Parallel Sorting With Limited Bandwidth
 in Proc. 7th ACM Symp. on Parallel Algorithms and Architectures
, 1995
Abstract

Cited by 26 (5 self)
We study the problem of sorting on a parallel computer with limited communication bandwidth. By using the recently proposed PRAM(m) model, where p processors communicate through a small, globally shared memory consisting of m bits, we focus on the tradeoff between the amount of local computation and the amount of inter-processor communication required for parallel sorting algorithms. We prove a lower bound of Ω((n log m)/m) on the time to sort n numbers in an exclusive-read variant of the PRAM(m) model. We show that Leighton's Columnsort can be used to give an asymptotically matching upper bound in the case where m grows as a fractional power of n. The bounds are of a surprising form, in that they have little dependence on the parameter p. This implies that attempting to distribute the workload across more processors while holding the problem size and the size of the shared memory fixed will not improve the optimal running time of sorting in this model. We also show that bot...
Parallel RAMs with Owned Global Memory and Deterministic Context-Free Language Recognition
, 1997
Abstract

Cited by 26 (0 self)
We identify and study a natural and frequently occurring subclass of Concurrent-Read, Exclusive-Write Parallel Random Access Machines (CREW PRAMs). Called Concurrent-Read, Owner-Write, or CROW PRAMs, these are machines in which each global memory location is assigned a unique "owner" processor, which is the only processor allowed to write into it. Considering the difficulties that would be involved in physically realizing a full CREW PRAM model, it is interesting to observe that in fact, most known CREW PRAM algorithms satisfy the CROW restriction or can be easily modified to do so. This paper makes three main contributions. First, we formally define the CROW PRAM model and demonstrate its stability
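The owner-write restriction has a simple operational reading: a write trace is CROW-valid only if every write to a location comes from that location's unique owner, while reads stay unrestricted. A minimal checker (hypothetical names, just to pin down the definition):

```python
def is_crow_trace(writes, owner):
    """writes: iterable of (processor, location) write events.
    owner: maps each global memory location to its unique owner processor.
    Under the CROW restriction only the owner may write a location;
    concurrent reads remain allowed and are not checked here."""
    return all(owner[loc] == proc for proc, loc in writes)

owner = {"x": 0, "y": 1}
print(is_crow_trace([(0, "x"), (1, "y")], owner))  # True: each cell written by its owner
print(is_crow_trace([(1, "x")], owner))            # False: processor 1 does not own x
```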
The Queue-Read Queue-Write PRAM Model: Accounting for Contention in Parallel Algorithms
 Proc. 5th ACM-SIAM Symp. on Discrete Algorithms
, 1997
Abstract

Cited by 24 (10 self)
Abstract. This paper introduces the queue-read queue-write (QRQW) parallel random access machine (PRAM) model, which permits concurrent reading and writing to shared-memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. Prior to this work there were no formal complexity models that accounted for the contention to memory locations, despite its large impact on the performance of parallel programs. The QRQW PRAM model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied CRCW PRAM or EREW PRAM models: the CRCW model does not adequately penalize algorithms with high contention to shared-memory locations, while the EREW model is too strict in its insistence on zero contention at each step. The QRQW PRAM is strictly more powerful than the EREW PRAM. This paper shows a separation of log n between the two models, and presents faster and more efficient QRQW algorithms for several basic problems, such as linear compaction, leader election, and processor allocation. Furthermore, we present a work-preserving emulation of the QRQW PRAM with only logarithmic slowdown on Valiant's BSP model, and hence on hypercube-type non-combining networks, even when latency, synchronization, and memory granularity overheads are taken into account. This matches the best-known emulation result for the EREW PRAM, and considerably improves upon the best-known efficient emulation for the CRCW PRAM on such networks. Finally, the paper presents several lower bound results for this model, including lower bounds on the time required for broadcasting and for leader election.
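The QRQW cost rule can be stated in a few lines: a step's time is the maximum queue length at any one memory location, and an algorithm's time is the sum over its steps. A sketch under that reading (the access schedules below are made up for illustration, not from the paper):

```python
from collections import Counter

def qrqw_step_time(accesses):
    # Time of one QRQW step: the longest queue, i.e. the max number of
    # readers/writers targeting any single memory location in this step.
    return max(Counter(accesses).values()) if accesses else 0

def qrqw_time(steps):
    # Total time: sum of per-step queue maxima.
    return sum(qrqw_step_time(step) for step in steps)

p = 8
# Naive broadcast: all p processors read cell 0 in one step -> time p.
naive = [["cell0"] * p]
# Tree broadcast: the value fans out over log2(p) steps,
# with 2 readers per cell in each step -> time 2 * log2(p).
tree = [[f"cell{j}" for j in range(1 << k) for _ in range(2)] for k in range(3)]
print(qrqw_time(naive), qrqw_time(tree))  # 8 6
```

This is exactly the regime the abstract describes: CRCW would charge both schedules one unit per step, EREW would reject both outright, and QRQW charges in proportion to the actual queueing.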
Combining Tentative and Definite Executions for Very Fast Dependable Parallel Computing (Extended Abstract)
, 1991
Abstract

Cited by 24 (3 self)
We present a general and efficient strategy for computing robustly on unreliable parallel machines. The model of a parallel machine that we use is a CRCW PRAM with dynamic resource fluctuations: processors can fail during the computation, and may possibly be restored later. We first introduce the notions of definite and tentative algorithms for executing a single parallel step of an ideal parallel machine on the unreliable machine. A definite algorithm is one that guarantees a correct execution of a
Computational Models of the Utility Problem and their Application to a Utility Analysis of Case-Based Reasoning
 In Proceedings of the Workshop on Knowledge Compilation and Speed-Up Learning
, 1993
Abstract

Cited by 21 (4 self)
computational analysis of these factors. We use this method to analyze different types of problem solving systems, including unguided search, control rule, and case-based systems. While some aspects of the utility problem have been studied in some kinds of problem solvers, our analysis provides a general and theoretical framework for addressing this problem in problem solvers that have been studied empirically (e.g., control-rule learning systems) and in problem solvers for which little utility analysis has been performed (e.g., case-based reasoning systems). In particular, our analysis reveals that case-based reasoning systems may suffer from several different utility problems, and suggests coping strategies that would help avoid these problems as these systems are scaled up. The utility problem in learning systems occurs when knowledge learned in an attempt to improve a system's performance degrades performance instead. There are many types of utility problems and many factors that c...