Distribution Sort with Randomized Cycling
- In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA-01)
, 2001
Cited by 13 (3 self)
Parallel independent disks can enhance the performance of external memory (EM) algorithms, but the programming task is often difficult. In this paper we develop randomized variants of distribution sort for use with parallel independent disks. We propose a simple variant called randomized cycling distribution sort (RCD) and prove that it has optimal expected I/O complexity. The analysis uses a novel reduction to a model with significantly fewer probabilistic interdependencies. Experimental evidence is provided to support its practicality. Other simple variants are also examined experimentally and appear to offer similar advantages to RCD. Based upon ideas in RCD we propose general techniques that transparently simulate algorithms developed for the unrealistic multihead disk model so that they can be run on the realistic parallel disk model. The simulation is optimal for two important classes of algorithms: the class of multipass algorithms, which make a complete pass through their data before accessing any element a second time, and the algorithms based upon the well-known distribution paradigm of EM computation.
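As a rough illustration of the "cycling" idea named in this abstract, the sketch below places the blocks of each bucket on D disks by cycling through a random disk order drawn separately for each bucket. This is a simplified reading of the name, not the paper's algorithm or analysis; the function `assign_blocks`, its parameters, and the seeding are hypothetical.

```python
import random


def assign_blocks(blocks_per_bucket, num_disks, seed=0):
    """Map (bucket, block index) -> disk by cycling through a random disk order.

    Assumption for illustration only: each bucket draws its own random
    permutation of the disks and writes consecutive blocks to consecutive
    disks in that order, wrapping around when the order is exhausted.
    """
    rng = random.Random(seed)
    placement = {}
    for bucket, n_blocks in enumerate(blocks_per_bucket):
        order = list(range(num_disks))
        rng.shuffle(order)  # independent random cycle order per bucket
        for i in range(n_blocks):
            placement[(bucket, i)] = order[i % num_disks]
    return placement
```

Under this assumption, any D consecutive blocks of one bucket land on D distinct disks, so a bucket can be read back with full parallelism, while the independent random orders keep different buckets from piling onto the same disk in lockstep.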
Massively parallel algorithms for private-cache chip multiprocessors. Under submission
, 2008
Cited by 12 (2 self)
In this paper, we study massively parallel algorithms for private-cache chip multiprocessors (CMPs), focusing on methods for foundational problems that can scale to hundreds or even thousands of cores. By focusing on private-cache CMPs, we show that we can design efficient algorithms that need no additional assumptions about the way that cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallel algorithms. Indeed, we present two sorting algorithms, a distribution sort and a mergesort. All algorithms in the paper are asymptotically optimal in terms of the parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks. In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache chip multiprocessors. [Regular paper submission to SPAA 2008, which may be considered for a normal track or the special track on multicore systems.] Center for Massive Data Algorithmics – a Center of the Danish National Research Foundation.
A Simple and Efficient Parallel Disk Mergesort
, 2002
Cited by 7 (0 self)
External sorting—the process of sorting a file that is too large to fit into the computer’s internal memory and must be stored externally on disks—is a fundamental subroutine in database systems [G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K, Section 5.4.9] recently identified SRM (which he calls “randomized striping”) as the method of choice for sorting with parallel disks. In this paper we present an efficient implementation of SRM, based upon novel and elegant data structures. We give a new implementation for SRM’s lookahead forecasting technique for parallel prefetching and its forecast and flush technique for buffer management. Our techniques amount to a significant improvement in the way SRM carries out the parallel, independent disk accesses necessary to read blocks of input runs efficiently during external merging. Our implementation is
Cleaning Uncertain Data with Quality Guarantees
, 2008
Cited by 6 (1 self)
Uncertain or imprecise data are pervasive in applications like location-based services, sensor monitoring, and data collection and integration. For these applications, probabilistic databases can be used to store uncertain data, and querying facilities are provided to yield answers with statistical confidence. Given that a limited amount of resources is available to “clean” the database (e.g., by probing some sensor data values to get their latest values), we address the problem of choosing the set of uncertain objects to be cleaned, in order to achieve the best improvement in the quality of query answers. For this purpose, we present the PWS-quality metric, which is a universal measure that quantifies the ambiguity of query answers under the possible world semantics. We study how PWS-quality can be efficiently evaluated for two major query classes: (1) queries that examine the satisfiability of tuples independent of other tuples (e.g., range queries); and (2) queries that require the knowledge of the relative ranking of the tuples (e.g., MAX queries). We then propose a polynomial-time solution to achieve an optimal improvement in PWS-quality. Other fast heuristics are presented as well. Experiments, performed on both real and synthetic datasets, show that the PWS-quality metric can be evaluated quickly, and that our cleaning algorithm provides an optimal solution with high efficiency. To our best knowledge, this is the first work that develops a quality metric for a probabilistic database, and investigates how such a metric can be used for data cleaning purposes.
Parallel External Memory Graph Algorithms
Cited by 4 (1 self)
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to solutions for many problems on trees, such as computing the Euler tour, preorder and postorder numbering of the vertices, the depth of each vertex and the sizes of subtrees rooted at each vertex of the tree. We also study the problems of computing the connected components of a graph and minimum spanning tree of a connected graph. All our solutions provide an optimal speedup of O(p) in parallel I/O complexity compared to the single-processor external memory versions of the algorithms.
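For readers unfamiliar with list ranking, the problem mentioned in this abstract, the sketch below computes each node's distance to the tail of a linked list using the textbook pointer-jumping technique, run sequentially here. It is not the PEM algorithm from the paper; the function name `list_rank` and the convention that the tail points to itself are assumptions made for illustration.

```python
def list_rank(succ):
    """Return the distance from each node to the tail of a linked list.

    succ[i] is the successor of node i; the tail points to itself
    (an assumed convention for this sketch).
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # Each round doubles the distance a pointer spans, so a logarithmic
    # number of rounds is enough for every pointer to reach the tail.
    for _ in range(max(1, n.bit_length())):
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank
```

For the list 3 -> 0 -> 2 -> 1, encoded as `succ = [2, 1, 1, 0]`, the call `list_rank(succ)` returns `[2, 0, 1, 3]`. In a parallel setting every index in a round can be updated independently, which is what makes the technique a building block for the tree computations listed above.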
On sorting strings in external memory (extended abstract)
- In STOC ’97: Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
, 1997
Cited by 3 (0 self)
Abstract. In this paper we address for the first time the I/O complexity of the problem of sorting strings in external memory, which is a fundamental component of many large-scale text applications. In the standard unit-cost RAM comparison model, the complexity of sorting K strings of total length N is $\Theta(K \log_2 K + N)$. By analogy, in the external memory (or I/O) model, where the internal memory has size M and the block transfer size is B, it would be natural to guess that the I/O complexity of sorting strings is $\Theta\left(\frac{K}{B} \log_{M/B} \frac{K}{B} + \frac{N}{B}\right)$
Sorting in parallel external-memory multicores
, 2007
Cited by 1 (0 self)
Abstract. In this paper, we introduce a model for multicore architectures, which takes into explicit consideration the cache-oriented nature of inputs and outputs in modern CPUs. In addition, we study the fundamental problem of sorting comparable items using this model. We provide algorithms that are efficient in terms of the number of parallel I/Os. We also provide lower bounds that show that our algorithms are within a constant factor of optimal, for reasonable values of parameters characterizing the number of processors, the size of each processor's memory, the size of cache blocks, and the number of items to be sorted.
Machine Models for Query Processing
The massive data sets that have to be processed in many application areas are often far too large to fit completely into a computer’s internal memory. When evaluating queries on such large data sets, the resulting communication

