Results 1  10
of
128
Memory Coherence in Shared Virtual Memory Systems
, 1989
"... This paper studies the memory coherence problem in designing said inaplementing a shared virtual memory on looselycoupled multiprocessors. Two classes of aIgoritb. ms for solving the problem are presented. A prototype shared virtual memory on an Apollo ring has been implemented based on these a ..."
Abstract

Cited by 957 (17 self)
 Add to MetaCart
This paper studies the memory coherence problem in designing said inaplementing a shared virtual memory on looselycoupled multiprocessors. Two classes of aIgoritb. ms for solving the problem are presented. A prototype shared virtual memory on an Apollo ring has been implemented based on these algorithms. Both theoretical and practical results show tkat the mentory coherence problem cast indeed be solved efficiently on a looselycoupled multiprocessor.
Fibonacci Heaps and Their Uses in Improved Network optimization algorithms
, 1987
"... In this paper we develop a new data structure for implementing heaps (priority queues). Our structure, Fibonacci heaps (abbreviated Fheaps), extends the binomial queues proposed by Vuillemin and studied further by Brown. Fheaps support arbitrary deletion from an nitem heap in qlogn) amortized tim ..."
Abstract

Cited by 739 (18 self)
 Add to MetaCart
In this paper we develop a new data structure for implementing heaps (priority queues). Our structure, Fibonacci heaps (abbreviated Fheaps), extends the binomial queues proposed by Vuillemin and studied further by Brown. Fheaps support arbitrary deletion from an nitem heap in qlogn) amortized time and all other standard heap operations in o ( 1) amortized time. Using Fheaps we are able to obtain improved running times for several network optimization algorithms. In particular, we obtain the following worstcase bounds, where n is the number of vertices and m the number of edges in the problem graph: ( 1) O(n log n + m) for the singlesource shortest path problem with nonnegative edge lengths, improved from O(m logfmh+2)n); (2) O(n*log n + nm) for the allpairs shortest path problem, improved from O(nm lo&,,,+2,n); (3) O(n*logn + nm) for the assignment problem (weighted bipartite matching), improved from O(nm log0dn+2)n); (4) O(mj3(m, n)) for the minimum spanning tree problem, improved from O(mloglo&,,.+2,n), where j3(m, n) = min {i 1 log % 5 m/n). Note that B(m, n) 5 log*n if m 2 n. Of these results, the improved bound for minimum spanning trees is the most striking, although all the
Selfadjusting binary search trees
, 1985
"... The splay tree, a selfadjusting form of binary search tree, is developed and analyzed. The binary search tree is a data structure for representing tables and lists so that accessing, inserting, and deleting items is easy. On an nnode splay tree, all the standard search tree operations have an am ..."
Abstract

Cited by 432 (18 self)
 Add to MetaCart
The splay tree, a selfadjusting form of binary search tree, is developed and analyzed. The binary search tree is a data structure for representing tables and lists so that accessing, inserting, and deleting items is easy. On an nnode splay tree, all the standard search tree operations have an amortized time bound of O(log n) per operation, where by “amortized time ” is meant the time per operation averaged over a worstcase sequence of operations. Thus splay trees are as efficient as balanced trees when total running time is the measure of interest. In addition, for sufficiently long access sequences, splay trees are as efficient, to within a constant factor, as static optimum search trees. The efftciency of splay trees comes not from an explicit structural constraint, as with balanced trees, but from applying a simple restructuring heuristic, called splaying, whenever the tree is accessed. Extensions of splaying give simplified forms of two other data structures: lexicographic or multidimensional search trees and link/ cut trees.
A Data Structure for Dynamic Trees
, 1983
"... A data structure is proposed to maintain a collection of vertexdisjoint trees under a sequence of two kinds of operations: a link operation that combines two trees into one by adding an edge, and a cut operation that divides one tree into two by deleting an edge. Each operation requires O(log n) ti ..."
Abstract

Cited by 347 (21 self)
 Add to MetaCart
A data structure is proposed to maintain a collection of vertexdisjoint trees under a sequence of two kinds of operations: a link operation that combines two trees into one by adding an edge, and a cut operation that divides one tree into two by deleting an edge. Each operation requires O(log n) time. Using this data structure, new fast algorithms are obtained for the following problems: (1) Computing nearest common ancestors. (2) Solving various network flow problems including finding maximum flows, blocking flows, and acyclic flows. (3) Computing certain kinds of constrained minimum spanning trees. (4) Implementing the network simplex algorithm for minimumcost flows. The most significant application is (2); an O(mn log n)time algorithm is obtained to find a maximum flow in a network of n vertices and m edges, beating by a factor of log n the fastest algorithm previously known for sparse graphs.
The Watershed Transform: Definitions, Algorithms and Parallelization Strategies
, 2001
"... The watershed transform is the method of choice for image segmentation in the field of mathematical morphology. We present a critical review of several definitions of the watershed transform and the associated sequential algorithms, and discuss various issues which often cause confusion in the li ..."
Abstract

Cited by 226 (3 self)
 Add to MetaCart
The watershed transform is the method of choice for image segmentation in the field of mathematical morphology. We present a critical review of several definitions of the watershed transform and the associated sequential algorithms, and discuss various issues which often cause confusion in the literature. The need to distinguish between definition, algorithm specification and algorithm implementation is pointed out. Various examples are given which illustrate differences between watershed transforms based on different definitions and/or implementations. The second part of the paper surveys approaches for parallel implementation of sequential watershed algorithms.
GraphChi: Largescale Graph Computation On just a PC
 In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation, OSDI’12
, 2012
"... Current systems for graph computation require a distributed computing cluster to handle very large realworld problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains c ..."
Abstract

Cited by 115 (6 self)
 Add to MetaCart
(Show Context)
Current systems for graph computation require a distributed computing cluster to handle very large realworld problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to nonexperts. In this work, we present GraphChi, a diskbased system for computing efficiently on graphs with billions of edges. By using a wellknown method to break large graphs into small parts, and a novel parallel sliding windows method, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumerlevel computer. We further extend GraphChi to support graphs that evolve over time, and demonstrate that, on a single computer, GraphChi can process over one hundred thousand graph updates per second, while simultaneously performing computation. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives. By repeating experiments reported for existing distributed systems, we show that, with only fraction of the resources, GraphChi can solve the same problems in very reasonable time. Our work makes largescale graph computation available to anyone with a modern PC. 1
A Functional Approach to External Graph Algorithms
 Algorithmica
, 1998
"... . We present a new approach for designing external graph algorithms and use it to design simple external algorithms for computing connected components, minimum spanning trees, bottleneck minimum spanning trees, and maximal matchings in undirected graphs and multigraphs. Our I/O bounds compete w ..."
Abstract

Cited by 104 (2 self)
 Add to MetaCart
(Show Context)
. We present a new approach for designing external graph algorithms and use it to design simple external algorithms for computing connected components, minimum spanning trees, bottleneck minimum spanning trees, and maximal matchings in undirected graphs and multigraphs. Our I/O bounds compete with those of previous approaches. Unlike previous approaches, ours is purely functionalwithout side effectsand is thus amenable to standard checkpointing and programming language optimization techniques. This is an important practical consideration for applications that may take hours to run. 1 Introduction We present a divideandconquer approach for designing external graph algorithms, i.e., algorithms on graphs that are too large to fit in main memory. Our approach is simple to describe and implement: it builds a succession of graph transformations that reduce to sorting, selection, and a recursive bucketing technique. No sophisticated data structures are needed. We apply our t...
The Computational Power and Complexity of Constraint Handling Rules
 In Second Workshop on Constraint Handling Rules, at ICLP05
, 2005
"... Constraint Handling Rules (CHR) is a highlevel rulebased programming language which is increasingly used for general purposes. We introduce the CHR machine, a model of computation based on the operational semantics of CHR. Its computational power and time complexity properties are compared to thos ..."
Abstract

Cited by 52 (21 self)
 Add to MetaCart
Constraint Handling Rules (CHR) is a highlevel rulebased programming language which is increasingly used for general purposes. We introduce the CHR machine, a model of computation based on the operational semantics of CHR. Its computational power and time complexity properties are compared to those of the wellunderstood Turing machine and Random Access Memory machine. This allows us to prove the interesting result that every algorithm can be implemented in CHR with the best known time and space complexity. We also investigate the practical relevance of this result and the constant factors involved. Finally we expand the scope of the discussion to other (declarative) programming languages.
Waitfree Parallel Algorithms for the UnionFind Problem
 In Proc. 23rd ACM Symposium on Theory of Computing
, 1994
"... We are interested in designing efficient data structures for a shared memory multiprocessor. In this paper we focus on the UnionFind data structure. We consider a fully asynchronous model of computation where arbitrary delays are possible. Thus we require our solutions to the data structure problem ..."
Abstract

Cited by 51 (0 self)
 Add to MetaCart
(Show Context)
We are interested in designing efficient data structures for a shared memory multiprocessor. In this paper we focus on the UnionFind data structure. We consider a fully asynchronous model of computation where arbitrary delays are possible. Thus we require our solutions to the data structure problem have the waitfree property, meaning that each thread continues to make progress on its operations, independent of the speeds of the other threads. In this model efficiency is best measured in terms of the total number of instructions used to perform a sequence of data structure operations, the work performed by the processors. We give a waitfree implementation of an efficient algorithm for UnionFind. In addition we show that the worst case performance of the algorithm can be improved by simulating a synchronized algorithm, or by simulating a larger machine if the data structure requests support sufficient parallelism. Our solutions apply to a much more general adversary model than has be...