Results 1-10 of 10
Best-Effort Cache Synchronization with Source Cooperation
In SIGMOD, 2002
Cited by 65 (3 self)
In environments where exact synchronization between source data objects and cached copies is not achievable due to bandwidth or other resource constraints, stale (out-of-date) copies are permitted. It is desirable to minimize the overall divergence between source objects and cached copies by selectively refreshing modified objects. We call the online process of selecting which objects to refresh in order to minimize divergence best-effort synchronization. In most approaches to best-effort synchronization, the cache coordinates the process and selects objects to refresh. In this paper, we propose a best-effort synchronization scheduling policy that exploits cooperation between data sources and the cache. We also propose an implementation of our policy that incurs low communication overhead even in environments with very large numbers of sources. Our algorithm is adaptive to wide fluctuations in available resources and data update rates. Through experimental simulation over synthetic and real-world data, we demonstrate the effectiveness of our algorithm, and we quantify the significant decrease in divergence achievable with source cooperation.
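As a rough illustration of the selection step this abstract describes, the sketch below greedily refreshes the objects whose cached copies have diverged the most, up to a fixed refresh budget. It is not the paper's scheduling policy, which the abstract does not spell out. The function name `pick_refreshes`, the `divergence` map, and the `budget` parameter are assumptions made for this example.

```python
import heapq


def pick_refreshes(divergence, budget):
    """Return up to `budget` object ids, largest divergence first.

    `divergence` maps an object id to a non-negative number describing
    how far the cached copy has drifted from the source (hypothetical
    metric). Objects with zero divergence are never selected.
    """
    stale = [(obj, d) for obj, d in divergence.items() if d > 0]
    top = heapq.nlargest(budget, stale, key=lambda item: item[1])
    return [obj for obj, _ in top]


if __name__ == "__main__":
    drift = {"a": 5, "b": 0, "c": 9, "d": 2}
    print(pick_refreshes(drift, budget=2))
```

With source cooperation, the divergence values would be reported by the sources rather than estimated by the cache; the greedy choice itself stays the same in this simplified picture.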
A Parallelization of Dijkstra's Shortest Path Algorithm
In Proc. 23rd MFCS'98, Lecture Notes in Computer Science, 1998
Cited by 26 (6 self)
The single source shortest path (SSSP) problem lacks parallel solutions which are fast and simultaneously work-efficient. We propose simple criteria which divide Dijkstra's sequential SSSP algorithm into a number of phases, such that the operations within a phase can be done in parallel. We give a PRAM algorithm based on these criteria and analyze its performance on random digraphs with random edge weights uniformly distributed in [0, 1]. We use
A shortest path algorithm for real-weighted undirected graphs
In 13th ACM-SIAM Symp. on Discrete Algs, 1985
Cited by 12 (3 self)
Abstract. We present a new scheme for computing shortest paths on real-weighted undirected graphs in the fundamental comparison-addition model. In an efficient preprocessing phase our algorithm creates a linear-size structure that facilitates single-source shortest path computations in O(m log α) time, where α = α(m, n) is the very slowly growing inverse-Ackermann function, m the number of edges, and n the number of vertices. As special cases our algorithm implies new bounds on both the all-pairs and single-source shortest paths problems. We solve the all-pairs problem in O(mn log α(m, n)) time and, if the ratio between the maximum and minimum edge lengths is bounded by n^((log n)^O(1)), we can solve the single-source problem in O(m + n log log n) time. Both these results are theoretical improvements over Dijkstra’s algorithm, which was the previous best for real-weighted undirected graphs. Our algorithm takes the hierarchy-based approach invented by Thorup. Key words. single-source shortest paths, all-pairs shortest paths, undirected graphs, Dijkstra’s
An experimental study of a parallel shortest path algorithm for solving large-scale graph instances
Ninth Workshop on Algorithm Engineering and Experiments (ALENEX 2007), 2007
Cited by 11 (3 self)
We present an experimental study of the single source shortest path problem with non-negative edge weights (NSSP) on large-scale graphs using the $\Delta$-stepping parallel algorithm. We report performance results on the Cray MTA-2, a multithreaded parallel computer. The MTA-2 is a high-end shared memory system offering two unique features that aid the efficient parallel implementation of irregular algorithms: the ability to exploit fine-grained parallelism, and low-overhead synchronization primitives. Our implementation exhibits remarkable parallel speedup when compared with competitive sequential algorithms, for low-diameter sparse graphs. For instance, $\Delta$-stepping on a directed scale-free graph of 100 million vertices and 1 billion edges takes less than ten seconds on 40 processors of the MTA-2, with a relative speedup of close to 30. To our knowledge, these are the first performance results of a shortest path problem on realistic graph instances in the order of billions of vertices and edges.
Parallel Shortest Path Algorithms for Solving . . .
2006
Cited by 9 (3 self)
We present an experimental study of the single source shortest path problem with non-negative edge weights (NSSP) on large-scale graphs using the ∆-stepping parallel algorithm. We report performance results on the Cray MTA-2, a multithreaded parallel computer. The MTA-2 is a high-end shared memory system offering two unique features that aid the efficient parallel implementation of irregular algorithms: the ability to exploit fine-grained parallelism, and low-overhead synchronization primitives. Our implementation exhibits remarkable parallel speedup when compared with competitive sequential algorithms, for low-diameter sparse graphs. For instance, ∆-stepping on a directed scale-free graph of 100 million vertices and 1 billion edges takes less than ten seconds on 40 processors of the MTA-2, with a relative speedup of close to 30. To our knowledge, these are the first performance results of a shortest path problem on realistic graph instances in the order of billions of vertices and edges.
Buckets strike back: Improved Parallel Shortest-Paths
Proc. 16th Intl. Par. Distr. Process. Symp. (IPDPS), 2002
Cited by 6 (2 self)
We study the average-case complexity of the parallel single-source shortest-path (SSSP) problem, assuming arbitrary directed graphs with n nodes, m edges, and independent random edge weights uniformly distributed in [0, 1]. We provide a new bucket-based parallel SSSP algorithm that runs in T = O(log^2 n · min_i {2^i · L + |V_i|}) average-case time using O(n + m + T) work on a PRAM, where L denotes the maximum shortest-path weight and |V_i| is the number of graph vertices with in-degree at least 2^i. All previous algorithms either required more time or more work. The minimum performance gain is a logarithmic factor improvement; on certain graph classes, accelerations by factors of more than n^0.4 can be achieved. The algorithm allows adaptation to distributed memory machines, too.
Directed Single-Source Shortest-Paths in Linear Average-Case Time
2001
The quest for a linear-time single-source shortest-path (SSSP) algorithm on directed graphs with positive edge weights is an ongoing hot research topic. While Thorup recently found an O(n + m) time RAM algorithm for undirected graphs with n nodes, m edges and integer edge weights in {0, ..., 2^w − 1}, where w denotes the word length, the currently best time bound for directed sparse graphs on a RAM is O(n + m log log n).
Δ-Stepping: A Parallel Single Source Shortest . . .
In ESA ’98: Proceedings of the 6th Annual European Symposium on Algorithms, 1998
In spite of intensive research, little progress has been made towards fast and work-efficient parallel algorithms for the single source shortest path problem. Our Δ-stepping algorithm, a generalization of Dial's algorithm and the Bellman-Ford algorithm, improves this situation at least in the following "average-case" sense: For random directed graphs with edge probability d/n and uniformly distributed edge weights, a PRAM version works in expected time O(log^3 n / log log n) using linear work. The algorithm also allows for efficient adaptation to distributed memory machines. Implementations show that our approach works on real machines. As a side effect, we get a simple linear time sequential algorithm for a large class of not necessarily random directed graphs with random edge weights.
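To make the bucket idea behind this abstract concrete, here is a minimal sequential sketch of Δ-stepping in Python. It is a simplified reading of the technique, not the authors' PRAM implementation: nodes are kept in buckets of width `delta`, light edges (weight at most `delta`) are relaxed repeatedly while the current bucket refills, and heavy edges are relaxed once per settled node. The adjacency-list layout and the names `delta_stepping` and `relax` are assumptions for this example. In a parallel version, the relaxations inside each loop over a bucket would run concurrently.

```python
import math


def delta_stepping(graph, source, delta):
    """Shortest-path distances from `source` with non-negative weights.

    `graph` maps each node to a list of (neighbor, weight) pairs
    (assumed layout). `delta` is the bucket width and must be > 0.
    """
    dist = {source: 0.0}
    buckets = {0: {source}}  # bucket index -> nodes with tentative distance

    def relax(v, d):
        old = dist.get(v, math.inf)
        if d < old:
            if old != math.inf:
                # Move v out of the bucket of its old tentative distance.
                buckets.get(int(old // delta), set()).discard(v)
            dist[v] = d
            buckets.setdefault(int(d // delta), set()).add(v)

    while buckets:
        i = min(buckets)  # smallest remaining bucket index
        settled = set()
        while buckets.get(i):
            frontier = buckets.pop(i)
            settled |= frontier
            for u in frontier:
                for v, w in graph.get(u, ()):
                    if w <= delta:  # light edge: may refill bucket i
                        relax(v, dist[u] + w)
        buckets.pop(i, None)  # drop the bucket if it was left empty
        for u in settled:
            for v, w in graph.get(u, ()):
                if w > delta:  # heavy edge: always lands in a later bucket
                    relax(v, dist[u] + w)
    return dist


if __name__ == "__main__":
    g = {
        "a": [("b", 2), ("c", 9)],
        "b": [("c", 3), ("d", 8)],
        "c": [("d", 1)],
        "d": [],
    }
    print(delta_stepping(g, "a", delta=4))
```

The two limiting cases match the generalization the abstract mentions: a very small `delta` gives one bucket per distinct distance and behaves like Dial's or Dijkstra's algorithm, while a very large `delta` puts every node in one bucket and behaves like Bellman-Ford.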
A Study of Different Parallel Implementations of Single Source Shortest Path Algorithms
We present a study of parallel implementations of single source shortest path (SSSP) algorithms. In the last three decades, a number of parallel SSSP algorithms have been developed and implemented on different types of machines. We have divided some of these implementations into two groups: the first are those where parallelization is achieved in the internal operations of a sequential SSSP algorithm, and the second are those where the actual graph is divided into subgraphs and a serial SSSP algorithm executes in parallel on separate processing units for each subgraph. These parallel implementations have used PRAM, Cray supercomputers, dynamically reconfigurable processors, and graphics processing units as platforms.
Recursive Design of Hardware Priority Queues
A recursive and fast construction of an n-element priority queue from exponentially smaller hardware priority queues and a size-n RAM is presented. All priority queue implementations to date require either O(log n) instructions per operation, or exponential (with key size) space, or expensive special hardware whose cost and latency dramatically increase with the priority queue size. Hence constructing a priority queue (PQ) from considerably smaller hardware priority queues (which are also much faster) while maintaining the O(1) steps per PQ operation is critical. Here we present such an acceleration technique called the Power Priority Queue (PPQ) technique. Specifically, an n-element PPQ is constructed from 2k − 1 primitive priority queues of size n^(1/k) (k = 2, 3, ...) and a RAM of size n, where the throughput of the construct beats that of a single, size-n primitive hardware priority queue. For example, an n-element PQ can be constructed from either three primitive H/W priority queues of size n^(1/2) or five of size n^(1/3). Applying our technique to a TCAM-based priority queue results in TCAM-PPQ, a scalable perfect line rate fair queuing of millions of concurrent connections at speeds of 100 Gbps. This demonstrates the benefits of our scheme when used with hardware TCAM; we expect similar results with systolic arrays, shift-registers and similar technologies. As a by-product of our technique we present an O(n) time sorting algorithm in a system equipped with an O(w√n)-entry TCAM, where n is the number of items and w is the maximum number of bits required to represent an item, improving on a previous result that used an Ω(n)-entry TCAM. Finally, we provide a lower bound on the time complexity of sorting n elements with a TCAM of size O(n) that matches our TCAM-based sorting algorithm.