Results 1  10
of
39
On the limits of cacheobliviousness
 IN PROC. 35TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
, 2003
"... In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, ..."
Abstract

Cited by 41 (8 self)
 Add to MetaCart
(Show Context)
In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, not even in the presence of a tall cache assumption.Our results for sorting show the existence of an inherent tradeoff in the cacheoblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this tradeoff.
A skip list cookbook
, 1990
"... Skip lists are a probabilistic data structure that seem likely to supplant balanced trees as the implementation method of choice for many applications. Skip list algorithms have the same asymptotic expected time bounds as balanced trees and are simpler, faster and use less space. The original paper ..."
Abstract

Cited by 29 (1 self)
 Add to MetaCart
Skip lists are a probabilistic data structure that seem likely to supplant balanced trees as the implementation method of choice for many applications. Skip list algorithms have the same asymptotic expected time bounds as balanced trees and are simpler, faster and use less space. The original paper on skip lists only presented algorithms for search, insertion and deletion. In this paper, we show that skip lists are as versatile as balanced trees. We describe and analyze algorithms to use search fingers, merge, split and concatenate skip lists, and implement linear list operations using skip lists. The skip list algorithms for these actions are faster and simpler than their balanced tree cousins. The merge algorithm for skip lists we describe has better asymptotic time complexity than any previously described merge algorithm for balanced trees.
Finding maximal pairs with bounded gap
 Proceedings of the 10th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 1645 of Lecture Notes in Computer Science
, 1999
"... A pair in a string is the occurrence of the same substring twice. A pair is maximal if the two occurrences of the substring cannot be extended to the left and right without making them different. The gap of a pair is the number of characters between the two occurrences of the substring. In this pape ..."
Abstract

Cited by 26 (5 self)
 Add to MetaCart
A pair in a string is the occurrence of the same substring twice. A pair is maximal if the two occurrences of the substring cannot be extended to the left and right without making them different. The gap of a pair is the number of characters between the two occurrences of the substring. In this paper we present methods for finding all maximal pairs under various constraints on the gap. In a string of length n we can find all maximal pairs with gap in an upper and lower bounded interval in time O(n log n + z) where z is the number of reported pairs. If the upper bound is removed the time reduces to O(n+z). Since a tandem repeat is a pair where the gap is zero, our methods can be seen as a generalization of finding tandem repeats. The running time of our methods equals the running time of well known methods for finding tandem repeats.
SpaceEfficient Framework for Topk String Retrieval Problems
"... Given a set D = {d1, d2,..., dD} of D strings of total length n, our task is to report the “most relevant” strings for a given query pattern P. This involves somewhat more advanced query functionality than the usual pattern matching, as some notion of “most relevant” is involved. In information retr ..."
Abstract

Cited by 26 (4 self)
 Add to MetaCart
(Show Context)
Given a set D = {d1, d2,..., dD} of D strings of total length n, our task is to report the “most relevant” strings for a given query pattern P. This involves somewhat more advanced query functionality than the usual pattern matching, as some notion of “most relevant” is involved. In information retrieval literature, this task is best achieved by using inverted indexes. However, inverted indexes work only for some predefined set of patterns. In the pattern matching community, the most popular patternmatching data structures are suffix trees and suffix arrays. However, a typical suffix tree search involves going through all the occurrences of the pattern over the entire string collection, which might be a lot more than the required relevant documents. The first formal framework to study such kind of retrieval problems was given by Muthukrishnan [25]. He considered two metrics for relevance: frequency and proximity. He took a thresholdbased approach on these metrics and gave data structures taking O(n log n) words of space. We study this problem in a slightly different framework of reporting the top k most relevant documents (in sorted order) under similar and more general relevance metrics. Our framework gives linear space data structure with optimal query times for arbitrary score functions. As a corollary, it improves the space utilization for the problems in [25] while maintaining optimal query performance. We also develop compressed variants of these data structures for several specific relevance metrics.
A fast algorithm for optimal buffer insertion
 IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems
, 2005
"... Abstract—The classic buffer insertion algorithm of van Ginneken has time and space complexity ( 2), where is the number of possible buffer positions. For more than a decade, van Ginneken’s algorithm has been the foundation of buffer insertion. In this paper, we present a new algorithm that computes ..."
Abstract

Cited by 13 (4 self)
 Add to MetaCart
(Show Context)
Abstract—The classic buffer insertion algorithm of van Ginneken has time and space complexity ( 2), where is the number of possible buffer positions. For more than a decade, van Ginneken’s algorithm has been the foundation of buffer insertion. In this paper, we present a new algorithm that computes the same optimal buffer insertion, but runs much faster. For 2pin nets, our time complexity is ( log) and space complexity is (). For multipin nets, our time complexity is ( log2) and space complexity is ( log). The speedup is achieved by four novel techniques: predictive pruning, candidate tree, fast redundancy check, and fast merging. On industrial test cases, the new algorithms is 2–80 times faster than van Ginneken’s algorithm and uses 1 4–1 500 of the memory. Since van Ginneken’s algorithm and its variations are used by most existing algorithms on buffer insertion and buffer sizing, our new algorithm significantly improves the performance of all these algorithms. The predictive pruning technique has been applied to buffer cost minimization (Shi et al., 2004), and significantly improved the running time. Index Terms—Buffer insertion, data structure, Elmore delay, interconnect, routing. I.
Unimodal Regression via Prefix Isotonic Regression
"... This paper gives optimal algorithms for determining realvalued univariate unimodal regressions, that is, for determining the optimal regression which is increasing and then decreasing. Such regressions arise in a wide variety of applications. They are shapeconstrained nonparametric regressions, cl ..."
Abstract

Cited by 10 (7 self)
 Add to MetaCart
(Show Context)
This paper gives optimal algorithms for determining realvalued univariate unimodal regressions, that is, for determining the optimal regression which is increasing and then decreasing. Such regressions arise in a wide variety of applications. They are shapeconstrained nonparametric regressions, closely related to isotonic regression. For unimodal regression on n weighted points our algorithm for the L_2 metric requires only &Theta;(n) time, while for the L_1 metric it requires &Theta;(n log n) time. For unweighted points our algorithm for the L_1 metric also requires only &Theta;(n) time. Previous algorithms were for the L_2 metric and required &Omega;(n&sup2;) time. All previous algorithms used multiple calls to isotonic regression, and our major contribution is to organize these into a prefix isotonic regression, determining the regression on all initial segments. The prefix approach reduces the total time required by utilizing the solution for one initial segment to solve the next.
Spaceefficient finger search on degreebalanced search trees
 In SODA
, 2003
"... We show how to support the finger search operation on degreebalanced search trees in a spaceefficient manner that retains a worstcase time bound of O(log d), where d is the difference in rank between successive search targets. While most existing treebased designs allocate linear extra storage i ..."
Abstract

Cited by 10 (1 self)
 Add to MetaCart
We show how to support the finger search operation on degreebalanced search trees in a spaceefficient manner that retains a worstcase time bound of O(log d), where d is the difference in rank between successive search targets. While most existing treebased designs allocate linear extra storage in the nodes (e.g., for side links and parent pointers), our design maintains a compact auxiliary data structure called the “hand ” during the lifetime of the tree and imposes no other storage requirement within the tree. The hand requires O(log n) space for an nnode tree and has a relatively simple structure. It can be updated synchronously during insertions and deletions with time proportional to the number of structural changes in the tree. The auxiliary nature of the hand also makes it possible to introduce finger searches into any existing implementation without modifying the underlying data representation (e.g., any implementation of RedBlack trees can be used). Together these factors make finger searches more appealing in practice. Our design also yields a simple yet optimal inorder walk algorithm with worstcase O(1) work per increment (again without any extra storage requirement in the nodes), and we believe our algorithm can be used in database applications when the overall performance is very sensitive to retrieval latency. 1
Solving the string statistics problem in time O(n log n)
 Proc. 29th International Colloquium on Automata, Languages, and Programming
, 2002
"... The string statistics problem consists of preprocessing a string of length n such that given a query pattern of length m, the maximum number of nonoverlapping occurrences of the query pattern in the string can be reported efficiently... ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
The string statistics problem consists of preprocessing a string of length n such that given a query pattern of length m, the maximum number of nonoverlapping occurrences of the query pattern in the string can be reported efficiently...
Worst case optimal unionintersection expression evaluation
 In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP ’05), volume 3580 of Lecture Notes in Computer Science
, 2005
"... addresses: ..."
(Show Context)
Highly Parallel Sparse MatrixMatrix Multiplication
, 2010
"... Generalized sparse matrixmatrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an unbounded number of processors. Our algorithms are based on ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
(Show Context)
Generalized sparse matrixmatrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an unbounded number of processors. Our algorithms are based on twodimensional block distribution of sparse matrices where serial sections use a novel hypersparse kernel for scalability. We give a stateoftheart MPI implementation of one of our algorithms. Our experiments show scaling up to thousands of processors on a variety of test scenarios.