Results 1  10
of
37
Compressed fulltext indexes
 ACM COMPUTING SURVEYS
, 2007
"... Fulltext indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop indexes that exploit the compressibility of the text, so that their size is a function of the compressed text l ..."
Abstract

Cited by 173 (78 self)
 Add to MetaCart
Fulltext indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop indexes that exploit the compressibility of the text, so that their size is a function of the compressed text length. This concept has evolved into selfindexes, which in addition contain enough information to reproduce any text portion, so they replace the text. The exciting possibility of an index that takes space close to that of the compressed text, replaces it, and in addition provides fast search over it, has triggered a wealth of activity and produced surprising results in a very short time, and radically changed the status of this area in less than five years. The most successful indexes nowadays are able to obtain almost optimal space and search time simultaneously. In this paper we present the main concepts underlying selfindexes. We explain the relationship between text entropy and regularities that show up in index structures and permit compressing them. Then we cover the most relevant selfindexes up to date, focusing on the essential aspects on how they exploit the text compressibility and how they solve efficiently various search problems. We aim at giving the theoretical background to understand and follow the developments in this area.
Succinct Representation of Balanced Parentheses, Static Trees and Planar Graphs
, 1999
"... We consider the implementation of abstract data types for the static objects: binary tree, rooted ordered tree and balanced parenthesis expression. Our representations use an amount of space within a lower order term of the information theoretic minimum and support, in constant time, a richer set ..."
Abstract

Cited by 142 (9 self)
 Add to MetaCart
We consider the implementation of abstract data types for the static objects: binary tree, rooted ordered tree and balanced parenthesis expression. Our representations use an amount of space within a lower order term of the information theoretic minimum and support, in constant time, a richer set of navigational operations than has previously been considered in similar work. In the case of binary trees, for instance, we can move from a node to its left or right child or to the parent in constant time while retaining knowledge of the size of the subtree at which we are positioned. The approach is applied to produce succinct representation of planar graphs in which one can test adjacency in constant time. Keywords: abstract data type, succinct representation, binary trees, balanced parenthesis, rooted ordered trees, planar graphs. AMS subject classifications: 68P05, 68Q65 1 Introduction The binary tree is among the most fundamental of data structures. While it is often the c...
Optimal Bounds for the Predecessor Problem
 In Proceedings of the ThirtyFirst Annual ACM Symposium on Theory of Computing
"... We obtain matching upper and lower bounds for the amount of time to find the predecessor of a given element among the elements of a fixed efficiently stored set. Our algorithms are for the unitcost wordlevel RAM with multiplication and extend to give optimal dynamic algorithms. The lower bounds ar ..."
Abstract

Cited by 63 (0 self)
 Add to MetaCart
We obtain matching upper and lower bounds for the amount of time to find the predecessor of a given element among the elements of a fixed efficiently stored set. Our algorithms are for the unitcost wordlevel RAM with multiplication and extend to give optimal dynamic algorithms. The lower bounds are proved in a much stronger communication game model, but they apply to the cell probe and RAM models and to both static and dynamic predecessor problems.
Exact and Approximate Distances in Graphs  a survey
 In ESA
, 2001
"... We survey recent and not so recent results related to the computation of exact and approximate distances, and corresponding shortest, or almost shortest, paths in graphs. We consider many different settings and models and try to identify some remaining open problems. ..."
Abstract

Cited by 57 (0 self)
 Add to MetaCart
We survey recent and not so recent results related to the computation of exact and approximate distances, and corresponding shortest, or almost shortest, paths in graphs. We consider many different settings and models and try to identify some remaining open problems.
Undirected Single Source Shortest Paths in Linear Time
 J. Assoc. Comput. Mach
, 1997
"... The single source shortest paths problem (SSSP) is one of the classic problems in algorithmic graph theory: given a weighted graph G with a source vertex s, find the shortest path from s to all other vertices in the graph. Since 1959 all theoretical developments in SSSP have been based on Dijkstra' ..."
Abstract

Cited by 49 (3 self)
 Add to MetaCart
The single source shortest paths problem (SSSP) is one of the classic problems in algorithmic graph theory: given a weighted graph G with a source vertex s, find the shortest path from s to all other vertices in the graph. Since 1959 all theoretical developments in SSSP have been based on Dijkstra's algorithm, visiting the vertices in order of increasing distance from s. Thus, any implementation of Dijkstra 's algorithm sorts the vertices according to their distances from s. However, we do not know how to sort in linear time. Here, a deterministic linear time and linear space algorithm is presented for the undirected single source shortest paths problem with integer weights. The algorithm avoids the sorting bottleneck by building a hierechical bucketing structure, identifying vertex pairs that may be visited in any order. 1 Introduction Let G = (V; E), jV j = n, jEj = m, be an undirected connected graph with an integer edge weight function ` : E ! N and a distinguished source vertex...
Faster Deterministic Sorting and Searching in Linear Space
, 1995
"... We present a significant improvement on linear space deterministic sorting and searching. On a unitcost RAM with word size w, an ordered set of n wbit keys (viewed as binary strings or integers) can be maintained in O ` min ` p log n; log n log w + log log n; log w log log n " time per op ..."
Abstract

Cited by 39 (7 self)
 Add to MetaCart
We present a significant improvement on linear space deterministic sorting and searching. On a unitcost RAM with word size w, an ordered set of n wbit keys (viewed as binary strings or integers) can be maintained in O ` min ` p log n; log n log w + log log n; log w log log n " time per operation, including insert, delete, member search, and neighbour search. The cost for searching is worstcase while the cost for updates is amortized. For range queries, there is an additional cost of reporting the found keys. As an application, n keys can be sorted in linear space at a worstcase cost of O \Gamma n p log n \Delta . The best previous method for deterministic sorting and searching in linear space has been the fusion trees which supports queries in O(logn= log log n) amortized time and sorting in O(n log n= log log n) worstcase time. We also make two minor observations on adapting our data structure to the input distribution and on the complexity of perfect hashing. 1 I...
A Simple Shortest Path Algorithm with Linear Average Time
"... We present a simple shortest path algorithm. If the input lengths are positive and uniformly distributed, the algorithm runs in linear time. The worstcase running time of the algorithm is O(m + n log C), where n and m are the number of vertices and arcs of the input graph, respectively, and C i ..."
Abstract

Cited by 34 (6 self)
 Add to MetaCart
We present a simple shortest path algorithm. If the input lengths are positive and uniformly distributed, the algorithm runs in linear time. The worstcase running time of the algorithm is O(m + n log C), where n and m are the number of vertices and arcs of the input graph, respectively, and C is the ratio of the largest and the smallest nonzero arc length.
SingleSource ShortestPaths on Arbitrary Directed Graphs in Linear AverageCase Time
 In Proc. 12th ACMSIAM Symposium on Discrete Algorithms
, 2001
"... The quest for a lineartime singlesource shortestpath (SSSP) algorithm on directed graphs with positive edge weights is an ongoing hot research topic. While Thorup recently found an O(n + m) time RAM algorithm for undirected graphs with n nodes, m edges and integer edge weights in f0; : : : ; 2 w ..."
Abstract

Cited by 28 (5 self)
 Add to MetaCart
The quest for a lineartime singlesource shortestpath (SSSP) algorithm on directed graphs with positive edge weights is an ongoing hot research topic. While Thorup recently found an O(n + m) time RAM algorithm for undirected graphs with n nodes, m edges and integer edge weights in f0; : : : ; 2 w 1g where w denotes the word length, the currently best time bound for directed sparse graphs on a RAM is O(n + m log log n). In the present paper we study the averagecase complexity of SSSP. We give a simple algorithm for arbitrary directed graphs with random edge weights uniformly distributed in [0; 1] and show that it needs linear time O(n + m) with high probability. 1 Introduction The singlesource shortestpath problem (SSSP) is a fundamental and wellstudied combinatorial optimization problem with many practical and theoretical applications [1]. Let G = (V; E) be a directed graph, jV j = n, jEj = m, let s be a distinguished vertex of the graph, and c be a function assigning a n...
Dynamic Ordered Sets with Exponential Search Trees
 Combination of results presented in FOCS 1996, STOC 2000 and SODA
, 2001
"... We introduce exponential search trees as a novel technique for converting static polynomial space search structures for ordered sets into fullydynamic linear space data structures. This leads to an optimal bound of O ( √ log n/log log n) for searching and updating a dynamic set of n integer keys i ..."
Abstract

Cited by 26 (1 self)
 Add to MetaCart
We introduce exponential search trees as a novel technique for converting static polynomial space search structures for ordered sets into fullydynamic linear space data structures. This leads to an optimal bound of O ( √ log n/log log n) for searching and updating a dynamic set of n integer keys in linear space. Here searching an integer y means finding the maximum key in the set which is smaller than or equal to y. This problem is equivalent to the standard text book problem of maintaining an ordered set (see, e.g., Cormen, Leiserson, Rivest, and Stein: Introduction to Algorithms, 2nd ed., MIT Press, 2001). The best previous deterministic linear space bound was O(log n/log log n) due Fredman and Willard from STOC 1990. No better deterministic search bound was known using polynomial space.