## Rank-sensitive data structures (2005)

Venue: | In Proc. 12th International Symposium on String Processing and Information Retrieval (SPIRE), LNCS v. 3772 |

Citations: | 9 - 0 self |

### BibTeX

@INPROCEEDINGS{Bialynicka-birula05rank-sensitivedata,

author = {Iwona Bialynicka-Birula and Roberto Grossi},

title = {Rank-sensitive data structures},

booktitle = {In Proc. 12th International Symposium on String Processing and Information Retrieval (SPIRE), LNCS v. 3772},

year = {2005},

pages = {79--90}

}

### Abstract

Output-sensitive data structures result from preprocessing n items and are capable of reporting the items satisfying an on-line query in O(t(n) + ℓ) time, where t(n) is the cost of traversing the structure and ℓ ≤ n is the number of reported items satisfying the query. In this paper we focus on rank-sensitive data structures, which are additionally given a ranking of the n items, so that just the top k best-ranking items should be reported at query time, sorted in rank order, at a cost of O(t(n) + k) time. Note that k is part of the query as a parameter under the control of the user (as opposed to ℓ, which is query-dependent). We explore the problem of adding rank-sensitivity to data structures such as suffix trees or range trees, where the ℓ items satisfying the query form O(polylog(n)) intervals of consecutive entries from which we choose the top k best-ranking ones. Letting s(n) be the number of items (including their copies) stored in the original data structures, we increase the space by an additional term of O(s(n) lg^ε n) memory words, each of O(lg n) bits, for any positive constant ε < 1. We allow for changing the ranking on the fly during the lifetime of the data structures, with ranking values in 0 ... O(n). In this case, query time becomes O(t(n) + k) plus O(lg n / lg lg n) per interval; each change in the ranking and each insertion/deletion of an item takes O(lg n) time; the additional term in space occupancy increases to O(s(n) lg n / lg lg n).
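To make the query semantics concrete, here is a minimal Python sketch of a naive baseline, not the structure from the paper. It assumes that a smaller rank value means a better rank, that the intervals are disjoint and given as inclusive index pairs, and the function name `top_k_in_intervals` is hypothetical. It scans all ℓ candidates, so it costs roughly O(ℓ lg k) rather than the O(t(n) + k) bound the abstract states.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_in_intervals(
    ranks: Sequence[int],
    intervals: List[Tuple[int, int]],
    k: int,
) -> List[int]:
    """Return the indices of the k best-ranking entries inside the intervals.

    Assumptions (not from the paper): a smaller rank value is better, and
    each interval is an inclusive (lo, hi) pair of positions in the list.
    The result is sorted in rank order, best first.
    """
    # Enumerate every position covered by the query intervals (the l items).
    candidates = (i for lo, hi in intervals for i in range(lo, hi + 1))
    # nsmallest keeps a bounded heap of size k and returns a sorted list.
    return heapq.nsmallest(k, candidates, key=lambda i: ranks[i])


if __name__ == "__main__":
    ranks = [5, 1, 4, 2, 9, 0, 7, 3]
    # Two intervals of consecutive entries, as a query would produce.
    print(top_k_in_intervals(ranks, [(1, 3), (5, 6)], k=3))
```

The point of the comparison is that this baseline touches every item satisfying the query, whereas a rank-sensitive structure pays only for the k items actually reported, plus the traversal cost.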

### Citations

2704 | Authoritative sources in a hyperlinked environment
- Kleinberg
- 1999
Citation Context ...g the items according to rank, plus the preprocessing cost of D. Attacking the problem. While ranking itself has been the subject of intense theoretical investigation in the context of search engines [17, 18, 24], we could not find any explicit study pertaining to ranking in the context of data structures. The only published data structure of this kind is the inverted lists [31] in which the documents are sor... |

972 |
Computational Geometry: Algorithms and Applications
- Berg, Kreveld, et al.
- 1997
Citation Context ...es O(lg n) time; the additional term in space occupancy increases to O(s(n) lg n/lg lg n). 1 Introduction Output-sensitive data structures are at the heart of text searching [13], geometric searching [5], database searching [28], and information retrieval in general [3, 31]. They are the result of preprocessing n items (these can be textual data, geometric data, database records, multimedia, or any o... |

396 |
Algorithms on strings, trees, and sequences: Computer science and computational biology
- Gusfield
- 1997
Citation Context ...on/deletion of an item takes O(lg n) time; the additional term in space occupancy increases to O(s(n) lg n/lg lg n). 1 Introduction Output-sensitive data structures are at the heart of text searching [13], geometric searching [5], database searching [28], and information retrieval in general [3, 31]. They are the result of preprocessing n items (these can be textual data, geometric data, database reco... |

246 | Making data structures persistent
- Driscoll, Sarnak, et al.
- 1989
Citation Context ...n of [10], in O(k) time, we can retrieve the top k best-ranking items in O(k + lg n) time in unsorted order. Improvements to get O(k) time can be made using scaling [12] or persistent data structures [6, 8, 16]. Subsequent sorting reports the items in O(k lg k) time using O(n) words of memory. What if we adopt the above solution in a real-time setting? Think of a server that provides items in rank order on ... |

221 | SALSA: the stochastic approach for link-structure analysis
- Lempel, Moran
Citation Context ...g the items according to rank, plus the preprocessing cost of D. Attacking the problem. While ranking itself has been the subject of intense theoretical investigation in the context of search engines [17, 18, 24], we could not find any explicit study pertaining to ranking in the context of data structures. The only published data structure of this kind is the inverted lists [31] in which the documents are sor... |

176 |
Scaling and related techniques for geometry problems
- Gabow, Bentley, et al.
- 1984
Citation Context ...he aforementioned e′ by a variation of [10], in O(k) time, we can retrieve the top k best-ranking items in O(k + lg n) time in unsorted order. Improvements to get O(k) time can be made using scaling [12] or persistent data structures [6, 8, 16]. Subsequent sorting reports the items in O(k lg k) time using O(n) words of memory. What if we adopt the above solution in a real-time setting? Think of a ser... |

176 |
Priority Search Trees
- McCreight
- 1985
Citation Context ... data structures. The only published data structure of this kind is the inverted lists [31] in which the documents are sorted according to their rank order. McCreight's paper on priority search trees [19] refers to enumeration in increasing order along the y-axis but it does not indeed discuss how to report the items in sorted order along the y-axis. An indirect form of ranking can be found in the (dyn... |

156 |
Trans-dichotomous algorithms for minimum spanning trees and shortest paths
- Fredman, Willard
- 1994
Citation Context ...ersions in the worst case. (Also the previous solutions based on persistence, priority search trees and Cartesian trees suffer similar problems in the dynamic setting.) We extend the notion of Q-heap [11] to implement our solution, introducing multi-Q-heaps described in Section 3. 2 The Static Case and its Dynamization Our starting point is a well-known scheme adopted for two-dimensional range trees [... |

136 |
Modern Information Retrieval
- Baeza-Yates, Ribeiro-Neto
- 1999
Citation Context ...o O(s(n) lg n/lg lg n). 1 Introduction Output-sensitive data structures are at the heart of text searching [13], geometric searching [5], database searching [28], and information retrieval in general [3, 31]. They are the result of preprocessing n items (these can be textual data, geometric data, database records, multimedia, or any other kind of data) into O(n polylog(n)) space in such a way, as to allow... |

132 | A functional approach to data structures and its use in multidimensional searching
- Chazelle
- 1988
Citation Context ...originate, respectively, from R(u0) and R(u1). We obtain B(u), a bitstring of |R(u)| bits, totalizing O(n lg n) bits, hence O(n) words of memory, for the entire W (see [4]). Rank query works as expected [5]. Given entries ei and ej in L, we locate their leaves in W, say vi and vj. We find their least common ancestor w in W (the case vi = vj is trivial). On the path fro... |

120 | The string B-tree: a new data structure for string search in external memory and its applications
- Ferragina, Grossi
- 1999
(Show Context)
Citation Context ...w. However, instead of labeling the leaves of the compact trie with the strings (elements) they correspond to, we keep just the trie shape and the skip values contained in its internal nodes, like in =-=[1, 7]-=-. We store the d elements and their satellite data in a separate table. To provide a connection between the trie and the values, we store a permutation which describes the relation between the order o... |

74 | Efficient Algorithms for Document Retrieval Problems
- Muthukrishnan
- 2002
Citation Context ...s how to report the items in sorted order along the y-axis. An indirect form of ranking can be found in the (dynamic) rectangular intersection with priorities [15] and in the document listing problem [21]. For our class of output-sensitive data structures, we can formulate the ranking problem as a geometric problem. We are given a (dynamic) list L of n entries, where each entry e ∈ L has an associated... |

31 | Optimal external memory interval management
- Arge, Vitter
Citation Context ...t is larger than the other rank values; multiple copies of +∞ are each different from the other (and take O(lg N) bits each). 2.1 Static case on a single interval We employ a weight-balanced B-tree W [2] as the skeleton structure. At the moment, suppose that W has degree exactly two in the internal nodes and that the n items in L are stored in the leaves of W, assuming that each leaf stores a single ... |

27 |
The design of dynamic data structures, volume 156
- Overmars
Citation Context ...ibed in Section 3. 2 The Static Case and its Dynamization Our starting point is a well-known scheme adopted for two-dimensional range trees [5]. Following the global rebuilding technique described in [23], we can restrict our attention to values of n in the range 0 ... O(N) where n = Θ(N). Consequently, we use lookup tables tailored for N, so that when the value of N must double or halve, we also rebuil... |

26 | Ranking and unranking permutations in linear time
- Myrvold, Ruskey
Citation Context ...onding leaves in the trie). There are d! possible permutations, so we choose α so that lg d! < (1/4) lg N and the encoding of the permutation fits in one word of memory. We use the encoding described in [22], which takes linear time to rank and unrank a permutation, hence to encode and decode it. 3.3 Multi-Q-heap: Supported operations The Init operation sets up all the lookup tables required for implemen... |

19 |
An optimal algorithm for selection in a min-heap
- Frederickson
- 1993
Citation Context ...orting these queries, but do not provide items in sorted order (they can end up with half of the items unsorted during their traversal). Since we can identify the aforementioned e′ by a variation of [10], in O(k) time, we can retrieve the top k best-ranking items in O(k + lg n) time in unsorted order. Improvements to get O(k) time can be made using scaling [12] or persistent data structures [6, 8, 16... |

18 | Fully-dynamic two dimensional orthogonal range and line segment intersection reporting in logarithmic time - Mortensen - 2003 |

16 |
Hash functions for priority queues
- Ajtai, Komlos
- 1984
Citation Context ...w. However, instead of labeling the leaves of the compact trie with the strings (elements) they correspond to, we keep just the trie shape and the skip values contained in its internal nodes, like in [1, 7]. We store the d elements and their satellite data in a separate table. To provide a connection between the trie and the values, we store a permutation which describes the relation between the order o... |

14 |
Computer Graphics with OpenGL
- Hearn, Baker
- 2003
Citation Context ...eart of the Google engine, but many other rankings are available for other types of data. Z-order is useful in graphics, since it is the order in which geometrical objects are displayed on the screen [14]. Records in databases can be returned in the order of their physical location (to minimize disk seek time) or according to a time order (e.g. press news). Positions in biological sequences can be ran... |

12 |
Dynamic rectangular intersection with priorities
- Kaplan, Molad, et al.
- 2003
Citation Context ...g the y-axis but it does not indeed discuss how to report the items in sorted order along the y-axis. An indirect form of ranking can be found in the (dynamic) rectangular intersection with priorities [15] and in the document listing problem [21]. For our class of output-sensitive data structures, we can formulate the ranking problem as a geometric problem. We are given a (dynamic) list L of n entries,... |

10 | Making data structures confluently persistent
- Fiat, Kaplan
- 2001
Citation Context ...n of [10], in O(k) time, we can retrieve the top k best-ranking items in O(k + lg n) time in unsorted order. Improvements to get O(k) time can be made using scaling [12] or persistent data structures [6, 8, 16]. Subsequent sorting reports the items in O(k lg k) time using O(n) words of memory. What if we adopt the above solution in a real-time setting? Think of a server that provides items in rank order on ... |

1 |
Class notes CSC 2429F: Dynamic data structures
- Fich
- 2003
Citation Context ... two or three memory words and still supports constant-time operations. Our implementation based on lookup tables is quite simple and does not make use of multiplications or special instructions (see [9, 26] for a thorough discussion of this topic). We first describe a simpler version (to be later extended) supporting the following: – Create a heap for a given... |

1 |
2-D spatial indexing scheme in optimal time
- Kitsios, Sioutas, et al.
- 2000
Citation Context ...n of [10], in O(k) time, we can retrieve the top k best-ranking items in O(k + lg n) time in unsorted order. Improvements to get O(k) time can be made using scaling [12] or persistent data structures [6, 8, 16]. Subsequent sorting reports the items in O(k lg k) time using O(n) words of memory. What if we adopt the above solution in a real-time setting? Think of a server that provides items in rank order on ... |