Fast and accurate estimation of shortest paths in large graphs
- In Proceedings of the Conference on Information and Knowledge Management (CIKM), 2010
Cited by 28 (1 self)
Computing shortest paths between two given nodes is a fundamental operation over graphs, but known to be nontrivial over large disk-resident instances of graph data. While a number of techniques exist for answering reachability queries and approximating node distances efficiently, determining actual shortest paths (i.e. the sequence of nodes involved) is often neglected. However, in applications arising in massive online social networks, biological networks, and knowledge graphs it is often essential to find out many, if not all, shortest paths between two given nodes. In this paper, we address this problem and present a scalable sketch-based index structure that not only supports estimation of node distances, but also computes the corresponding shortest paths themselves. Generating the actual path information allows for further improvements to the estimation accuracy of distances (and paths), leading to near-exact shortest-path approximations in real-world graphs. We evaluate our techniques, implemented within a fully functional RDF graph database system, over large real-world social and biological networks of sizes ranging from tens of thousands to millions of nodes and edges. Experiments on several datasets show that we can achieve query response times providing several orders of magnitude speedup over traditional path computations while keeping the estimation errors between 0% and 1% on average.
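The abstract above does not spell out the index itself. As a rough illustration of the general idea of a path-preserving sketch, the following minimal Python sketch stores, for a few landmark nodes, a breadth-first tree over an unweighted graph and answers a query by joining the two stored paths at the point where they first meet. The function names, the landmark choice, and the adjacency-dict graph format are all assumptions made for this example; this is not the paper's actual data structure.

```python
from collections import deque


def bfs_parents(adj, root):
    """BFS from root; parent[v] is the next hop from v toward root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent


def path_to_root(parent, v):
    """Return the stored path [v, ..., root]."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def build_sketch(adj, landmarks):
    """Hypothetical index: one BFS tree per landmark."""
    return {l: bfs_parents(adj, l) for l in landmarks}


def estimate_path(sketch, u, v):
    """Return a u-v path through some landmark tree (an upper bound on the
    true distance), or None if no landmark reaches both nodes."""
    best = None
    for parent in sketch.values():
        if u not in parent or v not in parent:
            continue
        pu = path_to_root(parent, u)  # u ... landmark
        pv = path_to_root(parent, v)  # v ... landmark
        # Having the actual paths lets us cut the detour: both paths live in
        # the same tree, so drop the shared tail and join where they meet.
        while len(pu) > 1 and len(pv) > 1 and pu[-2] == pv[-2]:
            pu.pop()
            pv.pop()
        candidate = pu + list(reversed(pv[:-1]))
        if best is None or len(candidate) < len(best):
            best = candidate
    return best
```

For example, on the tree with edges 0-1, 1-2, 1-3, 3-4 and landmark 4, the query (0, 2) returns the path 0, 1, 2 rather than the six-hop detour through the landmark, which shows how stored paths can tighten a distance estimate.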
Range Selection and Median: Tight Cell Probe Lower Bounds and Adaptive Data Structures
Cited by 22 (5 self)
Range selection is the problem of preprocessing an input array A of n unique integers, such that given a query (i, j, k), one can report the k’th smallest integer in the subarray A[i], A[i + 1], ..., A[j]. In this paper we consider static data structures in the word-RAM for range selection and several natural special cases thereof. The first special case is known as range median, which arises when k is fixed to ⌊(j − i + 1)/2⌋. The second case, denoted prefix selection, arises when i is fixed to 0. Finally, we also consider the bounded rank prefix selection problem and the fixed rank range selection problem. In the former, data structures must support prefix selection queries under the assumption that k ≤ κ for some value κ ≤ n given at construction time, while in the latter, data structures must support range selection queries where k is fixed beforehand for all queries. We prove cell probe lower bounds for range selection, prefix selection and range median, stating that any data structure that uses S words of space needs Ω(log n / log(Sw/n)) time to answer a query. In particular, any data structure that uses n log^{O(1)} n space needs Ω(log n / log log n) time to answer a query, and any data structure that supports queries in constant time needs n^{1+Ω(1)} space. For data structures that use n log^{O(1)} n space this matches the best known upper bound. Additionally, we present a linear space data structure that supports range selection queries in O(log k / log log n + log log n) time. Finally, we prove that any data structure that uses S space needs Ω(log κ / log(Sw/n)) time to answer a bounded rank prefix selection query and Ω(log k / log(Sw/n)) time to answer a fixed rank range selection query. This shows that our data structure is optimal except for small values of k.
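To make the query semantics in the abstract above concrete, here is a naive Python baseline that answers each query by sorting the subarray. It only illustrates what range selection, range median, and prefix selection return; it is not the paper's data structure and has none of its time bounds. The function names, the 1-indexed rank k, and the clamping of the median rank to at least 1 are assumptions made for this example.

```python
def range_select(a, i, j, k):
    """Return the k-th smallest (k is 1-indexed) element of a[i..j], inclusive.

    Naive baseline: sorts the subarray on every query.
    """
    if not (0 <= i <= j < len(a)):
        raise IndexError("need 0 <= i <= j < len(a)")
    if not (1 <= k <= j - i + 1):
        raise ValueError("need 1 <= k <= j - i + 1")
    return sorted(a[i:j + 1])[k - 1]


def range_median(a, i, j):
    """Range median: k fixed to floor((j - i + 1) / 2).

    The rank is clamped to 1 so that one-element ranges are valid
    (an assumption of this sketch, not taken from the abstract).
    """
    k = max(1, (j - i + 1) // 2)
    return range_select(a, i, j, k)


def prefix_select(a, j, k):
    """Prefix selection: i fixed to 0."""
    return range_select(a, 0, j, k)
```

With a = [5, 1, 4, 2, 3], the query (1, 3, 2) inspects the subarray 1, 4, 2 and reports 2, its second smallest element.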
Linear-Space Approximate Distance Oracles for Planar, Bounded-Genus, and Minor-Free Graphs
Cited by 12 (6 self)
Abstract. A (1 + ɛ)-approximate distance oracle for a graph is a data structure that supports approximate point-to-point shortest-path-distance queries. The relevant measures for a distance-oracle construction are: space, query time, and preprocessing time. There are strong distance-oracle constructions known for planar graphs (Thorup) and, subsequently, minor-excluded graphs (Abraham and Gavoille). However, these require Ω(ɛ^{−1} n lg n) space for n-node graphs. We argue that a very low space requirement is essential. Since modern computer architectures involve hierarchical memory (caches, primary memory, secondary memory), a high memory requirement in effect may greatly increase the actual running time. Moreover, we would like data structures that can be deployed on small mobile devices, such as handhelds, which have relatively small primary memory. In this paper, for planar graphs, bounded-genus graphs, and minor-excluded graphs we give distance-oracle constructions that require only
Approximate Distance Queries and Compact Routing in Sparse Graphs
Cited by 11 (5 self)
Abstract. An approximate distance query data structure is a compact representation of a graph, and can be queried to approximate shortest paths between any pair of vertices. Any such data structure that retrieves stretch 2k−1 paths must require space Ω(n^{1+1/k}) for graphs of n nodes. The hard cases that enforce this lower bound are, however, rather dense graphs with average degree Ω(n^{1/k}). We present data structures that, for sparse graphs, substantially break that lower bound barrier at the expense of higher query time. For instance, general graphs require O(n^{3/2}) space and constant query time for stretch 3 paths. For the realistic scenario of a graph with average degree Θ(log n), special cases of our data structures retrieve stretch 2 paths with Õ(n^{3/2}) space and stretch 3 paths with Õ(n) space, albeit at the cost of Õ(√n) query time. Moreover, supported by large-scale simulations on graphs including the AS-level Internet graph, we argue that our stretch-2 scheme would be simple and efficient to implement as a distributed compact routing protocol.
Higher cell probe lower bounds for evaluating polynomials
- In Proc. 53rd IEEE Symposium on Foundations of Computer Science, 2012
Cited by 9 (5 self)
Abstract. In this paper, we study the cell probe complexity of evaluating an n-degree polynomial P over a finite field F of size at least n^{1+Ω(1)}. More specifically, we show that any static data structure for evaluating P(x), where x ∈ F, must use Ω(lg |F| / lg(Sw/(n lg |F|))) cell probes to answer a query, where S denotes the space of the data structure in number of cells and w the cell size in bits. This bound holds in expectation for randomized data structures with any constant error probability δ < 1/2. Our lower bound not only improves over the Ω(lg |F| / lg S) lower bound of Miltersen [TCS’95], but is in fact the highest static cell probe lower bound to date: For linear space (i.e. S = O(n lg |F| / w)), our query time lower bound simplifies to Ω(lg |F|), whereas the highest previous lower bound for any static data structure problem having d different queries is Ω(lg d / lg lg d), which was first achieved by Pǎtraşcu and Thorup [SICOMP’10]. We also use the recent technique of Larsen [STOC’12] to show a lower bound of t_q = Ω(lg |F| lg n / (lg(w t_u / lg |F|) lg(w t_u))) for dynamic data structures for polynomial evaluation over a finite field F of size Ω(n^2). Here t_q denotes the expected query time and t_u the worst case update time. This lower bound holds for randomized data structures with any constant error probability δ < 1/2. This is only the second time a lower bound beyond max{t_u, t_q} = Ω(max{lg n, lg d / lg lg d}) has been achieved for dynamic data structures, where d denotes the number of different queries and updates to the problem. Furthermore, it is the first such lower bound that holds for randomized data structures with a constant probability of error. Keywords: cell probe model, lower bounds, data structures, polynomials
Exact Distance Oracles for Planar Graphs
- 2010
Cited by 9 (5 self)
We provide the first linear-space data structure with provable sublinear query time for exact point-to-point shortest path queries in planar graphs. We prove that for any planar graph G with non-negative arc lengths and for any ɛ > 0 there is a data structure that supports exact shortest path and distance queries in G with the following properties: the data structure can be created in time O(n lg(n) lg(1/ɛ)), the space required is O(n lg(1/ɛ)), and the query time is O(n^{1/2+ɛ}). Previous data structures by Fakcharoenphol and Rao (JCSS’06), Klein, Mozes, and Weimann (TransAlg’10), and Mozes and Wulff-Nilsen (ESA’10) with query time O(n^{1/2} lg^2 n) use space at least Ω(n lg n / lg lg n). We also give a construction with a more general tradeoff. We prove that for any integer S ∈ [n lg n, n^2], we can construct in time Õ(S) a data structure of size O(S) that answers distance queries in O(n S^{−1/2} lg^{2.5} n) time per query. Cabello (SODA’06) gave a comparable construction for the smaller range S ∈ [n^{4/3} lg^{1/3} n, n^2]. For the range S ∈ (n lg n, n^{4/3} lg^{1/3} n), only data structures of size O(S) with query time O(n^2 / S) had been known (Djidjev, WG’96). Combined, our results give the best query times for any shortest-path data structure for planar graphs with space S = o(n^{4/3} lg^{1/3} n). As a consequence, we also obtain an algorithm that computes k-many distances in planar graphs in time O((kn)^{2/3} (lg n)^2 (lg lg n)^{−1/3} + n (lg n)^2 / lg lg n).
Distance Oracles for Vertex-Labeled Graphs
Cited by 7 (0 self)
Abstract. Given a graph G = (V, E) with non-negative edge lengths whose vertices are assigned a label from L = {λ_1, ..., λ_ℓ}, we construct a compact distance oracle that answers queries of the form: “What is δ(v, λ)?”, where v ∈ V is a vertex in the graph, λ ∈ L a vertex label, and δ(v, λ) is the distance (length of a shortest path) between v and the closest vertex labeled λ in G. We formalize this natural problem and provide a hierarchy of approximate distance oracles that require subquadratic space and return a distance of constant stretch. We also extend our solution to dynamic oracles that handle label changes in sublinear time.
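The query δ(v, λ) defined in the abstract above can be pinned down with an exact, non-compact baseline: run Dijkstra's algorithm from all vertices carrying the label λ at once. The Python sketch below does this for an undirected graph given as an adjacency dict. It shows only what the query means, not the paper's oracle, which is approximate and uses subquadratic space. The function name and the input format are assumptions made for this example.

```python
import heapq


def label_distances(adj, labels, target):
    """Return {v: δ(v, target)} for every vertex that can reach the label.

    adj:    {u: [(v, length), ...]}, undirected, non-negative lengths
    labels: {vertex: label}
    Multi-source Dijkstra seeded with every vertex labeled `target`.
    """
    dist = {v: 0 for v, lab in labels.items() if lab == target}
    heap = [(0, v) for v in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for w, length in adj[u]:
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist
```

A single run answers δ(v, λ) for one label λ and all vertices v. Precomputing this for every label takes space proportional to the number of vertices times the number of labels, which is the cost a compact oracle aims to avoid.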
On approximate distance labels and routing schemes with affine stretch
- In International Symposium on Distributed Computing (DISC), 2011
Cited by 5 (0 self)
For every integral parameter k > 1, given an unweighted graph G, we construct in polynomial time, for each vertex u, a distance label L(u) of size Õ(n^{2/(2k−1)}). For any u, v ∈ G, given L(u), L(v) we can return in time O(k) an affine approximation d̂(u, v) of the distance d(u, v) between u and v in G such that d(u, v) ≤ d̂(u, v) ≤ (2k − 2)d(u, v) + 1. Hence we say that our distance label scheme has affine stretch of (2k − 2)d + 1. For k = 2 our construction is comparable to the O(n^{5/3}) size, 2d + 1 affine stretch of the distance oracle of Pǎtraşcu and Roditty (FOCS ’10); it incurs an o(log n) storage overhead while providing the benefits of a distance label. For any k > 1, given a restriction of o(n^{1+1/(k−1)}) on the total size of the data structure, our construction provides distance labels with affine stretch of (2k − 2)d + 1, which is better than the stretch (2k − 1)d scheme of Thorup and Zwick (J. ACM ’05). Our second contribution is a compact routing scheme with poly-logarithmic addresses that provides affine stretch guarantees. With Õ(n^{3/(3k−2)})-bit routing tables we obtain affine stretch of (4k − 6)d + 1, for any k > 1. Given a restriction of o(n^{1/(k−1)}) on the table size, our routing scheme provides affine stretch which is better than the stretch (4k − 5)d routing scheme of Thorup and Zwick (SPAA ’01).
A compact routing scheme and approximate distance oracle for power-law graphs
- 2009
Cited by 5 (1 self)
Abstract. Compact routing addresses the tradeoff between table sizes and stretch, which is the worst-case ratio between the length of the path a packet is routed through by the scheme and the length of a shortest path from source to destination. We adapt the compact routing scheme by Thorup and Zwick to optimize it for power-law graphs. We analyze our adapted routing scheme based on the theory of unweighted random power-law graphs with fixed expected degree sequence by Aiello, Chung, and Lu. Our result is the first theoretical bound coupled to the parameter of the power-law graph model for a compact routing scheme. In particular, we prove that, for stretch 3, instead of routing tables with Õ(n^{1/2}) bits as in the general scheme by Thorup and Zwick, expected sizes of O(n^γ log n) bits are sufficient, and that all the routing tables can be constructed at once in expected time O(n^{1+γ} log n), with γ = (τ − 2)/(2τ − 3) + ε, where τ ∈ (2, 3) is the power-law exponent and ε > 0 (which implies ε < γ < 1/3 + ε). Both bounds also hold with probability at least 1 − 1/n (independent of ε). The routing scheme is a labeled scheme, requiring a stretch-5 handshaking step and using addresses and message headers with O(log n log log n) bits, with probability at least 1 − o(1). We further demonstrate the effectiveness of our scheme by simulations on real-world graphs as well as synthetic power-law graphs. With the same techniques as for the compact routing scheme, we also adapt the approximate distance oracle by Thorup and Zwick for stretch 3 and obtain a new upper bound of expected Õ(n^{1+γ}) for space and preprocessing for random power-law graphs.
More compact oracles for approximate distances in undirected planar graphs
- In SODA ’13, 2013
Cited by 4 (2 self)
Distance oracles are data structures that provide fast (possibly approximate) answers to shortest-path and distance queries in graphs. The tradeoff between the space requirements and the query time of distance oracles is of particular interest and the main focus of this paper. In FOCS ’01, Thorup introduced approximate distance oracles for planar graphs. He proved that, for any ε > 0 and for any planar graph on n nodes, there exists a (1 + ε)-approximate distance oracle using space O(n ε^{−1} log n) such that approximate distance queries can be answered in time O(ε^{−1}). Ten years later, we give the first improvements on the space–query time tradeoff for planar graphs.
• We give the first oracle having a space–time product with subquadratic dependency on 1/ε. For space Õ(n log n) we obtain query time Õ(ε^{−1}) (assuming polynomial edge weights). We believe that the dependency on ε may be almost optimal.
• For the case of moderate edge weights (average bounded by poly(log n), which appears to be the case for many real-world road networks), we hit a “sweet spot,” improving upon Thorup’s oracle both in terms of ε and n. Our oracle uses space Õ(n log log n) and it has query time Õ(ε^{−1} + log log log n).
(Notation: Õ(·) hides low-degree polynomials in log(1/ε) and log*(n).)