Results 1-10 of 34
External Memory Data Structures
, 2001
Cited by 81 (36 self)
In many massive dataset applications the data must be stored in space- and query-efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
Towards Optimal Locality in Mesh-Indexings
, 1997
Cited by 31 (4 self)
The efficiency of many data structures and algorithms relies on "locality-preserving" indexing schemes for meshes. We concentrate on the case in which the maximal distance between two mesh nodes indexed i and j shall be a slow-growing function of |i - j|. We present a new 2D indexing scheme we call H-indexing, which has superior (possibly optimal) locality in comparison with the well-known Hilbert indexings. H-indexings form a Hamiltonian cycle and we prove that they are optimally locality-preserving among all cyclic indexings. We provide fairly tight lower bounds for indexings without any restriction. Finally, illustrated by investigations concerning 2D and 3D Hilbert indexings, we present a framework for mechanizing upper bound proofs for locality.
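The locality measure mentioned in this abstract can be explored empirically. The sketch below is not taken from the paper: it uses the textbook iterative construction of the 2D Hilbert curve (not the H-indexing, which the abstract does not define), and the names `hilbert_d2xy` and `worst_locality_ratio` are hypothetical. It reports the largest observed value of squared Euclidean distance divided by |i - j| on a small grid.

```python
def hilbert_d2xy(order, d):
    """Map index d in [0, 4**order) to (x, y) on a 2**order x 2**order grid.

    Standard iterative Hilbert curve construction (an assumption of this
    sketch; the paper may use a different orientation of the curve).
    """
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves connect end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def worst_locality_ratio(order):
    """Largest (squared Euclidean distance) / |i - j| over all index pairs."""
    pts = [hilbert_d2xy(order, d) for d in range(4 ** order)]
    worst = 0.0
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            dx = pts[i][0] - pts[j][0]
            dy = pts[i][1] - pts[j][1]
            worst = max(worst, (dx * dx + dy * dy) / (j - i))
    return worst


if __name__ == "__main__":
    for order in (1, 2, 3, 4):
        print(order, worst_locality_ratio(order))
```

The brute-force pair loop is quadratic in the number of cells, so it is only meant for small orders.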
Parallel Domain Decomposition and Load Balancing Using Space-Filling Curves
 in Proceedings of the 4th IEEE Conference on High Performance Computing
, 1997
Cited by 23 (2 self)
Partitioning techniques based on space-filling curves have received much recent attention due to their low running time and good load balance characteristics. The basic idea underlying these methods is to order the multidimensional data according to a space-filling curve and partition the resulting one-dimensional order. However, space-filling curves are defined for points that lie on a uniform grid of a particular resolution. It is typically assumed that the coordinates of the points are representable using a fixed number of bits, and the runtimes of the algorithms depend upon the number of bits used. In this paper, we present a simple and efficient technique for ordering arbitrary and dynamic multidimensional data using space-filling curves and its application to parallel domain decomposition and load balancing. Our technique is based on a comparison routine that determines the relative position of two points in the order induced by a space-filling curve. The comparison routine could then be used...
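To illustrate what a comparison-based ordering can look like, here is a minimal sketch for the Z-order (Morton) curve on non-negative integer coordinates. It is a well-known bit trick swapped in for illustration, not the routine from the paper, which targets arbitrary and dynamic data; `less_msb` and `z_compare` are hypothetical names, and dimension 0 is assumed to be the most significant dimension at each bit level.

```python
from functools import cmp_to_key


def less_msb(a, b):
    """True if the most significant set bit of a is lower than that of b."""
    return a < b and a < (a ^ b)


def z_compare(p, q):
    """Compare two points in Z-order without building interleaved keys.

    Returns -1, 0 or 1. The dimension whose coordinates differ in the
    highest bit decides the order; ties between dimensions go to the
    lower dimension index (an assumption of this sketch).
    """
    best_dim = 0
    best_xor = 0
    for dim in range(len(p)):
        diff = p[dim] ^ q[dim]
        if less_msb(best_xor, diff):
            best_dim = dim
            best_xor = diff
    if p[best_dim] == q[best_dim]:
        return 0
    return -1 if p[best_dim] < q[best_dim] else 1


if __name__ == "__main__":
    points = [(3, 1), (0, 2), (2, 2), (1, 0), (0, 0)]
    print(sorted(points, key=cmp_to_key(z_compare)))
```

Because only a comparator is needed, any comparison sort or balanced search tree can maintain the order while points are inserted and deleted.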
Incremental Constructions con BRIO
, 2003
Cited by 20 (0 self)
Randomized incremental constructions are widely used in computational geometry, but they perform very badly on large data because of their inherently random memory access patterns. We define a biased randomized insertion order which removes enough randomness to significantly improve performance, but leaves enough randomness so that the algorithms remain theoretically optimal.
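The abstract does not say how the biased order is built, so the following is only one plausible sketch. It assumes that points are assigned to rounds by repeated fair coin flips (about half of the remaining points go to the latest round each time) and that each round is then sorted along a Z-order curve to improve memory locality. The names `morton_key` and `biased_insertion_order`, the 2D integer coordinates, and the choice of Z-order are all assumptions of this example.

```python
import random


def morton_key(p, bits=16):
    """Interleave the bits of a 2D point with non-negative integer coordinates."""
    x, y = p
    k = 0
    for b in range(bits - 1, -1, -1):
        k = (k << 2) | (((x >> b) & 1) << 1) | ((y >> b) & 1)
    return k


def biased_insertion_order(points, seed=0):
    """Return the points grouped into random rounds, each round sorted spatially."""
    rng = random.Random(seed)
    rounds = []
    remaining = list(points)
    while len(remaining) > 1:
        later, earlier = [], []
        for p in remaining:
            # Each point joins the latest open round with probability 1/2.
            (later if rng.random() < 0.5 else earlier).append(p)
        rounds.append(later)
        remaining = earlier
    rounds.append(remaining)
    rounds.reverse()  # earliest (smallest) rounds are inserted first
    order = []
    for group in rounds:
        order.extend(sorted(group, key=morton_key))
    return order


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    print(biased_insertion_order(grid, seed=42))
```

The randomness decides only which round a point belongs to; the order inside a round is deterministic and spatially coherent, which is where the improved access pattern would come from.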
High Dimensional Similarity Search With Space Filling Curves
 In Proceedings of the 17th International Conference on Data Engineering
, 2000
Cited by 15 (1 self)
We present a new approach for approximate nearest neighbor queries for sets of high dimensional points under any L_t metric, t = 1,2,3,... The proposed algorithm is efficient and simple to implement. The algorithm uses multiple shifted copies of the data points and stores them in up to (d + 1) B-trees where d is the dimensionality of the data, sorted according to their position along a space filling curve. This is done in a way that allows us to guarantee that a neighbor within an O(d^(1+1/t)) factor of the exact nearest can be returned with at most (d + 1) log_p n page accesses, where p is the branching factor of the B-trees. In practice, for real data sets, our approximate technique finds the exact nearest neighbor between 87% and 99% of the time and a point no farther than the third nearest neighbor between 98% and 100% of the time. Our solution is dynamic, allowing insertion or deletion of points in O(d log_p n) page accesses and generalizes easily to find approximate knea...
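The shifted-copies idea can be sketched in a few lines, under several assumptions that are not in the abstract: sorted Python lists stand in for the B-trees, a Z-order key stands in for the space-filling curve, copy j is shifted by j/(d + 1) of the grid side in every coordinate, and squared Euclidean distance is used. The class `ShiftedCurveIndex` and its parameters are hypothetical, and the sketch makes no claim about the approximation factor.

```python
import bisect


def morton_key(p, bits):
    """Interleave the bits of a d-dimensional non-negative integer point."""
    k = 0
    for b in range(bits - 1, -1, -1):
        for c in p:
            k = (k << 1) | ((c >> b) & 1)
    return k


class ShiftedCurveIndex:
    """d + 1 shifted copies of the data, each sorted along a Z-order curve."""

    def __init__(self, points, side_bits=8):
        # Coordinates are assumed to lie in [0, 2**side_bits); points non-empty.
        d = len(points[0])
        side = 1 << side_bits
        self.bits = side_bits + 1  # one extra bit leaves room for the shift
        self.shifts = [(j * side) // (d + 1) for j in range(d + 1)]
        self.lists = []
        for s in self.shifts:
            entries = sorted(
                (morton_key(tuple(c + s for c in p), self.bits), p) for p in points
            )
            self.lists.append(entries)

    def query(self, q):
        """Return the closest candidate among curve neighbors in every copy."""
        best = None
        for s, entries in zip(self.shifts, self.lists):
            k = morton_key(tuple(c + s for c in q), self.bits)
            pos = bisect.bisect_left(entries, (k,))
            for idx in (pos - 1, pos):  # predecessor and successor on the curve
                if 0 <= idx < len(entries):
                    cand = entries[idx][1]
                    dist = sum((a - b) ** 2 for a, b in zip(cand, q))
                    if best is None or dist < best[0]:
                        best = (dist, cand)
        return best[1]


if __name__ == "__main__":
    index = ShiftedCurveIndex([(3, 7), (200, 15), (64, 64), (255, 255)])
    print(index.query((60, 70)))
```

Each query touches only two entries per copy, which mirrors the logarithmic lookup cost per tree described in the abstract; a real implementation would replace the lists with disk-based B-trees.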
On Multi-Dimensional Hilbert Indexings
 Theory of Computing Systems
, 1998
Cited by 13 (1 self)
Indexing schemes for grids based on space-filling curves (e.g., Hilbert indexings) find applications in numerous fields, ranging from parallel processing over data structures to image processing. Because of an increasing interest in discrete multidimensional spaces, indexing schemes for them have won considerable interest. Hilbert curves are the most simple and popular space-filling indexing scheme. We extend the concept of curves with Hilbert property to arbitrary dimensions and present first results concerning their structural analysis that also simplify their applicability. We define and analyze in a precise mathematical way r-dimensional Hilbert indexings for arbitrary r ≥ 2. Moreover, we generalize and simplify previous work and clarify the concept of Hilbert curves for multidimensional grids. As we show, Hilbert indexings can be completely described and analyzed by "generating elements of order 1", thus, in comparison with previous work, reducing their structural comp...
On Multidimensional Curves with Hilbert Property
, 2000
Cited by 9 (0 self)
Indexing schemes for grids based on space-filling curves (e.g., Hilbert curves) find applications in numerous fields, ranging from parallel processing over data structures to image processing. Because of an increasing interest in discrete multidimensional spaces, indexing schemes for them have won considerable interest. Hilbert curves are the most simple and popular space-filling indexing schemes. We extend the concept of curves with Hilbert property to arbitrary dimensions and present first results concerning their structural analysis that also simplify their applicability.
Hybrid overlay structure based on random walks
 In Proc. of the 4th Intl. Workshop on Peer-to-Peer Systems (IPTPS'05)
, 2005
Cited by 9 (1 self)
Application-level multicast on structured overlays often suffers from several drawbacks: 1) The regularity of the architecture makes it difficult to adapt to topology changes; 2) the uniformity of the protocol generally does not consider node heterogeneity. It would be ideal to combine the scalability of these overlays with the flexibility of an unstructured topology. In this paper, we propose a locality-aware hybrid overlay that combines the scalability and interface of a structured network with the connection flexibility of an unstructured network. Nodes self-organize into structured clusters based on network locality, while connections between clusters are created adaptively through random walks. Simulations show that this structure is efficient in both delay and bandwidth. The network also supports the scalable fast rendezvous interface provided by structured overlays, resulting in fast membership operations.
On the Quality of Partitions based on Space-Filling Curves
, 2002
Cited by 7 (1 self)
This paper presents bounds on the quality of partitions induced by space-filling curves. We compare the surface that surrounds an arbitrary index range with the optimal partition in the grid, i.e., the square. It is shown that partitions induced by Lebesgue and Hilbert curves behave about 1.85 times worse with respect to the length of the surface. The Lebesgue indexing gives better results than the Hilbert indexing in worst case analysis. Furthermore, the surface of partitions based on the Lebesgue indexing is at most 3 times larger than the optimal in the average case.
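The comparison described in this abstract can be tried on small cases. The sketch below assumes a 2D grid, takes the Lebesgue indexing to be the usual bit-interleaved Z-order, counts the surface as the number of unit cell edges not shared with another cell of the partition, and uses the perimeter of a square of equal area as the reference. These conventions and the names `z_d2xy`, `partition_surface` and `surface_ratio` are assumptions of this example and may differ from the paper's definitions.

```python
import math


def z_d2xy(d, bits):
    """Decode a Z-order (Lebesgue) index into (x, y) by de-interleaving bits."""
    x = y = 0
    for b in range(bits):
        x |= ((d >> (2 * b)) & 1) << b
        y |= ((d >> (2 * b + 1)) & 1) << b
    return x, y


def partition_surface(cells):
    """Count unit edges of the cell set that border a cell outside the set."""
    cells = set(cells)
    surface = 0
    for x, y in cells:
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb not in cells:
                surface += 1
    return surface


def surface_ratio(start, stop, bits):
    """Surface of the index range [start, stop) relative to an equal-area square."""
    cells = [z_d2xy(d, bits) for d in range(start, stop)]
    return partition_surface(cells) / (4 * math.sqrt(len(cells)))


if __name__ == "__main__":
    bits = 4
    total = 4 ** bits
    worst = max(
        surface_ratio(a, b, bits)
        for a in range(total)
        for b in range(a + 1, total + 1)
    )
    print(worst)
```

An aligned index range such as the first 16 cells forms an exact square and gives a ratio of 1, while ranges that straddle quadrant boundaries fall apart into disconnected pieces and score worse.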
Scanning and sequential decision making for multidimensional data - part I: the noiseless case
 IEEE Trans. on Inform. Theory
Cited by 6 (2 self)
We consider the problem of sequential decision making on random fields corrupted by noise. In this scenario, the decision maker observes a noisy version of the data, yet is judged with respect to the clean data. In particular, we first consider the problem of sequentially scanning and filtering noisy random fields. In this case, the sequential filter is given the freedom to choose the path over which it traverses the random field (e.g., noisy image or video sequence), thus it is natural to ask what is the best achievable performance and how sensitive this performance is to the choice of the scan. We formally define the problem of scanning and filtering, derive a bound on the best achievable performance and quantify the excess loss occurring when non-optimal scanners are used, compared to optimal scanning and filtering. We then discuss the problem of sequential scanning and prediction of noisy random fields. This setting is a natural model for applications such as restoration and coding of noisy images. We formally define the problem of scanning and prediction of a noisy multidimensional array and relate the optimal performance to the clean scandictability defined by Merhav and Weissman. Moreover, bounds on the excess loss due to suboptimal scans are derived, and a universal prediction algorithm is suggested.