Results 1-10 of 20
Scalable Network Distance Browsing in Spatial Databases
2008
Cited by 46 (8 self)
An algorithm is presented for finding the k nearest neighbors in a spatial network in a best-first manner using network distance. The algorithm is based on precomputing the shortest paths between all possible vertices in the network and then making use of an encoding that takes advantage of the fact that the shortest paths from vertex u to all of the remaining vertices can be decomposed into subsets based on the first edges on the shortest paths to them from u. Thus, in the worst case, the amount of work depends on the number of objects that are examined and the number of links on the shortest paths to them from q, rather than depending on the number of vertices in the network. The amount of storage required to keep track of the subsets is reduced by taking advantage of their spatial coherence, which is captured with the aid of a shortest path quadtree. In particular, experiments on a number of large road networks as
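The first-edge decomposition described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it runs a plain Dijkstra search from one vertex and groups every reachable vertex by the first edge of a shortest path to it. The function name `first_hop_partition`, the adjacency-dict graph format, and the toy graph are all assumptions made for illustration, and the shortest path quadtree that compresses these groups is not shown.

```python
import heapq
from collections import defaultdict

def first_hop_partition(graph, u):
    """Group vertices reachable from u by the first hop of a shortest path to them.

    graph is an adjacency dict {vertex: {neighbor: positive_weight}} (assumed format).
    Returns (groups, dist): groups maps each neighbor w of u to the set of
    destinations whose shortest path from u starts with edge (u, w).
    """
    dist = {u: 0.0}
    first = {}                       # destination -> first hop out of u
    heap = [(0.0, u, None)]
    done = set()
    while heap:
        d, v, hop = heapq.heappop(heap)
        if v in done:
            continue                 # stale heap entry
        done.add(v)
        if hop is not None:
            first[v] = hop
        for w, weight in graph[v].items():
            nd = d + weight
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                # Leaving u fixes the first hop; afterwards it is inherited.
                heapq.heappush(heap, (nd, w, w if v == u else hop))
    groups = defaultdict(set)
    for v, hop in first.items():
        groups[hop].add(v)
    return dict(groups), dist

# Hypothetical toy network: the direct edge a-c is longer than a-b-c.
graph = {
    "a": {"b": 1, "c": 4, "e": 1},
    "b": {"a": 1, "c": 1, "d": 5},
    "c": {"a": 4, "b": 1, "d": 1},
    "d": {"b": 5, "c": 1},
    "e": {"a": 1},
}
groups, dist = first_hop_partition(graph, "a")
```

In this toy graph every destination except e is reached through the edge (a, b), so the vertices fall into two groups. The abstract's point is that such groups tend to be spatially contiguous in road networks, which is what makes them compressible.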
Path Oracles for Spatial Networks
2009
Cited by 11 (5 self)
The advent of location-based services has led to an increased demand for performing operations on spatial networks in real time. The challenge lies in being able to cast operations on spatial networks in terms of relational operators so that they can be performed in the context of a database. A linear-sized construct termed a path oracle is introduced that compactly encodes the n^2 shortest paths between every pair of vertices in a spatial network having n vertices, thereby reducing each of the paths to a single tuple in a relational database, and enables finding shortest paths by repeated application of a single SQL SELECT operator. The construction of the path oracle is based on the observed coherence between the spatial positions of both source and destination vertices and the shortest paths between them, which facilitates the aggregation of source and destination vertices into groups that share common vertices or edges on the shortest paths between them. With the aid of the Well-Separated Pair (WSP) technique, which has been applied to spatial networks using the network distance measure, a path oracle is proposed that takes O(s^d n) space, where s is empirically estimated to be around 12 for road networks, but that can retrieve an intermediate link in a shortest path in O(log n) time using a B-tree. An additional construct termed the path-distance oracle of size O(n · max(s^d, (1/ε)^d)) (empirically n · max(12^2, (2.5/ε)^2)) is proposed that can retrieve an intermediate vertex as well as an ε-approximation of the network distances in O(log n) time using a B-tree. Experimental results indicate that the proposed oracles are linear in n, which means that they are scalable and can enable complicated query processing scenarios on massive spatial network datasets.
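The idea of recovering a shortest path by repeatedly applying one SELECT can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration under stated assumptions, not the path oracle itself: the hypothetical table `next_hop` stores one tuple per (source, destination) pair, which takes quadratic space, whereas the oracle described above groups sources and destinations so that the number of tuples stays linear in n. The table name, column names, and toy edge list are invented for the example.

```python
import sqlite3

def build_next_hop_table(conn, edges):
    """Fill a hypothetical next_hop(src, dst, hop) table via Floyd-Warshall.

    edges is a list of (a, b, weight) for an undirected graph with positive weights.
    """
    vertices = sorted({v for a, b, _ in edges for v in (a, b)})
    inf = float("inf")
    dist = {(a, b): (0.0 if a == b else inf) for a in vertices for b in vertices}
    hop = {}
    for a, b, w in edges:
        for s, t in ((a, b), (b, a)):
            if w < dist[(s, t)]:
                dist[(s, t)] = w
                hop[(s, t)] = t
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[(i, k)] + dist[(k, j)] < dist[(i, j)]:
                    dist[(i, j)] = dist[(i, k)] + dist[(k, j)]
                    hop[(i, j)] = hop[(i, k)]   # path to j starts like path to k
    conn.execute(
        "CREATE TABLE next_hop (src TEXT, dst TEXT, hop TEXT, PRIMARY KEY (src, dst))"
    )
    conn.executemany(
        "INSERT INTO next_hop VALUES (?, ?, ?)",
        [(s, t, h) for (s, t), h in hop.items()],
    )

def shortest_path(conn, src, dst):
    """Rebuild the path by issuing the same SELECT once per link."""
    path = [src]
    while path[-1] != dst:
        row = conn.execute(
            "SELECT hop FROM next_hop WHERE src = ? AND dst = ?", (path[-1], dst)
        ).fetchone()
        if row is None:
            return None                          # dst unreachable from src
        path.append(row[0])
    return path

conn = sqlite3.connect(":memory:")
build_next_hop_table(
    conn, [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 4.0), ("c", "d", 1.0)]
)
path = shortest_path(conn, "a", "d")
```

Each lookup goes through the primary-key index, which SQLite stores as a B-tree, so the per-link cost is logarithmic in the number of tuples, loosely mirroring the O(log n) retrieval the abstract mentions.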
Distance Oracles for Spatial Networks
Cited by 10 (4 self)
The popularity of location-based services and the need to do real-time processing on them has led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge is that these operations involve the computation of distance along a spatial network rather than “as the crow flies.” In many applications an estimate of the distance is sufficient, which can be achieved by use of an oracle. An approximate distance oracle is proposed for spatial networks that exploits the coherence between the spatial position of vertices and the network distance between them. Using this observation, a distance oracle is introduced that is able to obtain the ε-approximate network distance between two vertices of the spatial network. The network distance between every pair of vertices in the spatial network is efficiently represented by adapting the well-separated pair technique to spatial networks. Initially, use is made of an ε-approximate distance oracle of size O(n/ε^d) that is capable of retrieving the approximate network distance in O(log n) time using a B-tree. The retrieval time can be theoretically reduced to O(1) time by proposing another ε-approximate distance oracle of size O((n log n)/ε^d) that uses a hash table. Experimental results indicate that the proposed technique is scalable and can be applied to sufficiently large road networks. A 10%-approximate oracle (ε = 0.1) on a large network yielded an average error of 0.9%, with 90% of the answers making an error of 2% or less, and an average retrieval time of 68 µs. Finally, a strategy for the integration of the distance oracle into any relational database system as well as using it to perform a variety of spatial queries such as region search, k-nearest neighbor search, and spatial joins on spatial networks is discussed.
Query processing using distance oracles for spatial networks
Best Papers of ICDE 2009 Special Issue
Cited by 4 (2 self)
The popularity of location-based services and the need to do real-time processing on them has led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of spatial operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. Techniques are described that enable the determination of the network distance between any pair of points (i.e., vertices) with as little as O(n) space rather than having to store the n^2 distances between all pairs. This is done by being willing to expend a bit more time to achieve this goal, such as O(log n) instead of O(1), as well as by accepting an error ε in the accuracy of the distance that is provided. The strategy that is adopted reduces the space requirements and is based on the ability to identify groups of source and destination vertices for which the distance is approximately the same within some ε. The reductions are achieved by introducing a construct termed a distance oracle that yields an estimate of the network distance (termed the ε-approximate distance) between any two vertices in the spatial network. The distance oracle is obtained by showing how to adapt the well-separated pair technique from computational geometry to spatial networks. Initially, an ε-approximate distance oracle of size O(n/ε^d) is used that is capable of retrieving the approximate network distance in O(log n) time using a B-tree. The retrieval time can be theoretically reduced further to O(1) time by proposing another ε-approximate distance oracle of size O(n log n
Online Document Clustering Using the GPU
2010
Cited by 3 (3 self)
Online document clustering takes as its input a list of document vectors, ordered by time. A document vector consists of a list of K terms and their associated weights. The generation of terms and their weights from the document text may vary, but the TF-IDF (term frequency-inverse document frequency) method is popular for clustering applications [1]. The assumption is that the resulting document vector is a good overall representation of the original document. We note that the dimensionality of the document vectors is very high (potentially infinite), since a document could potentially contain any word (term). We also note that the vectors are sparse in the sense that most term weights have a zero value. We assume that each term not explicitly present in a particular document vector has a weight of zero. Document vectors are normalized. Clusters are also represented as a list of weighted terms. At any given time, a cluster’s term vector is equal to the average of all the document vectors contained by the cluster. Cluster term vectors are truncated to the top K terms (those containing the highest term weights). Cluster term vectors are kept normalized. The objective of the algorithm is to partition the set of document vectors into a set of clusters, each cluster containing only those documents which are similar to each other with respect to some metric. For this paper, we consider the Euclidean dot product as the similarity metric, as it has been shown to provide good results with the TF-IDF metric [1]. The similarity between a cluster and a document is defined as the dot product between their term vectors. We first present a serial algorithm for online clustering. We then describe a PRAM algorithm for parallel online clustering, assuming a CRCW model. Finally, we present a practical implementation of an approximate parallel online clustering algorithm, suitable for the CUDA parallel computing architecture [2].
1. Serial Clustering
The basic serial online clustering algorithm takes as input a list of n document vectors, as well as a clustering threshold T ranging between 0 and 1. Below is a high-level overview of the algorithm.
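The overview itself is cut off in this listing, so the paper's exact steps are not available here. What follows is a minimal sketch of a common threshold-based scheme that fits the description above, and it should be read as an assumption rather than as the authors' algorithm. The cluster representation (average of member vectors, truncated to the top K terms, normalized) and the dot-product similarity come from the abstract. The assignment rule (join the most similar cluster if its similarity is at least T, otherwise start a new cluster), the function names, and the sample documents are invented for illustration.

```python
import math

def normalize(vec):
    """Scale a sparse term->weight dict to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)

def dot(a, b):
    """Sparse dot product; terms missing from a vector count as weight zero."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def online_cluster(docs, threshold, k):
    """Assign each document, in arrival order, to a cluster index.

    Assumed rule: join the most similar cluster when similarity >= threshold,
    otherwise open a new cluster.
    """
    clusters = []        # each: {"sum": term sums, "count": members, "vector": top-k unit vector}
    assignments = []
    for doc in docs:
        doc = normalize(doc)
        best, best_sim = None, -1.0
        for idx, cluster in enumerate(clusters):
            sim = dot(cluster["vector"], doc)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < threshold:
            clusters.append({"sum": {}, "count": 0, "vector": {}})
            best = len(clusters) - 1
        cluster = clusters[best]
        for term, weight in doc.items():
            cluster["sum"][term] = cluster["sum"].get(term, 0.0) + weight
        cluster["count"] += 1
        # Average of member vectors, truncated to the top k terms, then normalized.
        average = {t: w / cluster["count"] for t, w in cluster["sum"].items()}
        top = sorted(average.items(), key=lambda kv: kv[1], reverse=True)[:k]
        cluster["vector"] = normalize(dict(top))
        assignments.append(best)
    return assignments

# Hypothetical documents: two about pets, one about finance.
docs = [
    {"cat": 1.0, "dog": 1.0},
    {"cat": 1.0, "dog": 0.9},
    {"stock": 1.0, "bond": 1.0},
]
assignments = online_cluster(docs, threshold=0.5, k=4)
```

The inner loop over all clusters is the part a parallel version would distribute, since each cluster-document similarity is independent of the others.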
Roads Belong in Databases
Cited by 1 (1 self)
The popularity of location-based services and the need to perform real-time processing on them has led to an interest in queries on road networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. This requires the precomputation of the shortest paths and network distance between every pair of points (i.e., vertices) with as little space as possible rather than having to store the n^2 shortest paths and distances between all pairs. This problem is related to a ‘holy grail’ problem in databases of how to incorporate road networks into relational databases. A data structure called a road network oracle is introduced that resides in a database and enables the processing of many operations on road networks with just the aid of relational operators. Two implementations of road network oracles are presented.
Database and Representation Issues in Geographic Information Systems (GIS)
A review is provided of some database and representation issues involved in the implementation of geographic information systems (GIS). The increasing popularity of web-based mapping systems such as Microsoft Virtual Earth and Google Earth and Maps, as well as other software offerings that are coupled with portable devices, such as the iPhone, has led to a proliferation of services that are characterized as being location-based. The data provided by these services is differentiated from other offerings by the presence of a locational component. In the past, this type of data was found primarily in geographic information systems (GIS). The available technology led to a focus on the paper map as the output device for responses. Since anything is better than drawing by hand, there was little emphasis on efficiency measures such as minimization of execution time. However, the emergence of display devices has changed the mode of operation to one of expecting answers relatively quickly. This has had a number of effects. First, the paper medium supports relatively high resolution output while the display screen is usually of a more limited resolution, thereby enabling the use of less precise algorithms.
SORTING SPATIAL DATA BY SPATIAL OCCUPANCY
GEOSPATIAL VISUAL ANALYTICS: GEOGRAPHICAL INFORMATION PROCESSING AND VISUAL ANALYTICS FOR ENVIRONMENTAL SECURITY
2009
The increasing popularity of web-based mapping services such as Microsoft Virtual Earth and Google Maps/Earth has led to a dramatic increase in awareness of the importance of location as a component of data for the purposes of further processing as a means of enhancing the value of the nonspatial data and of visualization. Both of these purposes inevitably involve searching. The efficiency of searching is dependent on the extent to which the underlying data is sorted. The sorting is encapsulated by the data structure known as an index that is used to represent the spatial data, thereby making it more accessible. The traditional role of the indexes is to sort the data, which means that they order the data. However, since generally no ordering exists in dimensions greater than 1 without a transformation of the data to one dimension, the role of the sort process is one of differentiating between the data, and what is usually done is to sort the spatial objects with respect to the space that they occupy. The resulting ordering should be implicit rather than explicit so that the data need not be re-sorted (i.e., the index need not be rebuilt) when the queries change. The indexes are said to order the space and the characteristics of such indexes are explored further.
Center for the Computational Analysis of Social and Organizational Systems
2010
Increasingly, the data available to network analysts includes not only relationships between entities but the observation of entity attributes and relations in geographic space. Integrating this information with existing dynamic network analysis techniques demands new models and new tools. This paper introduces extensions to the ORA dynamic network analysis platform intended to meet this need. In particular, we present new visualization techniques for displaying the network topology of large, noisy datasets embedded in geographic space. We present these extensions and demonstrate them on some sample datasets.
4/5/2010 Critique & Summary of Scalable Network Distance Browsing in Spatial Databases
In today’s world, mapping services such as Google Maps and Microsoft MapPoint have gained significant popularity for finding the shortest route while travelling from place to place. These services are available online and hence their responses must be real-time, i.e., very quick. A study has revealed that people do not change the parent query but may change the subqueries within it. This means people