A Simple Algorithm for Nearest Neighbor Search in High Dimensions
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
- Cited by 111 (1 self)
Abstract. The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance ε. We present a simple and practical algorithm to efficiently search for the nearest neighbor within Euclidean distance ε. The use of projection search combined with a novel data structure dramatically improves performance in high dimensions. A complexity analysis is presented which helps to automatically determine ε in structured problems. A comprehensive set of benchmarks clearly shows the superiority of the proposed algorithm for a variety of structured and unstructured search problems. Object recognition is demonstrated as an example application. The simplicity of the algorithm makes it possible to construct an inexpensive hardware search engine which can be 100 times faster than its software equivalent. A C++ implementation of our algorithm is available upon request to search@cs.columbia.edu/CAVE/.
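A rough illustration of the projection idea described in this abstract: a point can only lie within Euclidean distance eps of the query if each of its coordinates lies within eps of the corresponding query coordinate. The sketch below keeps one sorted list per coordinate, intersects the per-coordinate slabs, and then checks the surviving candidates exactly. The class name `EpsilonNearestNeighbor` and its methods are hypothetical, and the plain set intersection stands in for the paper's own data structure, which the abstract does not describe.

```python
from bisect import bisect_left, bisect_right
from math import dist


class EpsilonNearestNeighbor:
    """Hypothetical helper: nearest neighbor within Euclidean distance eps."""

    def __init__(self, points):
        self.points = [tuple(p) for p in points]
        self.dim = len(self.points[0]) if self.points else 0
        # For each coordinate d, store point indices sorted by that coordinate,
        # together with the sorted coordinate values for binary search.
        self.order = []
        self.keys = []
        for d in range(self.dim):
            idx = sorted(range(len(self.points)), key=lambda i: self.points[i][d])
            self.order.append(idx)
            self.keys.append([self.points[i][d] for i in idx])

    def query(self, q, eps):
        """Return (index, distance) of the closest point within eps, or None."""
        if not self.points:
            return None
        candidates = None
        for d in range(self.dim):
            # Slab of points whose d-th coordinate is within eps of q[d].
            lo = bisect_left(self.keys[d], q[d] - eps)
            hi = bisect_right(self.keys[d], q[d] + eps)
            slab = set(self.order[d][lo:hi])
            candidates = slab if candidates is None else candidates & slab
            if not candidates:
                return None
        # The slabs bound a hypercube, so an exact distance check is still needed.
        best = None
        for i in candidates:
            distance = dist(q, self.points[i])
            if distance <= eps and (best is None or distance < best[1]):
                best = (i, distance)
        return best


index = EpsilonNearestNeighbor([(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)])
hit = index.query((0.9, 1.2), 0.5)    # the point at index 1 is inside the radius
miss = index.query((3.0, 3.0), 0.5)   # no point is inside the radius
```

Returning `None` when nothing lies within eps mirrors the abstract's premise that a neighbor farther away than the user-specified distance is of no interest.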
Experience with distributed programming in Orca
- in Proc. IEEE CS International Conference on Computer Languages, 1990
- Cited by 39 (9 self)
Orca is a language for programming parallel applications on distributed computing systems. Although processors in such systems communicate only through message passing and not through shared memory, Orca lets programmers define data types and create instances (objects) of these types, which may be shared among processes. All operations on shared objects are executed atomically. Orca’s shared objects are implemented by replicating them in the local memories of the processors. Read operations use the local copies of the object, without doing any interprocess communication. Write operations update all copies using an efficient reliable broadcast protocol. In this paper, we briefly describe the language and its implementation and then report on our experiences in using Orca for three parallel applications: the Traveling Salesman Problem (TSP), the All-pairs Shortest Paths problem (ASP), and Successive Overrelaxation (SOR). These applications have different needs for shared data: TSP greatly benefits from the support for shared data; ASP benefits from the use of broadcast communication, even though it is hidden in the implementation; SOR merely requires point-to-point communication, but still can be implemented in the language by simulating message passing.
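The replication scheme in this abstract (reads served from a local copy, writes applied to every copy) can be imitated in a few lines. The following is a single-threaded Python simulation, not Orca code and not the paper's broadcast protocol: the class `ReplicatedMinimum`, its methods, and the message counter are all hypothetical names chosen for illustration, loosely modeled on a shared best-so-far bound such as a TSP search might keep.

```python
class ReplicatedMinimum:
    """Hypothetical shared object holding a minimum, one replica per process."""

    def __init__(self, n_processes, initial):
        self.replicas = [initial] * n_processes
        self.messages_sent = 0  # simulated interprocess messages

    def read(self, pid):
        # Read operation: uses only the caller's local copy, no communication.
        return self.replicas[pid]

    def lower_to(self, pid, value):
        # Write operation: applied to every replica in one step, which stands in
        # for an atomic operation delivered by a reliable broadcast.
        if value >= self.replicas[pid]:
            return False
        for other in range(len(self.replicas)):
            self.replicas[other] = value
        self.messages_sent += len(self.replicas) - 1
        return True


bound = ReplicatedMinimum(n_processes=4, initial=100)
bound.lower_to(pid=2, value=42)                 # one broadcast to 3 other replicas
values = [bound.read(pid) for pid in range(4)]  # local reads, no messages
```

Because the simulation runs in one thread, atomicity is trivial here; in a real system the ordering guarantees of the broadcast are what keep the replicas consistent.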
The computational complexity of decentralized discrete-event control problems
- 1993
- Cited by 37 (4 self)
Computational complexity results are obtained for decentralized discrete-event system problems. These results generalize the earlier work of Tsitsiklis, who showed that for centralized supervisory control problems (under partial observation), solution existence is decidable in polynomial time for a special type of problem but becomes computationally intractable for the general class. As in the case of centralized control, there is no polynomial-time algorithm for producing supervisor solutions.
A Davidson program for finding a few selected extreme eigenpairs of a large, sparse, real, symmetric matrix
- Cited by 24 (9 self)
A program is presented for determining a few selected eigenvalues and their eigenvectors on either end of the spectrum of a large, real, symmetric matrix. Based on the Davidson method, which is extensively used in quantum chemistry/physics, the current implementation improves the power of the original algorithm by adopting several extensions. The matrix-vector multiplication routine that it requires is to be provided by the user. Different matrix formats and optimizations are thus feasible. Examples of an efficient sparse matrix representation and a matrix-vector multiplication are given. Some comparisons with the Lanczos method demonstrate the efficiency of the program.
PROGRAM SUMMARY
Title of program: DVDSON
Catalogue Number: To be assigned
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland (see application form in this issue).
Licensing provisions: none
Computer: Sun-3/80, Sun SPARCstation IPC, Intel iPSC/860.
Operating system: SunOS Rele...
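To make the idea of a user-supplied matrix-vector routine concrete, here is a minimal Python sketch for a real symmetric sparse matrix. It is not the DVDSON interface or the representation used in the paper: the function name `symmetric_matvec` and the choice of storing only the upper triangle as (row, column, value) triples are assumptions made for illustration.

```python
def symmetric_matvec(n, upper_entries, x):
    """Return y = A x for a symmetric n-by-n matrix A.

    upper_entries holds (i, j, value) triples with i <= j, i.e. only the
    diagonal and upper triangle; the lower triangle is implied by symmetry.
    """
    y = [0.0] * n
    for i, j, value in upper_entries:
        y[i] += value * x[j]
        if i != j:
            # Mirror the off-diagonal entry A[j][i] = A[i][j].
            y[j] += value * x[i]
    return y


# A = [[2, 1, 0],
#      [1, 3, 0],
#      [0, 0, 4]]
entries = [(0, 0, 2.0), (0, 1, 1.0), (1, 1, 3.0), (2, 2, 4.0)]
product = symmetric_matvec(3, entries, [1.0, 2.0, 3.0])
```

Storing half of the matrix roughly halves the memory for off-diagonal entries, and an iterative eigensolver only ever needs this product, never the matrix itself.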
Automated Parallelization of Discrete State-space Generation
- Journal of Parallel and Distributed Computing, 1997
- Cited by 21 (2 self)
We consider the problem of generating a large state-space in a distributed fashion. Unlike previously proposed solutions that partition the set of reachable states according to a hashing function provided by the user, we explore heuristic methods that completely automate the process. The first step is an initial random walk through the state space to initialize a search tree, duplicated in each processor. Then, the reachability graph is built in a distributed way, using the search tree to assign each newly found state to classes assigned to the available processors. Furthermore, we explore two remapping criteria that attempt to balance memory usage or future workload, respectively. We show how the cost of computing the global snapshot required for remapping will scale up for system sizes in the foreseeable future. An extensive set of results is presented to support our conclusions that remapping is extremely beneficial.
Equalization Concepts for EDGE
- IEEE Transactions on Wireless Communications, 1999
- Cited by 19 (9 self)
In this paper, an equalization concept for the novel radio access scheme EDGE (Enhanced Data Rates for GSM Evolution) is proposed, by which high performance can be obtained at moderate computational complexity. Because high-level modulation is employed in EDGE, optimum equalization as usually performed in GSM (Global System for Mobile Communications) receivers is too complex, and suboptimum schemes have to be considered. It is shown that delayed decision-feedback sequence estimation (DDFSE) and reduced-state sequence estimation (RSSE) are promising candidates. For various channel profiles, approximations for the bit error rate of these suboptimum equalization techniques are given and compared with simulation results for DDFSE. It turns out that a discrete-time prefilter creating a minimum-phase overall impulse response is indispensable for a favourable tradeoff between performance and complexity. Additionally, the influence of channel estimation and of the receiver input filter is investigated, and the reasons for performance degradation compared to the additive white Gaussian noise channel are indicated. Finally, the overall system performance attainable with the proposed equalization concept is determined for transmission with channel coding.
On Crossing Minimization Problem
- 1998
- Cited by 15 (4 self)
In this paper we consider a problem related to global routing post-optimization: the crossing minimization problem (CMP). Given a global routing representation, the CMP is to minimize redundant crossings between every pair of nets. In particular, there are two kinds of CMP: constrained CMP (CCMP) and unconstrained CMP (UCMP). These problems have been studied previously in [Groe89], where an O(m²n) algorithm was proposed for CCMP, and in [MS95], where an O(mn² + λ²) algorithm was proposed for UCMP, where m is the total number of modules, n is the number of nets, and λ is the number of crossings defined by an initial global routing topology. We present a simpler and faster O(mn) algorithm for CCMP and an O(n(m + λ)) time algorithm for UCMP. Both algorithms improve over the time bounds of the previously proposed algorithms. The novel part of our algorithm is that it uses the plane embedding information of globally routed nets in the routing area to construct a graph-based framewo...
On Indexing Sliding Windows over Online Data Streams
- In EDBT, 2004
- Cited by 14 (4 self)
Abstract. We consider indexing sliding windows in main memory over on-line data streams. Our proposed data structures and query semantics are based on a division of the sliding window into sub-windows. By classifying windowed operators according to their method of execution, we motivate the need for two types of windowed indices: those which provide a list of attribute values and their counts for answering set-valued queries, and those which provide direct access to tuples for answering attribute-valued queries. We propose and evaluate indices for both of these cases and show that our techniques are more efficient than executing windowed queries without an index.
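One way to picture the first kind of index mentioned in this abstract (attribute values with their counts, maintained over a window split into sub-windows) is the Python sketch below. It is only an assumption-laden illustration, not the paper's data structure: the class `SubWindowCountIndex`, the count-based sub-window size, and the policy of expiring a whole sub-window at once are all hypothetical choices.

```python
from collections import Counter, deque


class SubWindowCountIndex:
    """Hypothetical index of value counts over a window of fixed-size sub-windows."""

    def __init__(self, n_subwindows, subwindow_size):
        self.n_subwindows = n_subwindows
        self.subwindow_size = subwindow_size
        self.subwindows = deque([Counter()])  # oldest on the left, newest on the right
        self.filled = 0                       # tuples in the newest sub-window
        self.totals = Counter()               # counts over the whole window

    def insert(self, value):
        if self.filled == self.subwindow_size:
            # The newest sub-window is full: open a fresh one and, if the window
            # now holds too many sub-windows, expire the oldest in one step.
            self.subwindows.append(Counter())
            self.filled = 0
            if len(self.subwindows) > self.n_subwindows:
                expired = self.subwindows.popleft()
                self.totals.subtract(expired)
                self.totals += Counter()  # in-place add drops zero counts
        self.subwindows[-1][value] += 1
        self.totals[value] += 1
        self.filled += 1

    def count(self, value):
        return self.totals[value]

    def distinct_values(self):
        return set(self.totals)


idx = SubWindowCountIndex(n_subwindows=2, subwindow_size=2)
for v in ["a", "b", "a", "c", "d"]:
    idx.insert(v)
# Inserting "d" opened a third sub-window, so the oldest one ("a", "b") expired.
```

Expiring a whole sub-window at a time keeps maintenance cheap, at the cost of a window whose length varies between (n_subwindows - 1) * subwindow_size + 1 and n_subwindows * subwindow_size tuples.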
Using a Large Linguistic Ontology for Internet-based Retrieval of Object-Oriented Components
- In Proceedings of 1997 Conference on Software Engineering and Knowledge Engineering. Madrid, Knowledge Systems Institute, 1997
- Cited by 14 (1 self)
this paper adopts a language of limited expressiveness, privileging simplicity of use as the most important requirement. We adopt a very simple graph structure for representing both queries and component data but, unlike most current systems, we do not assume that the user is familiar with the vocabulary used for component encoding, relying instead on a large linguistic ontology like Sensus [Swartout et al. 1996] to perform the match between queries and data. In the encoding phase (which we assume to be a manual process supported by an interactive environment), a software analyst describes a component by a simple graph where nodes and arcs are labelled with English words. Since binary relations are not usually denoted by nouns, a special semantics is adopted for this graph, which is called Lexical Semantic Graph. English nouns appearing in the graph are recognized by a lexical interface based on WordNet [Miller 1995], which asks the analyst to choose among the possibly different senses associated with each word. The graph of words is therefore translated into a graph of senses, each one corresponding to a node in the Sensus ontology. The query graph is built by the user in a similar way, but the words chosen and the corresponding senses can of course differ, as can the structure of the graph. Conceptually, the search process implements a graph matching algorithm, returning the identifiers of all components whose description is subsumed by the query. In the following section, we describe the main design choices of a project on software retrieval currently going on at Corinto, a research consortium established to study and promote object-oriented technology. In section 3 we present the encoding and retrieval process in some detail, with the help of...
Cache Investment: Integrating Query Optimization and Distributed Data Placement
- ACM TODS, 2000
- Cited by 12 (2 self)
Emerging distributed query processing systems support... In this paper, we propose Cache Investment mechanisms and policies and analyze their performance. The analysis uses results from both an implementation on the SHORE storage manager and a detailed simulation model. Our results show that Cache Investment can significantly improve the overall performance of a system and demonstrate the tradeoffs among various alternative policies.

