Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces, 1993
Cited by 225 (4 self)
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is, under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.
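The vantage point idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `VPNode`, `build_vp_tree`, and `nearest` are hypothetical, the vantage point is chosen at random rather than by any selection heuristic, and the median split is found by sorting for simplicity, so this version does not aim to match the stated construction bound. It relies only on the distance function obeying the triangle inequality.

```python
import math
import random
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple


@dataclass
class VPNode:
    """One node: a vantage point and the median distance to the rest."""
    point: Any
    radius: float = 0.0
    inside: Optional["VPNode"] = None    # points with distance <= radius
    outside: Optional["VPNode"] = None   # points with distance >= radius


def build_vp_tree(points: Sequence[Any],
                  dist: Callable[[Any, Any], float],
                  rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split the points around a randomly chosen vantage point."""
    rng = rng or random.Random(0)
    rest = list(points)
    if not rest:
        return None
    vp = rest.pop(rng.randrange(len(rest)))
    if not rest:
        return VPNode(vp)
    # Sort the remaining points by distance to the vantage point
    # (assumption: sorting is used here only because it is simple).
    ranked = sorted(((dist(vp, p), p) for p in rest), key=lambda t: t[0])
    mid = len(ranked) // 2
    return VPNode(
        point=vp,
        radius=ranked[mid][0],
        inside=build_vp_tree([p for _, p in ranked[:mid]], dist, rng),
        outside=build_vp_tree([p for _, p in ranked[mid:]], dist, rng),
    )


def nearest(root: Optional[VPNode], query: Any,
            dist: Callable[[Any, Any], float]) -> Tuple[Any, float]:
    """Return (nearest point, distance), pruning via the triangle inequality."""
    best_point, best_dist = None, float("inf")

    def visit(node: Optional[VPNode]) -> None:
        nonlocal best_point, best_dist
        if node is None:
            return
        d = dist(query, node.point)
        if d < best_dist:
            best_point, best_dist = node.point, d
        if d < node.radius:
            visit(node.inside)
            # The outside shell can only help if the search ball reaches it.
            if d + best_dist >= node.radius:
                visit(node.outside)
        else:
            visit(node.outside)
            # The inside ball can only help if the search ball reaches it.
            if d - best_dist <= node.radius:
                visit(node.inside)

    visit(root)
    return best_point, best_dist
```

For example, `nearest(build_vp_tree(points, math.dist), query, math.dist)` searches a list of coordinate tuples under Euclidean distance; any other function satisfying the metric axioms could be passed in its place.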
Triphone Analysis: A Combined Method for the Correction of Orthographical And Typographical Errors, 1988
Cited by 7 (0 self)
Most existing systems for the correction of word level errors are oriented toward either typographical or orthographical errors. Triphone analysis is a new correction strategy which combines phonemic transcription with trigram analysis. It corrects both kinds of errors (also in combination) and is superior for orthographical errors.
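As a rough illustration of the trigram half of this strategy, the sketch below ranks lexicon entries by trigram overlap with a misspelled word. It is an assumption-laden toy, not the method of the paper: `transcribe` is a hypothetical placeholder that merely lowercases its input where a real system would produce a phonemic transcription, and the Dice coefficient is one arbitrary choice of overlap score.

```python
def transcribe(word: str) -> str:
    """Hypothetical stand-in for phonemic transcription (just lowercases)."""
    return word.lower()


def trigrams(text: str) -> set:
    """Set of 3-character windows, with '#' padding to mark word boundaries."""
    padded = f"##{text}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Dice coefficient over the trigram sets of the two transcriptions."""
    ta, tb = trigrams(transcribe(a)), trigrams(transcribe(b))
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def suggest(word: str, lexicon: list) -> str:
    """Return the lexicon entry whose trigrams overlap most with the word."""
    return max(lexicon, key=lambda entry: similarity(word, entry))
```

Swapping a real grapheme-to-phoneme converter into `transcribe` would let words that sound alike but are spelled differently share trigrams, which is the kind of combination the abstract describes.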
Normalized Forms for Two Common Metrics, 1991
NEC Research Institute, Report 91-082-9027-1, 1991, Revision 7/7/2002. http://www.pnylab.com/pny
Cited by 5 (0 self)
In this paper we demonstrate that two common metrics, symmetric set difference, and Euclidian distance, have normalized forms which are nevertheless metrics. The first of these, $|A \triangle B| / |A \cup B|$, is easily established and generalizes to measure spaces. The second applies to vectors in $\mathbb{R}^n$ and is given by $\|X - Y\| / (\|X\| + \|Y\|)$. That this is a metric is more difficult to demonstrate and is true for Euclidian distance (the L2 norm) but for no other integral Minkowski metric. In addition to providing bounded distances when no a priori data bound exists, these forms are qualitatively different from their unnormalized counterparts, and are therefore also distinguished from simpler range companded constructions. Mixed forms are also defined which combine absolute and relative behavior, while remaining metrics. The result is a family of forms which resemble commonly used dissimilarity statistics but obey the triangle inequality. Keywords: Metric Space, Distance Function, Similarity Funct...
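The two normalized forms named in this abstract are short enough to write out directly. The sketch below is only an illustration: the function names are hypothetical, and returning 0 when both arguments are empty sets or zero vectors is an assumed convention for the undefined 0/0 case, not something stated in the abstract.

```python
import math
from typing import Sequence, Set


def normalized_set_distance(a: Set, b: Set) -> float:
    """Size of the symmetric difference divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty sets are at distance 0
    return len(a ^ b) / len(union)


def euclidean_norm(x: Sequence[float]) -> float:
    """Ordinary L2 norm of a vector."""
    return math.sqrt(sum(v * v for v in x))


def normalized_euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance divided by the sum of the two vector norms."""
    denom = euclidean_norm(x) + euclidean_norm(y)
    if denom == 0.0:
        return 0.0  # assumed convention: two zero vectors are at distance 0
    return math.dist(x, y) / denom
```

Both functions return values in [0, 1]: disjoint non-empty sets and vectors pointing in opposite directions sit at distance 1, which shows the bounded behavior the abstract refers to.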

