Document Versioning Using Feature Space Distances
Abstract. The automated analysis of documents is an important task given the rapid increase in the availability of digital texts. In an earlier publication, we presented a framework in which the edit distances between documents were used to reconstruct the version history of a set of documents. Howeve
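A minimal sketch of the edit-distance idea mentioned in this abstract, assuming word-level Levenshtein distance and a minimum spanning tree as the linking rule. The excerpt does not say how the version history is actually reconstructed, so `version_tree` and the sample `docs` are hypothetical illustrations, not the authors' method.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute or match
        prev = curr
    return prev[-1]


def version_tree(docs):
    """Assumption: link documents by a minimum spanning tree (Prim's algorithm)
    over pairwise word-level edit distances."""
    names = list(docs)
    tokens = {n: docs[n].split() for n in names}
    dist = {frozenset(p): edit_distance(tokens[p[0]], tokens[p[1]])
            for p in combinations(names, 2)}
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge from the tree built so far to a document outside it.
        u, v = min(((u, v) for u in in_tree for v in names if v not in in_tree),
                   key=lambda e: dist[frozenset(e)])
        edges.append((u, v, dist[frozenset((u, v))]))
        in_tree.add(v)
    return edges


# Hypothetical sample documents.
docs = {
    "v1": "the quick brown fox",
    "v2": "the quick brown fox jumps",
    "v3": "the quick red fox jumps high",
}
tree = version_tree(docs)
```

Each edge in `tree` joins a document to its closest already-linked neighbor, so documents that differ by few edits end up adjacent. A spanning tree carries no direction, so deciding which document is the earlier version would need extra information.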
Adaptive feature-space conformal transformation for imbalanced-data learning
In Proc. ICML, 2003
When the training instances of the target class are heavily outnumbered by non-target training instances, SVMs can be ineffective in determining the class boundary. To remedy this problem, we propose an adaptive conformal transformation (ACT) algorithm. ACT considers feature-space distance and the
Cited by 29 (9 self)
Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets
1995
A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in k-d space, using k feature-extraction functions, provided by a domain expert [Jag91]. Thus, we can subsequently use highly fine-tuned spatial access methods (SAMs), to answer several
Cited by 502 (22 self)
Features of similarity.
Psychological Review, 1977
Similarity plays a fundamental role in theories of knowledge and behavior. It serves as an organizing principle by which individuals classify objects, form concepts, and make generalizations. Indeed, the concept of similarity is ubiquitous in psychological theory. It underlies the accounts of stimu
Cited by 1455 (2 self)
These models represent objects as points in some coordinate space such that the observed dissimilarities between objects correspond to the metric distances between the respective points. Practically all analyses of proximity data have been metric in nature, although some (e.g., hierarchical clustering) yield
Approximate distance oracles
2004
Let $G = (V, E)$ be an undirected weighted graph with $|V| = n$ and $|E| = m$. Let $k \ge 1$ be an integer. We show that $G = (V, E)$ can be preprocessed in $O(kmn^{1/k})$ expected time, constructing a data structure of size $O(kn^{1+1/k})$, such that any subsequent distance query can be answered, approximately, in $O(k)$ time. The approximate distance returned is of stretch at most $2k - 1$, i.e., the quotient obtained by dividing the estimated distance by the actual distance lies between $1$ and $2k - 1$. A 1963 girth conjecture of Erdős implies that $\Omega(n^{1+1/k})$ space is needed in the worst case for any real stretch
Cited by 273 (9 self)
Blobworld: A System for Region-Based Image Indexing and Retrieval
In Third International Conference on Visual Information Systems, 1999
Blobworld is a system for image retrieval based on finding coherent image regions which roughly correspond to objects. Each image is automatically segmented into regions ("blobs") with associated color and texture descriptors. Querying is based on the attributes of one or two regions of interest, rather than a description of the entire image. In order to make large-scale retrieval feasible, we index the blob descriptions using a tree. Because indexing in the high-dimensional feature space is computationally prohibitive, we use a lower-rank approximation to the high-dimensional distance
Cited by 375 (4 self)
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
Machine Learning, 1993
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature
Cited by 309 (3 self)
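The distance tables described above can be sketched as follows. This is an illustration under stated assumptions: the excerpt does not define the tables, so the sketch uses a value-difference style rule (two symbolic values are close when they co-occur with the same class labels at similar rates), and it omits the instance weights. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_distance_tables(examples, labels):
    """Return one table per feature mapping (value1, value2) -> distance.

    Assumed rule: d(v1, v2) = sum over classes c of |P(c | v1) - P(c | v2)|,
    with probabilities estimated from the training examples.
    """
    classes = sorted(set(labels))
    tables = []
    for f in range(len(examples[0])):
        counts = defaultdict(Counter)          # feature value -> class counts
        for x, y in zip(examples, labels):
            counts[x[f]][y] += 1
        table = {}
        for v1, c1 in counts.items():
            n1 = sum(c1.values())
            for v2, c2 in counts.items():
                n2 = sum(c2.values())
                table[(v1, v2)] = sum(abs(c1[c] / n1 - c2[c] / n2)
                                      for c in classes)
        tables.append(table)
    return tables


def distance(tables, a, b):
    """Real-valued distance between two symbolic instances (sum over features)."""
    return sum(t[(x, y)] for t, x, y in zip(tables, a, b))


# Hypothetical toy data: colour predicts the class, shape does not.
examples = [("red", "round"), ("red", "square"),
            ("blue", "round"), ("blue", "square")]
labels = ["pos", "pos", "neg", "neg"]
tables = build_distance_tables(examples, labels)
d = distance(tables, ("red", "round"), ("blue", "square"))
```

In this toy data the colour values are maximally far apart while the shape values are at distance zero, so a nearest neighbor search using `distance` would effectively ignore shape.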
Greedy spectral embedding
Spectral dimensionality reduction methods and spectral clustering methods require computation of the principal eigenvectors of an n × n matrix where n is the number of examples. Following up on previously proposed techniques to speed up kernel methods by focusing on a subset of m examples, we study a greedy selection procedure for this subset, based on the feature-space distance between a candidate example and the span of the previously chosen ones. In the case of kernel PCA or spectral clustering this reduces computation to O(m² n). For the same computational complexity, we can also compute
Cited by 20 (2 self)
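One way to sketch a greedy selection driven by the feature-space distance to the span of the chosen examples is a pivoted incomplete Cholesky factorization of the kernel matrix. This is an assumption for illustration only: the excerpt does not give the selection rule, so the sketch simply picks the farthest candidate each round, and the RBF kernel, `gamma`, the tolerance, and the sample points are all hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; an assumed choice, any PSD kernel would do."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def greedy_subset(points, m, kernel=rbf_kernel, tol=1e-12):
    """Pick up to m indices, each time taking the point whose feature-space
    image is farthest from the span of the images already chosen."""
    n = len(points)
    # residual[i] = squared feature-space distance from point i to the span.
    residual = [kernel(p, p) for p in points]
    factors = [[] for _ in range(n)]   # rows of the partial Cholesky factor
    chosen = []
    for _ in range(m):
        j = max(range(n), key=lambda i: residual[i])
        if residual[j] <= tol:         # every point already lies in the span
            break
        pivot = math.sqrt(residual[j])
        chosen.append(j)
        # New factor column: the component of each point along the new direction.
        col = []
        for i in range(n):
            proj = sum(a * b for a, b in zip(factors[i], factors[j]))
            col.append((kernel(points[i], points[j]) - proj) / pivot)
        for i in range(n):
            factors[i].append(col[i])
            residual[i] = max(residual[i] - col[i] ** 2, 0.0)
    return chosen


# Hypothetical data: two tight clusters on a line.
points = [(0.0,), (0.1,), (5.0,), (5.1,)]
chosen = greedy_subset(points, 2)
```

With two tight clusters, the second pick comes from the cluster not yet represented, since its points are still far from the span. Each round costs n kernel evaluations plus n inner products of length at most m, so m rounds take on the order of m² n arithmetic operations.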
Region Covariance: A Fast Descriptor for Detection And Classification
In Proc. 9th European Conf. on Computer Vision, 2006
We describe a new region descriptor and apply it to two problems, object detection and texture classification. The covariance of d features, e.g., the three-dimensional color vector, the norm of first and second derivatives of intensity with respect to x and y, etc., characterizes a region of in
Cited by 278 (14 self)
Covariance matrices do not lie on Euclidean space; therefore we use a distance metric involving generalized eigenvalues, which also follows from the Lie group structure of positive definite matrices. Feature matching is a simple nearest neighbor search under the distance metric and performed extremely
Feature Selection via Concave Minimization and Support Vector Machines
Machine Learning Proceedings of the Fifteenth International Conference (ICML ’98), 1998
Computational comparison is made between two feature selection approaches for finding a separating plane that discriminates between two point sets in an n-dimensional feature space that utilizes as few of the n features (dimensions) as possible. In the concave minimization approach [19, 5] a separat
Cited by 263 (23 self)
