Results 1 - 10 of 603,156
Distance Metric Learning, With Application To Clustering With Side-Information
 ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 15
, 2003
"... Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the us ..."
Cited by 799 (14 self)
An extensive empirical study of feature selection metrics for text classification
 J. of Machine Learning Research
, 2003
"... Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison ..."
Cited by 483 (15 self)
choice for all goals except precision, for which Information Gain yielded the best result most often. This analysis also revealed, for example, that Information Gain and Chi-Squared have correlated failures, and so they work poorly together. When choosing optimal pairs of metrics for each of the four
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
, 1997
"... A new access method, called M-tree, is proposed to organize and search large data sets from a generic "metric space", i.e. where object proximity is only defined by a distance function satisfying the positivity, symmetry, and triangle inequality postulates. We detail algorithms for insertion o ..."
Cited by 652 (38 self)
A PERFORMANCE EVALUATION OF LOCAL DESCRIPTORS
, 2005
"... In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their perfo ..."
Cited by 1752 (53 self)
their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations
Solving multiclass learning problems via error-correcting output codes
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 1995
Cited by 730 (8 self)
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that, like the other methods, the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
Functional discovery via a compendium of expression profiles. Cell 102:109
, 2000
"... have been devised to survey gene functions en masse either computationally (Marcotte et al., 1999) or experimentally; among these, highly parallel assays of ..."
Cited by 537 (8 self)
Evaluating collaborative filtering recommender systems
 ACM TRANSACTIONS ON INFORMATION SYSTEMS
, 2004
Convex Analysis
, 1970
"... In this book we aim to present, in a unified framework, a broad spectrum of mathematical theory that has grown in connection with the study of problems of optimization, equilibrium, control, and stability of linear and nonlinear systems. The title Variational Analysis reflects this breadth. For a lo ..."
Cited by 5350 (67 self)
BLEU: a Method for Automatic Evaluation of Machine Translation
, 2002
"... Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that can not be reused. ..."
Cited by 2107 (4 self)
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
 In Proceedings of the 20th International Joint Conference on Artificial Intelligence
, 2007
"... Computing semantic relatedness of natural language texts requires access to vast amounts of commonsense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedi ..."
Cited by 546 (9 self)