Results 1 - 10 of 15,626
The use of the area under the ROC curve in the evaluation of machine learning algorithms
PATTERN RECOGNITION, 1997
"... In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multilayer Perceptron, kNe ..."
Cited by 685 (3 self)
sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; decision threshold independent; and it is invariant to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall
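A minimal sketch of the AUC properties listed in this snippet, assuming the common rank-based reading of AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The function name `auc` and the sample scores are hypothetical and are not taken from the paper.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for illustration only.
pos = [0.9, 0.8, 0.3]
neg = [0.4, 0.2]

print(auc(pos, neg))       # no decision threshold is chosen anywhere
print(auc(pos, neg * 10))  # ten times as many negatives, same score distribution
```

Both calls give the same value, because repeating the negative scores changes the class proportions but not the ranking of positives against negatives.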
On the optimality of the simple Bayesian classifier under zero-one loss
MACHINE LEARNING, 1997
"... The simple Bayesian classifier is known to be optimal when attributes are independent given the class, but the question of whether other sufficient conditions for its optimality exist has so far not been explored. Empirical results showing that it performs surprisingly well in many domains containin ..."
Cited by 818 (27 self)
Divergence measures based on the Shannon entropy
IEEE Transactions on Information Theory, 1991
"... A new class of information-theoretic divergence measures based on the Shannon entropy is introduced. Unlike the well-known Kullback divergences, the new measures do not require the condition of absolute continuity to be satisfied by the probability distributions involved. More importantly, ..."
Cited by 666 (0 self)
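A minimal sketch of the absolute-continuity point, assuming the measure in question is the one now usually called the Jensen-Shannon divergence (the snippet itself does not name it). The helper names `kl` and `js` and the two example distributions are hypothetical.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence in bits; infinite if q is 0 where p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def js(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical distributions whose supports only partly overlap.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

print(kl(p, q))  # infinite: p puts mass where q has none
print(js(p, q))  # finite: the midpoint m is positive wherever p or q is
```

The midpoint distribution covers the support of both inputs, which is why the symmetrised measure stays finite where the Kullback divergence does not.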
No Free Lunch Theorems for Optimization
1997
"... A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performan ..."
Cited by 961 (10 self)
Estimating the Support of a High-Dimensional Distribution
1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value between 0 and 1. We propo ..."
Cited by 783 (29 self)
Raptor codes
IEEE Transactions on Information Theory, 2006
"... LT-Codes are a new class of codes introduced in [1] for the purpose of scalable and fault-tolerant distribution of data over computer networks. In this paper we introduce Raptor Codes, an extension of LT-Codes with linear time encoding and decoding. We will exhibit a class of universal Raptor codes: ..."
Cited by 577 (7 self)
for a given integer k, and any real ε > 0, Raptor codes in this class produce a potentially infinite stream of symbols such that any subset of symbols of size k(1 + ε) is sufficient to recover the original k symbols with high probability. Each output symbol is generated using O(log(1/ε)) operations
Maximum Likelihood Phylogenetic Estimation from DNA Sequences with Variable Rates over Sites: Approximate Methods
J. Mol. Evol., 1994
"... Two approximate methods are proposed for maximum likelihood phylogenetic estimation, which allow variable rates of substitution across nucleotide sites. Three data sets with quite different characteristics were analyzed to examine empirically the performance of these methods. The first, called ..."
Cited by 557 (29 self)
the "discrete gamma model," uses several categories of rates to approximate the gamma distribution, with equal probability for each category. The mean of each category is used to represent all the rates falling in the category. The performance of this method is found to be quite good
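A rough sketch of the equal-probability category idea described in this snippet. It approximates the category means by sorting random gamma draws and averaging equal-sized groups, which is a stand-in for illustration and not the paper's procedure. The mean-1 parametrisation (shape `alpha`, scale `1/alpha`), the function name, and the parameter values are all assumptions.

```python
import random


def discrete_gamma_rates(alpha, k, n_samples=40000, seed=0):
    """Approximate mean rate of each of k equal-probability gamma categories."""
    rng = random.Random(seed)
    # Assumed parametrisation: shape alpha, scale 1/alpha, so the mean rate is 1.
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples))
    size = n_samples // k  # draws beyond k * size are ignored
    # Each consecutive block of sorted draws stands for one category;
    # its sample mean represents every rate falling in that category.
    return [sum(draws[i * size:(i + 1) * size]) / size for i in range(k)]


rates = discrete_gamma_rates(alpha=0.5, k=4)
print(rates)                    # one representative rate per category, slowest first
print(sum(rates) / len(rates))  # close to 1 under the assumed parametrisation
```

Each category then carries probability 1/k in a likelihood calculation, with its mean standing in for all rates in that range.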
Proof verification and hardness of approximation problems
IN PROC. 33RD ANN. IEEE SYMP. ON FOUND. OF COMP. SCI., 1992
"... We show that every language in NP has a probabilistic verifier that checks membership proofs for it using logarithmic number of random bits and by examining a constant number of bits in the proof. If a string is in the language, then there exists a proof such that the verifier accepts with probabilit ..."
Cited by 797 (39 self)
with probability 1 (i.e., for every choice of its random string). For strings not in the language, the verifier rejects every provided “proof” with probability at least 1/2. Our result builds upon and improves a recent result of Arora and Safra [6] whose verifiers examine a nonconstant number of bits
Consistency of spectral clustering
2004
"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."
Cited by 572 (15 self)
Distributional Clustering Of English Words
In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993
"... We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the si ..."
Cited by 629 (27 self)
as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest