Results 1 - 10 of 1,065
Learning kernel-based halfspaces with the zero-one loss
 In COLT, 2010
"... We describe and analyze a new algorithm for agnostically learning kernel-based halfspaces with respect to the zero-one loss function. Unlike most previous formulations which rely on surrogate convex loss functions (e.g. hinge-loss in SVM and log-loss in logistic regression), we provide ..."
Cited by 9 (2 self)
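The abstract above contrasts the zero-one loss with the convex surrogates most formulations optimize instead. A minimal sketch of that distinction (illustrative only, not the paper's algorithm), with hypothetical raw classifier scores and labels in {-1, +1}:

```python
import numpy as np

# Zero-one loss: 1 if the sign of the score disagrees with the label, else 0.
def zero_one_loss(scores, labels):
    return np.mean(np.sign(scores) != labels)

# Hinge loss (the convex surrogate used in SVMs): max(0, 1 - y * score).
def hinge_loss(scores, labels):
    return np.mean(np.maximum(0.0, 1.0 - labels * scores))

labels = np.array([1, 1, -1, -1])
scores = np.array([0.5, -2.0, -0.3, -4.0])  # hypothetical raw scores

print(zero_one_loss(scores, labels))  # one of four signs is wrong -> 0.25
print(hinge_loss(scores, labels))     # the surrogate also penalizes small margins
```

Note how the hinge loss charges the correctly classified but low-margin point (score 0.5), while the zero-one loss counts only outright sign errors; this gap is exactly why the surrogate is easier to optimize but not what one ultimately wants to minimize.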
On the optimality of the simple Bayesian classifier under zero-one loss
 Machine Learning, 1997
"... The simple Bayesian classifier is known to be optimal when attributes are independent given the class, but the question of whether other sufficient conditions for its optimality exist has so far not been explored. Empirical results showing that it performs surprisingly well in many domains containing clear attribute dependences suggest that the answer to this question may be positive. This article shows that, although the Bayesian classifier’s probability estimates are only optimal under quadratic loss if the independence assumption holds, the classifier itself can be optimal under zero-one ..."
Cited by 818 (27 self)
Minimax Rules Under Zero-One Loss for a Restricted Location Parameter
 Journal of Statistical Planning and Inference, 1997
"... In this paper we study the existence, structure and computation of minimax and near-minimax rules under zero-one loss for a restricted location parameter of an absolutely continuous distribution. These rules are the basis of a novel approach to pose estimation ..."
Minimax Rules Under Zero-One Loss for a Restricted Location Parameter
 Journal of Statistical Planning and Inference, 1997
"... In this paper, we obtain minimax and near-minimax nonrandomized decision rules under zero-one loss for a restricted location parameter of an absolutely continuous distribution. Two types of rules are addressed: monotone and nonmonotone. A complete-class theorem ..."
Cited by 12 (6 self)
Bias plus variance decomposition for zero-one loss functions
 In Machine Learning: Proceedings of the Thirteenth International Conference, 1996
"... We present a bias-variance decomposition of expected misclassification rate, the most commonly used loss function in supervised classification learning. The bias-variance decomposition for quadratic loss functions is well known and serves as an important tool for analyzing learning algorithms, yet no decomposition was offered for the more commonly used zero-one (misclassification) loss functions until the recent work of Kong & Dietterich (1995) and Breiman (1996). Their decomposition suffers from some major shortcomings though (e.g., potentially negative variance), which our decomposition avoids. We show ..."
Cited by 212 (5 self)
Learning halfspaces with the zero-one loss: Time-accuracy tradeoffs
 In NIPS, 2012
"... Given α, ϵ, we study the time complexity required to improperly learn a halfspace with misclassification error rate of at most (1 + α)L*_γ + ϵ, where L*_γ is the optimal γ-margin error rate. For α = 1/γ, polynomial time and sample complexity is achievable using the hinge-loss. For α = 0 ..."
Cited by 6 (2 self)
A Robust Ensemble Learning Using Zero-One Loss Function
 2006
"... Classifiers are used for pattern recognition in various fields including data mining. Boosting is an ensemble learning method to boost (enhance) the accuracy of a single classifier. We propose a new, robust boosting method by using a zero-one step function as a loss function. In deriving ..."
Bias-Variance Decomposition of Zero-One Loss in Average-Case Model
 Proc. AMAI, 2002
"... this paper, we also consider that Fig. 1(a) and (b) are regarded as the same case, and Fig. 1(c) and (d) are regarded as the same case. In other words, we consider the gross variance V_gross = V_u + V_b. For preparation of discussion, we show Theorem 5 ..."
Cited by 2 (0 self)
Regret analysis for performance metrics in multi-label classification: The case of Hamming and subset zero-one loss
 In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
"... In multi-label classification (MLC), each instance is associated with a subset of labels instead of a single class, as in conventional classification, and this generalization enables the definition of a multitude of loss functions. Indeed, a large number of losses has already been proposed ..."
Cited by 9 (1 self)
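The two losses named in this entry's title are easy to state concretely. A small illustrative sketch (toy data, not from the paper): Hamming loss averages errors over individual labels, while subset zero-one loss counts an instance as wrong unless its entire predicted label set matches exactly.

```python
import numpy as np

# Hamming loss: fraction of individual label predictions that are wrong.
def hamming_loss(y_true, y_pred):
    return np.mean(y_true != y_pred)

# Subset zero-one loss: fraction of instances whose FULL label set is wrong.
def subset_zero_one_loss(y_true, y_pred):
    return np.mean(np.any(y_true != y_pred, axis=1))

# 3 instances, 4 labels each (1 = label present, 0 = absent)
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 1, 0],   # exact match
                   [0, 1, 1, 0],   # one label wrong
                   [1, 1, 0, 1]])  # exact match

print(hamming_loss(y_true, y_pred))          # 1 of 12 labels wrong -> ~0.083
print(subset_zero_one_loss(y_true, y_pred))  # 1 of 3 label sets wrong -> ~0.333
```

A single flipped label costs only 1/12 under Hamming loss but a full 1/3 under subset zero-one loss, which is why the two metrics can favor quite different predictors.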
A unified bias-variance decomposition for zero-one and squared loss
 In AAAI’00
"... The bias-variance decomposition is a very useful and widely used tool for understanding machine-learning algorithms. It was originally developed for squared loss. In recent years, several authors have proposed decompositions for zero-one loss, but each has significant shortcomings. In particular ..."
Cited by 57 (0 self)
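Several of the entries above concern bias-variance decompositions for zero-one loss. One common ingredient in such decompositions is a "main" prediction taken as the majority vote across models, with bias measured on that vote and variance as disagreement with it. The sketch below illustrates only this generic idea on made-up predictions; it is not the specific decomposition proposed in any of the cited papers.

```python
import numpy as np

# Hypothetical predictions of 5 models on 4 test points (labels in {0, 1}),
# e.g. from classifiers trained on different bootstrap samples.
preds = np.array([[1, 0, 1, 1],
                  [1, 0, 0, 1],
                  [1, 1, 0, 1],
                  [1, 0, 0, 1],
                  [1, 0, 0, 1]])
y_true = np.array([1, 0, 1, 1])

# "Main" prediction per test point: the majority vote across models.
main = (preds.mean(axis=0) > 0.5).astype(int)

# Bias-like term: zero-one loss of the main prediction against the truth.
bias = np.mean(main != y_true)

# Variance-like term: average disagreement of each model with the main prediction.
variance = np.mean(preds != main)

print(bias, variance)  # 0.25 0.1
```

Here the majority vote is wrong on one of four points (bias 0.25), and individual models disagree with the vote on 2 of 20 predictions (variance 0.1); how such terms combine into the expected zero-one loss is exactly what the decompositions in these papers formalize.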