Results 1–10 of 16
Efficient Distribution-free Learning of Probabilistic Concepts
 Journal of Computer and System Sciences
, 1993
Abstract

Cited by 197 (8 self)
In this paper we investigate a new formal model of machine learning in which the concept (boolean function) to be learned may exhibit uncertain or probabilistic behavior; thus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic concepts (or p-concepts) may arise in situations such as weather prediction, where the measured variables and their accuracy are insufficient to determine the outcome with certainty. We adopt from the Valiant model of learning [27] the demands that learning algorithms be efficient and general in the sense that they perform well for a wide class of p-concepts and for any distribution over the domain. In addition to giving many efficient algorithms for learning natural classes of p-concepts, we study and develop in detail an underlying theory of learning p-concepts.
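The p-concept model described above can be illustrated with a minimal sketch: a p-concept maps each input to the probability of a positive label, so the same input may receive both labels across draws. The "noisy threshold" target and the two-bin estimator below are hypothetical illustrations, not constructions from the paper.

```python
import random

# A p-concept maps each input x to the probability that x is labeled positive.
# Hypothetical example: a noisy threshold on [0, 1] (illustrative only).
def p_concept(x):
    return 0.9 if x >= 0.5 else 0.1

def draw_example(rng):
    """Draw x uniformly from the domain, then label it 1 with probability p_concept(x)."""
    x = rng.random()
    y = 1 if rng.random() < p_concept(x) else 0
    return x, y

# The same x-region yields both labels; a learner can estimate the underlying
# probabilities from empirical label frequencies, here with two crude bins.
rng = random.Random(0)
sample = [draw_example(rng) for _ in range(10000)]
for lo, hi in [(0.0, 0.5), (0.5, 1.0)]:
    ys = [y for x, y in sample if lo <= x < hi]
    print(f"empirical P(y=1 | {lo} <= x < {hi}) ~ {sum(ys) / len(ys):.2f}")
```

The empirical frequencies concentrate near the true conditional probabilities 0.1 and 0.9, which is the sense in which a p-concept, unlike a deterministic concept, is only recoverable as a probability.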
Kernel matching pursuit
 Machine Learning
, 2002
Abstract

Cited by 62 (0 self)
Matching Pursuit algorithms learn a function that is a weighted sum of basis functions, by sequentially appending functions to an initially empty basis, to approximate a target function in the least-squares sense. We show how matching pursuit can be extended to use non-squared error loss functions, and how it can be used to build kernel-based solutions to machine-learning problems, while keeping control of the sparsity of the solution. We also derive MDL-motivated generalization bounds for this type of algorithm and compare them to related SVM (Support Vector Machine) bounds. Finally, links to boosting algorithms and RBF training procedures, as well as an extensive experimental comparison with SVMs for classification, are given, showing comparable results with typically sparser models.
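The greedy append-one-basis-function loop at the heart of matching pursuit can be sketched as follows. This is a minimal generic least-squares version, not the paper's kernel variant; the polynomial dictionary in the usage example is a hypothetical illustration.

```python
def matching_pursuit(X, y, dictionary, n_iter=10):
    """Greedy least-squares matching pursuit: repeatedly append the dictionary
    function most correlated with the current residual, with its optimal weight."""
    # Evaluate every basis function on the sample once.
    G = [[g(x) for x in X] for g in dictionary]
    residual = list(y)
    basis, weights = [], []
    for _ in range(n_iter):
        best_j, best_w, best_gain = None, 0.0, 0.0
        for j, col in enumerate(G):
            gg = sum(v * v for v in col)
            if gg == 0:
                continue
            rg = sum(r * v for r, v in zip(residual, col))
            gain = rg * rg / gg  # squared-error reduction if we append col
            if gain > best_gain:
                best_j, best_w, best_gain = j, rg / gg, gain
        if best_j is None:
            break
        basis.append(best_j)
        weights.append(best_w)
        residual = [r - best_w * v for r, v in zip(residual, G[best_j])]
    return basis, weights, residual

# Usage: approximate y = 2 + 0.5x with a small polynomial dictionary.
X = [0.0, 0.25, 0.5, 0.75, 1.0]
dictionary = [lambda x: 1.0, lambda x: x, lambda x: x * x]
y = [2.0 + 0.5 * x for x in X]
basis, weights, residual = matching_pursuit(X, y, dictionary, n_iter=5)
print(basis, [round(w, 3) for w in weights])
```

Because the weights of earlier basis functions are never refit, the pure greedy loop trades some accuracy per term for simplicity; the squared error is nonetheless non-increasing at every step.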
Sample compression, learnability, and the Vapnik-Chervonenkis dimension
 MACHINE LEARNING
, 1995
Abstract

Cited by 61 (3 self)
Within the framework of PAC-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is PAC-learnable. Previous work has shown that a class is PAC-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class i...
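The compression/reconstruction round trip described above can be made concrete with the textbook size-1 scheme for one-dimensional thresholds (a minimal instance chosen for illustration, not the paper's general construction).

```python
def compress(sample):
    """Size-1 compression for threshold concepts h_t(x) = 1 iff x >= t:
    keep only the smallest positive example; if none, keep the largest example."""
    positives = [x for x, y in sample if y == 1]
    if positives:
        return [(min(positives), 1)]
    return [(max(x for x, y in sample), 0)]

def reconstruct(compression_set):
    """Rebuild a hypothesis on the whole domain from the single stored example."""
    (x0, y0), = compression_set
    if y0 == 1:
        return lambda x: 1 if x >= x0 else 0  # threshold at the stored positive
    return lambda x: 0                         # all-negative hypothesis

# Any sample consistent with some threshold is labeled correctly after the round trip.
sample = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # consistent with t in (0.4, 0.6]
h = reconstruct(compress(sample))
print([h(x) == y for x, y in sample])
```

Consistency with the original sample holds precisely because the sample itself was consistent with some threshold: every negative example must then lie below the smallest positive one, so the reconstructed threshold classifies all stored points correctly.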
A compression approach to support vector model selection
 Journal of Machine Learning Research
, 2004
Abstract

Cited by 21 (5 self)
This report is available in PDF format via anonymous ftp at
Data-driven online-to-batch conversions
 In Advances in Neural Information Processing Systems 18 (Proceedings of NIPS)
, 2005
Abstract

Cited by 9 (0 self)
Online learning algorithms are typically fast, memory efficient, and simple to implement. However, many common learning problems fit more naturally in the batch learning setting. The power of online learning algorithms can be exploited in batch settings by using online-to-batch conversion techniques, which build a new batch algorithm from an existing online algorithm. We first give a unified overview of three existing online-to-batch conversion techniques which do not use training data in the conversion process. We then build upon these data-independent conversions to derive and analyze data-driven conversions. Our conversions find hypotheses with a small risk by explicitly minimizing data-dependent generalization bounds. We experimentally demonstrate the usefulness of our approach and in particular show that the data-driven conversions consistently outperform the data-independent conversions.
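The conversion idea can be sketched as follows: run an online algorithm once over the data, record the hypothesis after each round, and then select one of those hypotheses using held-out data. This is a simplified data-driven selection by raw validation risk; the paper instead minimizes data-dependent generalization bounds. The Perceptron and the toy linearly separable distribution are illustrative assumptions.

```python
import random

def perceptron_run(stream):
    """Run the Perceptron online over `stream`, recording the hypothesis after each round."""
    w = [0.0, 0.0]
    hypotheses = [list(w)]
    for x, y in stream:  # labels y in {-1, +1}
        if y * (w[0] * x[0] + w[1] * x[1]) <= 0:
            w = [w[0] + y * x[0], w[1] + y * x[1]]
        hypotheses.append(list(w))
    return hypotheses

def empirical_risk(w, data):
    return sum(1 for x, y in data if y * (w[0] * x[0] + w[1] * x[1]) <= 0) / len(data)

# Toy distribution: points in [-1, 1]^2 labeled by the sign of x0 + x1.
rng = random.Random(1)
def draw(n):
    out = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        out.append((x, 1 if x[0] + x[1] > 0 else -1))
    return out

train, valid = draw(200), draw(200)
hypotheses = perceptron_run(train)
# Data-driven conversion (simplified): pick the recorded hypothesis with the
# smallest empirical risk on held-out data, rather than e.g. the last one.
best = min(hypotheses, key=lambda w: empirical_risk(w, valid))
print(round(empirical_risk(best, valid), 3))
```

A data-independent conversion would return, say, the final or a randomly chosen hypothesis from the run; selecting by held-out risk is never worse on the validation set and illustrates why using data in the conversion step can help.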
Unlabeled compression schemes for maximum classes
 Journal of Machine Learning Research
, 2006
Abstract

Cited by 7 (0 self)
We give a compression scheme for any maximum class of VC dimension d that compresses any sample consistent with a concept in the class to at most d unlabeled points from the domain of the sample.
Random subclass bounds
 In Proceedings of the 16th Annual Conference on Computational Learning Theory (COLT)
, 2003
Abstract

Cited by 5 (1 self)
It has recently been shown that sharp generalization bounds can be obtained when the function class from which the algorithm chooses its hypotheses is “small” in the sense that the Rademacher averages of this function class are small [8, 9]. Seemingly based on different arguments, generalization bounds were obtained in the compression scheme [7], luckiness [13], and algorithmic luckiness [6] frameworks, in which the “size” of the function class is not specified a priori. We show that the bounds obtained in all these frameworks follow from the same general principle, namely that coordinate projections of this function subclass evaluated on random samples are “small” with high probability.
Shifting: One-Inclusion Mistake Bounds and Sample Compression
 EECS DEPARTMENT, UNIVERSITY OF CALIFORNIA, BERKELEY
, 2007
"... ..."
A geometric approach to sample compression
Abstract

Cited by 3 (0 self)
The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane
Faithful Representations and Moments of Satisfaction: Probabilistic Methods in Learning and Logic
, 1998
Abstract

Cited by 2 (0 self)
To my wife, Ma'ayan, and my daughter, Shira. Acknowledgments: Special thanks are due to Prof. Naftali Tishby for his help and guidance in carrying out this study, for the many fascinating discussions we had, and for the immense body of knowledge that I have absorbed from him during my studies.