Results 1 
6 of
6
Efficient noisetolerant learning from statistical queries
 JOURNAL OF THE ACM
, 1998
"... In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of “robust” learning algorithms in the most general way, we formalize a new but related model of learning from stat ..."
Abstract

Cited by 357 (5 self)
 Add to MetaCart
(Show Context)
In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of “robust” learning algorithms in the most general way, we formalize a new but related model of learning from statistical queries. Intuitively, in this model, a learning algorithm is forbidden to examine individual examples of the unknown target function, but is given access to an oracle providing estimates of probabilities over the sample space of random examples. One of our main results shows that any class of functions learnable from statistical queries is in fact learnable with classification noise in Valiant’s model, with a noise rate approaching the informationtheoretic barrier of 1/2. We then demonstrate the generality of the statistical query model, showing that practically every class learnable in Valiant’s model and its variants can also be learned in the new model (and thus can be learned in the presence of noise). A notable exception to this statement is the class of parity functions, which we prove is not learnable from statistical queries, and for which no noisetolerant algorithm is known.
Efficient Distributionfree Learning of Probabilistic Concepts
 Journal of Computer and System Sciences
, 1993
"... In this paper we investigate a new formal model of machine learning in which the concept (boolean function) to be learned may exhibit uncertain or probabilistic behaviorthus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic c ..."
Abstract

Cited by 212 (8 self)
 Add to MetaCart
(Show Context)
In this paper we investigate a new formal model of machine learning in which the concept (boolean function) to be learned may exhibit uncertain or probabilistic behaviorthus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic concepts (or pconcepts) may arise in situations such as weather prediction, where the measured variables and their accuracy are insufficient to determine the outcome with certainty. We adopt from the Valiant model of learning [27] the demands that learning algorithms be efficient and general in the sense that they perform well for a wide class of pconcepts and for any distribution over the domain. In addition to giving many efficient algorithms for learning natural classes of pconcepts, we study and develop in detail an underlying theory of learning pconcepts. 1 Introduction Consider the following scenarios: A meteorologist is attempting to predict tomorrow's weather as accurately as pos...
General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting
, 1993
"... We derive general bounds on the complexity oflearning in the Statistical Query model and in the ..."
Abstract

Cited by 46 (5 self)
 Add to MetaCart
We derive general bounds on the complexity oflearning in the Statistical Query model and in the
Improved NoiseTolerant Learning and Generalized Statistical Queries
, 1994
"... The statistical query learning model can be viewed as a tool for creating (or demonstrating the existence of) noisetolerant learning algorithms in the PAC model. The complexity of a statistical query algorithm, in conjunction with the complexity of simulating SQ algorithms in the PAC model with n ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
(Show Context)
The statistical query learning model can be viewed as a tool for creating (or demonstrating the existence of) noisetolerant learning algorithms in the PAC model. The complexity of a statistical query algorithm, in conjunction with the complexity of simulating SQ algorithms in the PAC model with noise, determine the complexity of the noisetolerant PAC algorithms produced. Although roughly optimal upper bounds have been shown for the complexity of statistical query learning, the corresponding noisetolerant PAC algorithms are not optimal due to inefficient simulations. In this paper we provide both improved simulations and a new variant of the statistical query model in order to overcome these inefficiencies. We improve the time complexity of the classification noise simulation of statistical query algorithms. Our new simulation has a roughly optimal dependence on the noise rate. We also derive a simpler proof that statistical queries can be simulated in the presence of classification noise. This proof makes fewer assumptions on the queries themselves and therefore allows one to simulate more general types of queries. We also define a new variant of the statistical query model based on relative error, and we show that this variant is more natural and strictly more powerful than the standard additive error model. We demonstrate efficient PAC simulations for algorithms in this new model and give general upper bounds on both learning with relative error statistical queries and PAC simulation. We show that any statistical query algorithm can be simulated in the PAC model with malicious errors in such a way that the resultant PAC algorithm has a roughly optimal tolerable malicious error rate and sample complexity.
Efficient Learning from Faulty Data
, 1995
"... Learning systems are often provided with imperfect or noisy data. Therefore, researchers have formalized various models of learning with noisy data, and have attempted to delineate the boundaries of learnability in these models. In this thesis, we describe a general framework for the construction of ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
Learning systems are often provided with imperfect or noisy data. Therefore, researchers have formalized various models of learning with noisy data, and have attempted to delineate the boundaries of learnability in these models. In this thesis, we describe a general framework for the construction of efficient learning algorithms in noise tolerant variants of Valiant's PAC learning model. By applying this framework, we also obtain many new results for specific learning problems in various settings with faulty data. The central tool used in this thesis is the specification of learning algorithms in Kearns' Statistical Query (SQ) learning model, in which statistics, as opposed to labelled examples, are requested by the learner. These SQ learning algorithms are then converted into PAC algorithms which tolerate various types of faulty data. We develop this framework in three major parts: 1. We design automatic compilations of SQ algorithms into PAC algorithms which tolerate various types of data errors. These results include improvements to Kearns' classification noise compilation, and the first such compilations for malicious errors, attribute noise and new classes of &quot;hybrid &quot; noise composed of multiple noise types. 2. We prove nearly tight bounds on the required complexity of SQ algorithms. The upper bounds are based on a constructive technique which allows one to achieve this complexity even when it is not initially achieved by a given SQ algorithm. 3. We define and employ an improved model of SQ learning which yields noise tolerant PAC algorithms that are more efficient than those derived from standard SQ algorithms. Together, these results provide a unified and intuitive framework for noise tolerant learning that allows the algorithm designer to achieve efficient, and often optimal, fault tolerant learning.