Results 1-10 of 33
Efficient Distribution-free Learning of Probabilistic Concepts
 Journal of Computer and System Sciences
, 1993
Abstract

Cited by 212 (8 self)
In this paper we investigate a new formal model of machine learning in which the concept (Boolean function) to be learned may exhibit uncertain or probabilistic behavior; thus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic concepts (or p-concepts) may arise in situations such as weather prediction, where the measured variables and their accuracy are insufficient to determine the outcome with certainty. We adopt from the Valiant model of learning [27] the demands that learning algorithms be efficient and general in the sense that they perform well for a wide class of p-concepts and for any distribution over the domain. In addition to giving many efficient algorithms for learning natural classes of p-concepts, we study and develop in detail an underlying theory of learning p-concepts.

1 Introduction

Consider the following scenarios: A meteorologist is attempting to predict tomorrow's weather as accurately as pos...
Some PAC-Bayesian Theorems
 Machine Learning
, 1998
Abstract

Cited by 141 (4 self)
This paper gives PAC guarantees for "Bayesian" algorithms: algorithms that optimize risk minimization expressions involving a prior probability and a likelihood for the training data. PAC-Bayesian algorithms are motivated by a desire to provide an informative prior encoding information about the expected experimental setting but still having PAC performance guarantees over all IID settings. The PAC-Bayesian theorems given here apply to an arbitrary prior measure on an arbitrary concept space. These theorems provide an alternative to the use of VC dimension in proving PAC bounds for parameterized concepts.

1 INTRODUCTION

Much of modern learning theory can be divided into two seemingly separate areas: Bayesian inference and PAC learning. Both areas study learning algorithms which take as input training data and produce as output a concept or model which can then be tested on test data. In both areas learning algorithms are associated with correctness theorems. PAC correct...
On the Complexity of Teaching
 Journal of Computer and System Sciences
, 1992
Abstract

Cited by 115 (2 self)
While most theoretical work in machine learning has focused on the complexity of learning, recently there has been increasing interest in formally studying the complexity of teaching. In this paper we study the complexity of teaching by considering a variant of the online learning model in which a helpful teacher selects the instances. We measure the complexity of teaching a concept from a given concept class by a combinatorial measure we call the teaching dimension. Informally, the teaching dimension of a concept class is the minimum number of instances a teacher must reveal to uniquely identify any target concept chosen from the class. A preliminary version of this paper appeared in the Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 303-314, August 1991. Most of this research was carried out while both authors were at the MIT Laboratory for Computer Science with support provided by ARO Grant DAAL03-86-K-0171, DARPA Contract N00014-89-J-1988, NSF Gr...
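For a small finite concept class, the teaching dimension defined in this abstract can be computed by brute force. The sketch below (helper names are ours, not the paper's) enumerates candidate teaching sets of increasing size and reports the worst case over target concepts:

```python
from itertools import combinations

def teaching_dimension(concepts, instances):
    """Brute-force teaching dimension of a finite concept class.

    concepts: list of boolean functions over `instances`.
    A subset S "teaches" a target if no other concept in the class
    agrees with the target on every instance in S.  The teaching
    dimension is the worst case, over targets, of the smallest such S.
    """
    def labels(c, subset):
        return tuple(c(x) for x in subset)

    worst = 0
    for target in concepts:
        for k in range(len(instances) + 1):
            # does some size-k subset uniquely identify the target?
            if any(all(labels(c, s) != labels(target, s)
                       for c in concepts if c is not target)
                   for s in combinations(instances, k)):
                worst = max(worst, k)
                break
    return worst

# Singletons over three points: one positive example always suffices.
singletons = [lambda x, i=i: x == i for i in range(3)]
print(teaching_dimension(singletons, [0, 1, 2]))  # 1
# Adding the empty concept forces a teacher of it to reveal
# all three negative labels.
print(teaching_dimension(singletons + [lambda x: False], [0, 1, 2]))  # 3
```

The example illustrates the point made in the abstract: the measure is driven by the hardest-to-pin-down concept in the class, not the average one.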
PAC-Bayesian stochastic model selection
 Machine Learning
, 2003
Abstract

Cited by 75 (2 self)
PAC-Bayesian learning methods combine the informative priors of Bayesian methods with distribution-free PAC guarantees. Stochastic model selection predicts a class label by stochastically sampling a classifier according to a "posterior distribution" on classifiers. This paper gives a PAC-Bayesian performance guarantee for stochastic model selection that is superior to analogous guarantees for deterministic model selection. The guarantee is stated in terms of the training error of the stochastic classifier and the KL-divergence of the posterior from the prior. It is shown that the posterior optimizing the performance guarantee is a Gibbs distribution. Simpler posterior distributions are also derived that have nearly optimal performance guarantees.
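The shape of the guarantee described in this abstract, training error plus a penalty involving KL(posterior || prior), can be illustrated for a finite hypothesis class. The bound form used below is a commonly quoted relaxed PAC-Bayes bound; the exact constants vary across the papers in this listing, so treat it as an illustration rather than any one paper's theorem:

```python
import math

def kl_divergence(q, p):
    """KL(Q || P) for discrete distributions over the same finite
    hypothesis class, given as lists of probabilities."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def pac_bayes_bound(train_err, q, p, m, delta):
    """One common relaxed PAC-Bayes bound (illustrative constants):
    err(Q) <= train_err(Q) + sqrt((KL(Q||P) + ln(2*sqrt(m)/delta)) / (2m)),
    holding with probability at least 1 - delta over m IID samples."""
    penalty = kl_divergence(q, p) + math.log(2 * math.sqrt(m) / delta)
    return train_err + math.sqrt(penalty / (2 * m))

# A posterior concentrated away from the prior pays a larger penalty.
prior = [0.25] * 4
flat = pac_bayes_bound(0.10, [0.25] * 4, prior, m=1000, delta=0.05)
peaked = pac_bayes_bound(0.10, [0.97, 0.01, 0.01, 0.01], prior,
                         m=1000, delta=0.05)
print(flat < peaked)  # True: KL(Q||P) = 0 when Q equals the prior
```

This trade-off between fitting the data and staying close to the prior is exactly what makes the bound-optimizing posterior a Gibbs distribution, as the abstract notes.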
Problems and results in extremal combinatorics - II
 DISCRETE MATHEMATICS
, 2003
Abstract

Cited by 22 (0 self)
Extremal Combinatorics is one of the central areas in Discrete Mathematics. It deals with problems that are often motivated by questions arising in other areas, including Theoretical Computer Science, Geometry and Game Theory. This paper contains a collection of problems and results in the area, including solutions or partial solutions to open problems suggested by various researchers. The topics considered here include questions in Extremal Graph Theory, Polyhedral Combinatorics and Probabilistic Combinatorics. This is not meant to be a comprehensive survey of the area; it is merely a collection of various extremal problems, which are hopefully interesting. The choice of the problems is inevitably biased, and as the title of the paper suggests, it is a sequel to a previous paper [2] of the same flavour, and hopefully a predecessor of another related future paper. Each section of this paper is essentially self-contained, and can be read separately.
From Finding Maximum Feasible Subsystems of Linear Systems to Feedforward Neural Network Design
, 1994
A Framework for Structural Risk Minimisation
, 1996
Abstract

Cited by 21 (6 self)
The paper introduces a framework for studying structural risk minimisation. The model views structural risk minimisation in a PAC context. It then considers the more general case when the hierarchy of classes is chosen in response to the data. This theoretically explains the impressive performance of the maximal margin hyperplane algorithm of Vapnik. It may also provide a general technique for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.
Using Approximate Models as Source of Contextual Information for . . .
 In Proc. of the ICCV'95 Workshop on Context-Based Vision
, 1995
Abstract

Cited by 19 (4 self)
Most computer vision algorithms are based on strong assumptions about the objects and the actions depicted in the image. To safely apply those algorithms in real-world image sequences, it is necessary to verify that their assumptions are satisfied in the context of the visual process. We propose the use of approximate world models, coarse descriptions of objects and actions in the world, as the appropriate representation for contextual information. The approximate world models are employed to verify the applicability of a vision routine in a given situation. Under these conditions, a task module can reliably use the outputs of the contextually-safe vision routines, without having to refer to an accurate reconstruction of the world. We are using approximate world models in a project to control cameras in a TV studio. In our Intelligent Studio, automatic cameras respond to verbal requests for shots from the TV director. Contextual information is obtained from the script of the TV sho...
The Complexity of Theory Revision
 In Proceedings of IJCAI-95
, 1998
Abstract

Cited by 18 (5 self)
A knowledge-based system uses its database (a.k.a. its "theory") to produce answers to the queries it receives. Unfortunately, these answers may be incorrect if the underlying theory is faulty. Standard "theory revision" systems use a given set of "labeled queries" (each a query paired with its correct answer) to transform the given theory, by adding and/or deleting rules and/or antecedents, into a related theory that is as accurate as possible. After formally defining the theory revision task, this paper provides both sample and computational complexity bounds for this process. It first specifies the number of labeled queries necessary to identify a revised theory whose error is close to minimal with high probability. It then considers the computational complexity of finding this best theory, and proves that, unless P = NP, no polynomial-time algorithm can identify this near-optimal revision, even given the exact distribution of queries, except in certain simple situations. It ...
Sequential PAC Learning
 In Proceedings of COLT-95
, 1995
Abstract

Cited by 14 (5 self)
We consider the use of "online" stopping rules to reduce the number of training examples needed to PAC-learn. Rather than collect a large training sample that can be proved sufficient to eliminate all bad hypotheses a priori, the idea is instead to observe training examples one-at-a-time and decide "online" whether to stop and return a hypothesis, or continue training. The primary benefit of this approach is that we can detect when a hypothesizer has actually "converged," and halt training before the standard fixed-sample-size bounds. This paper presents a series of such sequential learning procedures for: distribution-free PAC-learning, "mistake-bounded to PAC" conversion, and distribution-specific PAC-learning, respectively. We analyze the worst-case expected training sample size of these procedures, and show that this is often smaller than existing fixed-sample-size bounds, while providing the exact same worst-case PAC guarantees. We also provide lower bounds that show these r...
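The flavor of an online stopping rule can be sketched with the classic "mistake-bounded to PAC" conversion: draw examples one at a time, and stop once the current hypothesis survives a long enough run without a mistake. This is an illustration in the spirit of the abstract, not the paper's exact procedures; the survival thresholds k_i below come from a standard union-bound argument:

```python
import math

def sequential_pac(predict, update, examples, eps, delta):
    """Illustrative sequential stopping rule: stop when the current
    hypothesis survives k_i consecutive draws without a mistake.
    k_i grows with the number i of hypothesis changes, so a union
    bound keeps the total failure probability below delta."""
    i, streak = 1, 0
    for x, y in examples:
        k_i = math.ceil(math.log(i * (i + 1) / delta) / eps)
        if predict(x) == y:
            streak += 1
            if streak >= k_i:      # confident: error <= eps w.h.p.
                return predict
        else:
            update(x, y)           # a mistake drives the learner forward
            i += 1
            streak = 0
    return predict                 # stream exhausted; return best so far

# Toy learner: version space over threshold concepts on {0,...,4};
# predict with the first surviving concept, eliminate on each mistake.
concepts = [lambda x, t=t: x >= t for t in range(5)]
target = concepts[2]
version = list(concepts)
predict = lambda x: version[0](x)
update = lambda x, y: version.__setitem__(slice(None),
                                          [c for c in version if c(x) == y])
stream = ((x % 5, target(x % 5)) for x in range(100000))
h = sequential_pac(predict, update, stream, eps=0.1, delta=0.05)
print(all(h(x) == target(x) for x in range(5)))  # True
```

Note how the rule halts as soon as the surviving hypothesis stops making mistakes, rather than after a fixed a-priori sample size, which is exactly the saving the abstract claims.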