Results 1 - 10 of 58
Scale-sensitive Dimensions, Uniform Convergence, and Learnability
1997
Cited by 175 (1 self)
Learnability in Valiant's PAC learning model has been shown to be strongly related to the existence of uniform laws of large numbers. These laws define a distribution-free convergence property of means to expectations uniformly over classes of random variables. Classes of real-valued functions enjoying such a property are also known as uniform Glivenko-Cantelli classes. In this paper we prove, through a generalization of Sauer's lemma that may be interesting in its own right, a new characterization of uniform Glivenko-Cantelli classes. Our characterization yields Dudley, Giné, and Zinn's previous characterization as a corollary. Furthermore, it is the first based on a simple combinatorial quantity generalizing the Vapnik-Chervonenkis dimension. We apply this result to obtain the weakest combinatorial condition known to imply PAC learnability in the statistical regression (or "agnostic") framework. Furthermore, we show a characterization of learnability in the probabilistic concept model, solving an open problem posed by Kearns and Schapire. These results show that the accuracy parameter plays a crucial role in determining the effective complexity of the learner's hypothesis class.
A few notes on Statistical Learning Theory
2003
Cited by 41 (10 self)
this article is on the theoretical side and not on the applicative one; hence, we shall not present examples which may be interesting from the practical point of view but have little theoretical significance. This survey is far from being complete and it focuses on problems the author finds interesting (an opinion which is not necessarily shared by the majority of the learning community). Relevant books which present a more evenly balanced approach are, for example, [1, 4, 35, 36]. The starting point of our discussion is the formulation of the learning problem. Consider a class $G$, consisting of real-valued functions defined on a space $\Omega$, and assume that each $g \in G$ maps $\Omega$ into $[0, 1]$. Let $T$ be an unknown function, $T : \Omega \to [0, 1]$, and set $\mu$ to be an unknown probability measure on $\Omega$.
Clustering for Edge-Cost Minimization
Cited by 32 (4 self)
Leonard J. Schulman, College of Computing, Georgia Institute of Technology, Atlanta GA 30332-0280. We address the problem of partitioning a set of $n$ points into clusters, so as to minimize the sum, over all intracluster pairs of points, of the cost associated with each pair. We obtain a randomized approximation algorithm for this problem, for the cost functions $\ell_2^2$, $\ell_1$ and $\ell_2$, as well as any cost function isometrically embeddable in $\ell_2^2$.
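The objective described in this abstract can be illustrated with a short sketch. It only evaluates the intracluster edge cost of a given partition under the squared Euclidean cost; it is not the paper's approximation algorithm, and the function names `sq_euclidean` and `edge_cost` and the sample points are assumptions made for illustration.

```python
from itertools import combinations

def sq_euclidean(p, q):
    """Squared Euclidean distance between two points (hypothetical helper)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def edge_cost(clusters, cost=sq_euclidean):
    """Sum of `cost` over all unordered pairs of points that share a cluster."""
    return sum(
        cost(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )

# Two well-separated groups of points (illustrative data only).
left = [(0.0, 0.0), (1.0, 0.0)]
right = [(10.0, 0.0), (10.0, 2.0)]

good = edge_cost([left, right])
bad = edge_cost([[left[0], right[0]], [left[1], right[1]]])
print(good, bad)  # the partition that respects the groups is far cheaper
```

Any other pairwise cost, for example an $\ell_1$ distance, can be passed through the `cost` parameter.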
Introduction to Statistical Learning Theory
In O. Bousquet, U. v. Luxburg, and G. Rätsch (Editors), 2004
On the Hardness of Being Truthful
In 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2008
Cited by 29 (3 self)
The central problem in computational mechanism design is the tension between incentive compatibility and computational efficiency. We establish the first significant approximability gap between algorithms that are both truthful and computationally efficient, and algorithms that only achieve one of these two desiderata. This is shown in the context of a novel mechanism design problem which we call the COMBINATORIAL PUBLIC PROJECT PROBLEM (CPPP). CPPP is an abstraction of many common mechanism design situations, ranging from elections of kibbutz committees to network design. Our result is actually made up of two complementary results: one in the communication-complexity model and one in the computational-complexity model. Both these hardness results heavily rely on a combinatorial characterization of truthful algorithms for our problem. Our computational-complexity result is one of the first impossibility results connecting mechanism design to complexity theory; its novel proof technique involves an application of the Sauer-Shelah Lemma and may be of wider applicability, both within and without mechanism design.
Projections of bodies and hereditary properties of hypergraphs
Bull. London Math. Soc., 1995
Cited by 23 (5 self)
We prove that for every $n$-dimensional body $K$, there is a rectangular parallelepiped $B$ of the same volume as $K$, such that the projection of $B$ onto any coordinate subspace is at most as large as that of the corresponding projection of $K$. We apply this theorem to projections of finite set systems and to hereditary properties. In particular, we show that every hereditary property of uniform hypergraphs has a limiting density. 1. Projections of bodies. Let $K$ be a body in $\mathbb{R}^n$, and let $(v_1, \ldots, v_n)$ be the standard basis for $\mathbb{R}^n$. Denote the volume of $K$ by $|K|$. Furthermore, given a subset $A \subseteq [n] = \{1, 2, \ldots, n\}$ with $d$ elements, denote by $K_A$ the orthogonal projection of $K$ onto the subspace spanned by $\{v_i : i \in A\}$, and by $|K_A|$ its ($d$-dimensional) volume. Thus $K_{[n]} = K$. By the term box we shall mean a rectangular parallelepiped whose sides are parallel to the coordinate axes. For the purposes of this paper, a body can be taken to be a compact subset of $\mathbb{R}^n$ which is the closure of its interior. It would be effortless to rewrite our results and
On the complexity of approximating the VC dimension
J. Comput. Syst. Sci., 2001
Cited by 18 (2 self)
We study the complexity of approximating the VC dimension of a collection of sets, when the sets are encoded succinctly by a small circuit. We show that this problem is
• $\Sigma_3^p$-hard to approximate to within a factor $2 - \epsilon$ for any $\epsilon > 0$,
• approximable in AM to within a factor 2, and
• AM-hard to approximate to within a factor $N^{\epsilon}$ for some constant $\epsilon > 0$.
To obtain the $\Sigma_3^p$-hardness result we solve a randomness extraction problem using list-decodable binary codes; for the positive result we utilize the Sauer-Shelah(-Perles) Lemma. The exact value of $\epsilon$ in the AM-hardness result depends on the degree achievable by explicit disperser constructions.
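For readers unfamiliar with the quantities named in this abstract, the following is a minimal sketch of the VC dimension of an explicitly listed set family, computed by brute force, together with the Sauer-Shelah bound $\sum_{i=0}^{d} \binom{n}{i}$ on the number of distinct sets. It assumes the family is given as a plain list rather than succinctly by a circuit (the setting the paper actually studies), takes exponential time, and uses hypothetical function names.

```python
from itertools import combinations
from math import comb

def is_shattered(subset, sets):
    """True if every sub-subset of `subset` appears as a trace s & subset."""
    traces = {s & subset for s in sets}
    return len(traces) == 2 ** len(subset)

def vc_dimension(ground, sets):
    """Largest size of a shattered subset of `ground` (brute force).

    Returns 0 for an empty family, which is a convention chosen for this sketch.
    """
    sets = [frozenset(s) for s in sets]
    d = 0
    for k in range(1, len(ground) + 1):
        if any(is_shattered(frozenset(c), sets) for c in combinations(ground, k)):
            d = k
        else:
            break  # no shattered k-set implies no shattered (k+1)-set
    return d

def sauer_shelah_bound(n, d):
    """Upper bound on the number of distinct sets over n points with VC dimension d."""
    return sum(comb(n, i) for i in range(d + 1))

# Example: all integer intervals [a, b) inside {0, ..., 4}, including the empty set.
ground = range(5)
intervals = [set(range(a, b)) for a in range(6) for b in range(a, 6)]
distinct = {frozenset(s) for s in intervals}

d = vc_dimension(ground, intervals)
print(d, len(distinct), sauer_shelah_bound(len(ground), d))  # 2 16 16
```

In this example the interval family meets the bound with equality: 16 distinct sets on 5 points with VC dimension 2.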
Probabilistic Analysis of Learning in Artificial Neural Networks: The PAC Model and its Variants
1994
Cited by 16 (4 self)
There are a number of mathematical approaches to the study of learning and generalization in artificial neural networks. Here we survey the 'probably approximately correct' (PAC) model of learning and some of its variants. These models, much-studied since the introduction of the basic PAC model by Valiant in 1984, provide a probabilistic framework for the discussion of generalization and learning.
Contents
1 Introduction
2 The Basic PAC Model of Learning
3 VC-Dimension and Growth Function
4 VC-Dimension and Linear Dimension
5 A Useful Probability Theorem
6 PAC Learning and the VC-Dimension
7 VC-Dimension of Binary-Output Networks
7.1 Introduction
7.2 Linearly weighted neural networks
7.3 Linear threshold networks
7.4 Other activation functions
7.5 The effect of weight restrictions
8 Computational Complexity of Learning
9 Stochastic Concepts
10 Distribution-Specific Learning
11 Graph Dimension and Multiple-Output Nets
11.1 T...
Quantifying the Amount of Verboseness
1995
Cited by 16 (6 self)
We study the fine structure of the classification of sets of natural numbers $A$ according to the number of queries which are needed to compute the $n$-fold characteristic function of $A$. A complete characterization is obtained, relating the question to finite combinatorics. In order to obtain an explicit description we consider several interesting combinatorial problems. 1 Introduction. In the theory of bounded queries, we measure the complexity of a function by the number of queries to an oracle which are needed to compute it. The field has developed in various directions, both in complexity theory and in recursion theory; see Gasarch [21] for a recent survey. One of the original concerns is the classification of sets $A$ of natural numbers by their "query complexity," i.e., according to the number of oracle queries that are needed to compute the $n$-fold characteristic function $F_n^A(x_1, \ldots, x_n) = (\chi_A(x_1), \ldots, \chi_A(x_n))$. In [3, 8] a set $A$ is called verbose iff $F_n^A$ is com...
VC Dimension of Neural Networks
Neural Networks and Machine Learning, 1998
Cited by 16 (3 self)
This paper presents a brief introduction to Vapnik-Chervonenkis (VC) dimension, a quantity which characterizes the difficulty of distribution-independent learning. The paper establishes various elementary results, and discusses how to estimate the VC dimension in several examples of interest in neural network theory. 1 Introduction. In this expository paper, we present a brief introduction to the subject of computing and estimating the VC dimension of neural network architectures. We provide precise definitions and prove several basic results, discussing also how one estimates VC dimension in several examples of interest in neural network theory. We do not address the learning and estimation-theoretic applications of VC dimension. (Roughly, the VC dimension is a number which helps to quantify the difficulty when learning from examples. The sample complexity, that is, the number of "learning instances" that one must be exposed to, in order to be reasonably certain to derive accurate p...

