## Noise-Tolerant Distribution-Free Learning of General Geometric Concepts (1996)


### Other Repositories/Bibliography

Citations: 17 (3 self)

### BibTeX

@MISC{Bshouty96noise-tolerantdistribution-free,
  author = {Nader H. Bshouty and Sally A. Goldman and H. David Mathias and Subhash Suri and Hisao Tamaki},
  title = {Noise-Tolerant Distribution-Free Learning of General Geometric Concepts},
  year = {1996}
}


### Abstract

this paper. First, we give an algorithm to learn C

### Citations

1732 | A theory of the learnable
- Valiant
- 1984
Citation Context: ...an alternative noise-tolerant algorithm for d = 2 based on geometric subdivisions. 3 Preliminaries The learning model we use in this work is the probably approximately correct (PAC) model of Valiant [31]. In this model, the learner is presented with examples, chosen randomly from instance space X according to an unknown probability distribution D. Let f be an unknown target function from known concept c...

1715 | The Probabilistic Method - Alon, Spencer - 1992

959 | On the uniform convergence of relative frequencies of events to their probabilities
- Vapnik, Chervonenkis
- 1971
Citation Context: ...e. The paper of Blumer et al. [13] identifies a combinatorial parameter of a class of hypotheses called the Vapnik-Chervonenkis (VC) dimension, which originated in the paper of Vapnik and Chervonenkis [33], that bounds how large a sample size is required in order to have enough information for accurate generalization. The VC dimension of concept class C (which we denote VCD(C)) is the size of a largest...

697 | Algorithms in combinatorial geometry
- Edelsbrunner
- 1987
Citation Context: ...ne in the target concept. For this purpose, we pick one representative hyperplane from each equivalence class. To achieve this goal we use a geometric duality argument (see, for example, Edelsbrunner [21]). Each point p in our original or primal space is mapped into a hyperplane g(p) in the dual space, and each hyperplane q in the primal space is mapped to a point g′(q) in the dual space. Note that...
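The duality cited in this context can be illustrated in the plane. The sketch below uses the standard point-line duality p = (a, b) ↦ line y = ax − b, which preserves incidence; the function names are ours for illustration, not from the paper:

```python
def point_to_dual_line(p):
    """Map point p = (a, b) to its dual line y = a*x - b.
    Lines are represented as (slope, intercept) pairs."""
    a, b = p
    return (a, -b)

def line_to_dual_point(line):
    """Map line y = m*x + c to its dual point (m, -c)."""
    m, c = line
    return (m, -c)

def incident(point, line):
    """Check whether a point lies on a line y = m*x + c."""
    x, y = point
    m, c = line
    return abs(y - (m * x + c)) < 1e-9
```

The key property is that incidence is preserved: p lies on q exactly when the dual point of q lies on the dual line of p, which is what lets one reason about representative hyperplanes via points in the dual space.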

693 | Approximation algorithms for combinatorial problems
- Johnson
- 1974
Citation Context: ...ch ensures that F contains a subset of s hyperplanes that separate every +/− pair in the sample. The set covering problem, however, is NP-complete, and so we use a greedy approximation algorithm [27, 30, 19], which gives an approximation ratio bound of O(lg m) for our application. The set of hyperplanes returned by the set covering algorithm partitions the space into O((s lg m)^d) regions. Since the alg...
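The greedy set-cover approximation referred to in this context can be sketched in a few lines; this is the textbook greedy rule (pick whichever subset covers the most uncovered elements), which gives the logarithmic approximation ratio the snippet cites:

```python
def greedy_set_cover(universe, subsets):
    """Greedy approximation for set cover: repeatedly take the subset
    that covers the most still-uncovered elements.  The number of
    subsets chosen is within an O(ln n) factor of optimal."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # subset with maximum marginal coverage
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not (uncovered & best):
            raise ValueError("subsets do not cover the universe")
        cover.append(best)
        uncovered -= best
    return cover
```

In the paper's application, the "elements" are the +/− pairs to be separated and the "subsets" are the pairs separated by each candidate hyperplane.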

661 | Queries and concept learning
- Angluin
- 1988
Citation Context: ...ons, especially when combined with the simplicity and robustness of the algorithms. One significant open problem is whether or not an algorithm for C_s^d exists in the query learning model of Angluin [2]. Since in this model the learner is required to achieve exact identification of the target concept (as opposed to the approximation achieved in the PAC model), the domain must be discretized. We curr...

635 | Learnability and the Vapnik-Chervonenkis Dimension
- Blumer, Ehrenfeucht, et al.
- 1989
Citation Context: ...arning geometric concepts in the PAC model. To illustrate the key technique used in most of this work consider the problem of learning unions of s halfspaces in d-dimensional space for any constant d [13, 7, 14]. The standard Occam algorithm draws a sufficiently large sample S of m points (where m is chosen to satisfy the bound of Blumer et al. [13]) and then finds a hypothesis consistent with the sample...

297 | Efficient noise-tolerant learning from statistical queries
- Kearns
- 1993
Citation Context: ...n of our algorithm that can tolerate random noise in the labels for any noise rate strictly less than 1/2. This variation takes advantage of the noise tolerance inherent in the statistical query (SQ) model [28]. Finally we present a generalization of the standard ε-net result of Haussler and Welzl [25] and apply it to give an alternative noise-tolerant algorithm for d = 2 based on geometric subdivisions....

267 | On the ratio of optimal integral and fractional covers
- Lovász
- 1975
Citation Context: ...ch ensures that F contains a subset of s hyperplanes that separate every +/− pair in the sample. The set covering problem, however, is NP-complete, and so we use a greedy approximation algorithm [27, 30, 19], which gives an approximation ratio bound of O(lg m) for our application. The set of hyperplanes returned by the set covering algorithm partitions the space into O((s lg m)^d) regions. Since the alg...

261 | Epsilon-nets and simplex range queries
- Haussler, Welzl
- 1987
Citation Context: ...sion of our algorithm that can tolerate random classification noise for any noise rate strictly less than 1/2. Finally we present a generalization of the standard ε-net result of Haussler and Welzl [25] and apply it to give an alternative noise-tolerant algorithm for d = 2 based on geometric subdivisions. 1 Introduction We present an efficient algorithm for PAC-learning a broad class of geometric co...

226 | Learning from noisy examples
- Angluin, Laird
- 1988
Citation Context: ...ed correctly based on the target concept. In this work we also consider a variant of the PAC model in which the labeled examples that the learner receives are corrupted by random classification noise [3]. In this noise model, each example is still drawn at random from D. However, with probability η (where 0 ≤ η < 1/2 is called the noise rate), the learner receives the incorrect label. And with prob...
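The random-classification-noise model described in this context is easy to simulate. The sketch below (our own illustrative names) wraps an example oracle so that each boolean label is flipped independently with probability η:

```python
import random

def noisy_example_oracle(draw_example, target, eta, rng=random):
    """Draw one example from the distribution and return it with its
    label flipped with probability eta, modeling random classification
    noise (labels are booleans, and eta must be < 1/2 for learnability)."""
    x = draw_example()
    label = target(x)
    if rng.random() < eta:  # flip the true label with probability eta
        label = not label
    return x, label
```

With η = 0 this reduces to the ordinary PAC example oracle; as η approaches 1/2 the labels carry vanishing information about the target.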

193 | A General Lower Bound on the Number of Examples Needed for Learning
- Ehrenfeucht, Haussler, et al.
- 1988
Citation Context: ...oncept f ∈ H consistent with a sample of size max((4/ε) log₂(2/δ), (8d/ε) log₂(13/ε)) will have error at most ε with probability at least 1 − δ. Furthermore, Ehrenfeucht et al. [22] prove that any concept class C of VC dimension d must use Ω((1/ε) log(1/δ) + d/ε) examples in the worst case. One drawback with the above approach is that the hypothesis must be d...
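The sample-size bound quoted in this context is directly computable. A small sketch (function name ours) that evaluates the Blumer et al. upper bound:

```python
from math import ceil, log2

def blumer_sample_size(eps, delta, d):
    """Sufficient sample size from the Blumer et al. bound quoted above:
    max( (4/eps) * log2(2/delta), (8d/eps) * log2(13/eps) ).
    A hypothesis consistent with this many examples has error at most
    eps with probability at least 1 - delta, for VC dimension d."""
    m1 = (4 / eps) * log2(2 / delta)
    m2 = (8 * d / eps) * log2(13 / eps)
    return ceil(max(m1, m2))
```

Note the bound is only a constant and log factor above the Ω((1/ε) log(1/δ) + d/ε) lower bound of Ehrenfeucht et al. mentioned in the same snippet.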

161 | Almost optimal set covers in finite VC-dimension
- Bronnimann, Goodrich
- 1995
Citation Context: ...s in constant dimensional space. Blumer et al. [13] give a similar result. Both algorithms return hypotheses containing O(s lg m) halfspaces where m is the size of the sample. Bronnimann and Goodrich [14] present a set covering algorithm that allows them to return a hypothesis containing O(ds lg(ds)) halfspaces. Baum gives efficient algorithms for learning several classes with infinite VC-dimension (s...

152 | A greedy heuristic for the set covering problem
- Chvatal
- 1979
Citation Context: ...r of examples from P. Replace h by h ∪ f and remove from P the points correctly classified by f. Since the target concept gives a covering of size s, it follows that the greedy set covering algorithm [19] produces a cover of size O(s ln |P|) = O(s lg m). To see a fundamental limitation of the standard technique, consider the geometric concept shown in Figure 1. Notice that there is no hyperplane that...

117 | Learning disjunction of conjunctions
- Valiant
- 1985
Citation Context: ...he incorrect label. And with probability 1 − η, the learner receives the correct label. Thus the example drawn is labeled incorrectly, at random, with probability η. In the malicious noise model [32], with probability η the adversary can provide an example and label of its choice. To obtain a noise-tolerant version of our algorithm we use the statistical query model [28, 20, 4, 5]. In this model,...

69 | Neural net algorithms that learn in polynomial time from
- Baum
- 1991
Citation Context: ...large sample and constructing a hypothesis that consists of at most s(2d)^s boxes consistent with the sample. Finally, under a variation of the PAC model in which membership queries can be made, Baum [9] gives an algorithm that PAC-learns the union of s halfspaces in R^n in time polynomial in s, n, and a parameter that characterizes the number of bits of accuracy with which the target hyperplanes are...

69 | Learning decision trees from random examples - Ehrenfeucht, Haussler - 1989

47 | Types of noise in data for concept learning - Sloan - 1988

45 | General bounds on statistical query learning and PAC learning with noise via hypothesis boosting
- Aslam, Decatur
- 1998
Citation Context: ...the malicious noise model [32], with probability η the adversary can provide an example and label of its choice. To obtain a noise-tolerant version of our algorithm we use the statistical query model [28, 20, 4, 5]. In this model, rather than sampling labeled examples, the learner requests the value of various statistics on the distribution from an oracle. A statistical query oracle returns the probability, wit...
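The statistical query oracle described in this context can be simulated from labeled examples by simple averaging; this is a minimal sketch (names and the Hoeffding-based sample size are our illustrative choices, not the paper's construction):

```python
import random
from math import ceil, log

def statistical_query(chi, draw_labeled, tau, delta=0.05):
    """Estimate P[chi(x, label) = 1] to within additive tolerance tau,
    with probability at least 1 - delta, by averaging chi over a
    Hoeffding-sized sample of labeled examples.  This simulates an SQ
    oracle answer from an example oracle."""
    n = ceil(log(2 / delta) / (2 * tau * tau))
    return sum(chi(*draw_labeled()) for _ in range(n)) / n
```

The appeal of the model, as the snippet notes, is that the learner only sees noise-averaged statistics, which is why SQ algorithms convert mechanically into noise-tolerant PAC algorithms.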

42 | Polygon retrieval
- Willard
- 1982
Citation Context: ...the space in the following manner. Given that the points are in general position (namely, no three points are collinear), there exist two lines that divide the points into four groups of equal size [34]. We build the subdivision by finding such a pair of lines, and then we recursively apply the same technique (independently) to each of the four regions obtained until there are only q points in each...

40 | Statistical queries and faulty PAC Oracles
- Decatur
- 1993
Citation Context: ...the malicious noise model [32], with probability η the adversary can provide an example and label of its choice. To obtain a noise-tolerant version of our algorithm we use the statistical query model [28, 20, 4, 5]. In this model, rather than sampling labeled examples, the learner requests the value of various statistics on the distribution from an oracle. A statistical query oracle returns the probability, wit...

36 | Efficient NC algorithms for set cover with applications to learning and geometry
- Berger, Rompel, et al.
- 1989
Citation Context: ...us, with the exception of some subclasses of C_s^d with polynomially sized concepts, our algorithm runs in time polynomial in the size of the target. Note that we can use parallel set covering techniques [10] to get our algorithm to run efficiently in parallel. A second contribution of our work is the conversion of our basic algorithm to a statistical query algorithm giving noise tolerance. Due to our cov...

34 | Specification and simulation of statistical query algorithms for efficiency and noise tolerance
- Aslam, Decatur
- 1998
Citation Context: ...the malicious noise model [32], with probability η the adversary can provide an example and label of its choice. To obtain a noise-tolerant version of our algorithm we use the statistical query model [28, 20, 4, 5]. In this model, rather than sampling labeled examples, the learner requests the value of various statistics on the distribution from an oracle. A statistical query oracle returns the probability, wit...

33 | Training a 3-node neural net is NP-Complete
- Blum, Rivest
- 1989
Citation Context: ...Work In this section we highlight some of the relevant learning results for geometric concepts. There have been many results for the classes of unions and intersections of halfspaces. Blum and Rivest [11] show that there does not exist an efficient proper learning algorithm for unions of s halfspaces, unless RP = NP. They also give an algorithm to PAC-learn the xor of two halfspaces by transforming...

28 | Exact learning of discretized geometric concepts
- Bshouty, Goldberg, et al.
- 1998
Citation Context: ...here has been substantial work on exactly learning using equivalence queries (and in some cases also membership queries) unions of boxes [18, 24] and other more complex discretized geometric concepts [15, 16]. While most such work assumes that there are a constant number of dimensions, recently Auer, Kwek, Maass and Warmuth [6] give a polynomial time algorithm that is robust against noise to learn the cla...

27 | Composite geometric concepts and polynomial predictability
- Long, Warmuth
- 1990
Citation Context: ...ently large sample of size m = poly(1/ε, lg 1/δ, s, d), and then performing a greedy covering over the at most (em/2d)^{2d} boxes defined by the sample. Long and Warmuth [29] present an algorithm to PAC-learn this same class by again drawing a sufficiently large sample and constructing a hypothesis that consists of at most s(2d)^s boxes consistent with the sample. Finally...

23 | Generalizing the PAC model: Sample size bounds from metric dimension-based uniform convergence results
- Haussler
- 1989
Citation Context: ...containing O(ds lg(ds)) halfspaces. Baum gives efficient algorithms for learning several classes with infinite VC-dimension (such as convex polyhedral sets) under uniform distributions [8]. Haussler [26] also gives distribution specific algorithms for several classes of functions. Bshouty, Goldman and Mathias [17] have given noise-tolerant algorithms for several geometric classes. In particular, they...

22 | The perceptron algorithm is fast for nonmalicious distributions
- Baum
- 1997
Citation Context: ...n a hypothesis containing O(ds lg(ds)) halfspaces. Baum gives efficient algorithms for learning several classes with infinite VC-dimension (such as convex polyhedral sets) under uniform distributions [8]. Haussler [26] also gives distribution specific algorithms for several classes of functions. Bshouty, Goldman and Mathias [17] have given noise-tolerant algorithms for several geometric classes. In p...

22 | Learning from a Consistently Ignorant Teacher
- Frazier, Goldman, et al.
- 1994
Citation Context: ...ns the union of s halfspaces in R^n in time polynomial in s, n, and a parameter that characterizes the number of bits of accuracy with which the target hyperplanes are specified. Also, Frazier et al. [23] have given an algorithm to PAC-learn the s-fold union of boxes in E^d for which each box is entirely contained within the positive quadrant and contains the origin. Their algorithm learns this subcla...

21 | On Learning a Union of Half Spaces
- Baum
- 1990
Citation Context: ...arning geometric concepts in the PAC model. To illustrate the key technique used in most of this work consider the problem of learning unions of s halfspaces in d-dimensional space for any constant d [13, 7, 14]. The standard Occam algorithm draws a sufficiently large sample S of m points (where m is chosen to satisfy the bound of Blumer et al. [13]) and then finds a hypothesis consistent with the sample...

21 | PAC learning intersections of halfspaces with membership queries - Kwek, Pitt - 1998

18 | The bounded injury priority method and the learnability of unions of rectangles
- Chen, Homer
- 1994
Citation Context: ...ithm is proper if all hypotheses come from the concept class. There has been substantial work on exactly learning using equivalence queries (and in some cases also membership queries) unions of boxes [18, 24] and other more complex discretized geometric concepts [15, 16]. While most such work assumes that there are a constant number of dimensions, recently Auer, Kwek, Maass and Warmuth [6] give a polynomi...

9 | Noise-Tolerant Parallel Learning of Geometric Concepts
- Bshouty, Goldman, et al.
- 1995
Citation Context: ...VC-dimension (such as convex polyhedral sets) under uniform distributions [8]. Haussler [26] also gives distribution specific algorithms for several classes of functions. Bshouty, Goldman and Mathias [17] have given noise-tolerant algorithms for several geometric classes. In particular, they studied C_s^d against the product distribution and a restricted version of this class, in which the hyperplanes...

9 | Learning of depth two neural networks with constant fan-in at the hidden nodes (extended abstract) - Auer, Kwek, et al. - 1996

7 | Learning union of rectangles with membership and equivalence queries
- Goldberg, Goldman, et al.
- 1994
Citation Context: ...ithm is proper if all hypotheses come from the concept class. There has been substantial work on exactly learning using equivalence queries (and in some cases also membership queries) unions of boxes [18, 24] and other more complex discretized geometric concepts [15, 16]. While most such work assumes that there are a constant number of dimensions, recently Auer, Kwek, Maass and Warmuth [6] give a polynomi...

6 | Exact learning of discretized concepts
- Bshouty, Goldberg, et al.
- 1994
Citation Context: ...here has been substantial work on exactly learning using equivalence queries (and in some cases also membership queries) unions of boxes [18, 24] and other more complex discretized geometric concepts [15, 16]. While most such work assumes that there are a constant number of dimensions, recently Auer, Kwek, Maass and Warmuth [6] give a polynomial time algorithm that is robust against noise to learn the cla...

3 | On-line prediction of depth two linear threshold circuits. Unpublished manuscript
- Auer, Kwek, et al.
- 1995
Citation Context: ...s of boxes [18, 24] and other more complex discretized geometric concepts [15, 16]. While most such work assumes that there are a constant number of dimensions, recently Auer, Kwek, Maass and Warmuth [6] give a polynomial time algorithm that is robust against noise to learn the class of depth two linear threshold circuits with a polynomial number of variables given that the input gates have constant...

