## Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance (1995)

### Download Links

- [www.cs.dartmouth.edu]
- [www.ccs.neu.edu]
- [theory.lcs.mit.edu]
- DBLP

### Other Repositories/Bibliography

Venue: Journal of Computer and System Sciences

Citations: 33 (2 self)

### BibTeX

```bibtex
@ARTICLE{Aslam95specificationand,
  author  = {Javed A. Aslam and Scott E. Decatur},
  title   = {Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance},
  journal = {Journal of Computer and System Sciences},
  year    = {1995},
  volume  = {56},
  pages   = {437--446}
}
```

### Abstract

A recent innovation in computational learning theory is the statistical query (SQ) model. The advantage of specifying learning algorithms in this model is that SQ algorithms can be simulated in the PAC model, both in the absence and in the presence of noise. However, simulations of SQ algorithms in the PAC model have non-optimal time and sample complexities. In this paper, we introduce a new method for specifying statistical query algorithms based on a type of relative error and provide simulations in the noise-free and noise-tolerant PAC models which yield more efficient algorithms. Requests for estimates of statistics in this new model take the form: "Return an estimate of the statistic within a (1 ± μ) factor, or return ⊥, promising that the statistic is less than θ." In addition to showing that this is a very natural language for specifying learning algorithms, we also show that this new specification is polynomially equivalent to standard SQ, and thus known learnability and hardness results for statistical query learning are preserved. We then give highly efficient PAC simulations of relative error SQ algorithms. We show that the learning algorithms obtained by simulating efficient relative error SQ algorithms, both in the absence of noise and in the presence of malicious noise, have roughly optimal sample complexity. We also show that the simulation of efficient relative error SQ algorithms in the presence of classification noise yields learning algorithms at least as efficient as those obtained through standard methods, and in some cases improved, roughly optimal results are achieved. The sample complexities for all of these simulations are based on the d_ν metric, a type of relative error metric useful for quantities which are small or even zero. We sho...
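The query semantics quoted in the abstract can be sketched by a simple sampling routine. This is a hypothetical illustration, not the paper's actual simulation: the names `chi`, `draw_example`, and the choice to answer with the raw empirical estimate are our assumptions.

```python
def relative_error_sq(chi, draw_example, sample_size, theta):
    """Sketch of answering a relative-error statistical query by sampling.

    Estimates P = E[chi(x, label)] over labelled examples drawn from
    draw_example(). Returns the empirical estimate (intended, for a
    suitably large sample, to fall within a (1 +/- mu) factor of P),
    or None -- standing in for the symbol '⊥' -- when the estimate
    indicates the statistic is below the threshold theta.
    """
    hits = sum(chi(*draw_example()) for _ in range(sample_size))
    p_hat = hits / sample_size
    return None if p_hat < theta else p_hat
```

For instance, a query whose predicate is satisfied by every example returns the estimate 1.0, while one that is never satisfied returns None (the "⊥" answer).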

### Citations

1694 | A Theory of the Learnable
- Valiant
- 1984
Citation Context ...and the simulations described above. 1 Introduction In this paper, we focus on the development of efficient, fault-tolerant algorithms for learning in the probably approximately correct (PAC) model [19]. An algorithm for PAC learning a class of functions (concepts) uses examples drawn from an oracle (the environment) in order to approximate a hidden target function selected from the class. The examp...

665 | The strength of weak learnability
- Schapire
- 1990
Citation Context ...elative Error Statistical Query Learning In this section we show general upper bounds on the complexity of relative error statistical query learning. We do so by applying accuracy boosting techniques [10, 11, 18] and specifically, these techniques as applied in the statistical query model [2]. Theorem 10 If a concept class F is SQ learnable by an algorithm A using hypothesis class H, then F is SQ learnable wi...

624 | Learnability and the Vapnik-Chervonenkis dimension
- Blumer, Ehrenfeucht, et al.
- 1989
Citation Context ...Ω(1/ε²) examples. This is clearly suboptimal when compared to the basically tight general upper and lower bounds on the noise-free sample complexity, whose dependence on ε is ~Θ(1/ε) [5, 9]. Thus, while there is an incentive for developing algorithms in the statistical query model due to the noise tolerance gained, there is also a disincentive towards doing so due to the inefficiency ...

423 | Boosting a weak learning algorithm by majority
- Freund
- 1995
Citation Context ...elative Error Statistical Query Learning In this section we show general upper bounds on the complexity of relative error statistical query learning. We do so by applying accuracy boosting techniques [10, 11, 18] and specifically, these techniques as applied in the statistical query model [2]. Theorem 10 If a concept class F is SQ learnable by an algorithm A using hypothesis class H, then F is SQ learnable wi...

372 | Decision theoretic generalizations of the PAC model for neural net and other learning applications
- Haussler
- 1992
Citation Context ...se, we determine the complexity of the PAC algorithm as a function of the complexity of the SQ algorithm. In order to prove sample complexity bounds for these simulations, we make use of the d_ν metric [17, 13]. Haussler gives sample complexity bounds sufficient for uniform convergence, with respect to the d_ν metric, of probabilities and their estimates based on a sample. We relate this uniform convergence t...
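Haussler's d_ν metric mentioned above has a simple closed form, d_ν(r, s) = |r − s| / (ν + r + s). A minimal sketch (the function name is ours):

```python
def d_nu(r, s, nu):
    """Haussler's d_nu metric: a relative-error-style distance between two
    nonnegative quantities r and s that stays well behaved even when r and
    s are small or zero, since the normalizer nu + r + s never vanishes
    for nu > 0."""
    return abs(r - s) / (nu + r + s)
```

Unlike the plain relative error |r − s| / s, this distance is defined even at s = 0, which is exactly why it suits statistics that may be small or zero.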

288 | Efficient noise-tolerant learning from statistical queries
- Kearns
- 1993
Citation Context ...proximate the underlying target function with respect to noise-free examples. A recently developed tool for creating efficient, noise-tolerant learning algorithms is the statistical query (SQ) model [14]. In this model, instead of using labelled examples, the algorithm asks for estimates of the values of statistics defined over the distribution of labelled examples. This model may be viewed as a ...

256 | Fast Probabilistic Algorithms for Hamiltonian Circuits and Matchings
- Angluin, Valiant
- 1979
Citation Context ...rrupted sample which satisfy χ. The sample size m is chosen sufficiently large to ensure, with high probability, that β̂ ≤ 2β and that for all χ ∈ Q, d_{θ/8}(P̂_χ, P_χ) ≤ θ/8. Applying standard Chernoff bounds [1], the former condition can be guaranteed with probability at least 1 − δ/2 using a sample of size (72/θ) ln(2/δ). Applying Theorem 2, the latter condition can be guaranteed with probability at ...

223 | Quantifying inductive bias: AI learning algorithms and Valiant's learning framework
- Haussler
- 1988
Citation Context ...rithm which tolerates this malicious error by simulating an efficient relative error SQ algorithm for the problem. The SQ algorithm uses the set cover approach for learning conjunctions of k literals [12]. The additive error SQ version of this algorithm is a conversion from elimination and covering examples to elimination and 1. Learn-k-Conjunction: 2. V := {x_1, ..., x_n, x̄_1, ..., x̄_n} ...
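The elimination step this context refers to can be sketched as follows. This is the standard noise-free elimination sketch for learning conjunctions, not the paper's SQ set-cover algorithm; all names are ours.

```python
def learn_conjunction(examples, n):
    """Standard elimination sketch for learning a conjunction over n
    Boolean variables. Starts from the set V of all 2n literals
    {x_1, ..., x_n, not-x_1, ..., not-x_n} and deletes every literal
    falsified by a positive example; negative examples are ignored.

    `examples` is an iterable of (x, label) pairs, x a tuple of n bits.
    A literal is encoded as (i, True) for x_i and (i, False) for its
    negation.
    """
    V = {(i, pos) for i in range(n) for pos in (True, False)}
    for x, label in examples:
        if label:  # positive example: keep only the literals it satisfies
            V = {(i, pos) for (i, pos) in V if bool(x[i]) == pos}
    return V

def predict(V, x):
    """The learned hypothesis: the conjunction of surviving literals."""
    return all(bool(x[i]) == pos for (i, pos) in V)
```

For the target conjunction x_1 ∧ ¬x_2 over three variables, the positive examples (1,0,0) and (1,0,1) prune V down to exactly those two literals.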

191 | A general lower bound on the number of examples needed for learning
- Ehrenfeucht, Haussler, et al.
- 1989
Citation Context ...Ω(1/ε²) examples. This is clearly suboptimal when compared to the basically tight general upper and lower bounds on the noise-free sample complexity, whose dependence on ε is ~Θ(1/ε) [5, 9]. Thus, while there is an incentive for developing algorithms in the statistical query model due to the noise tolerance gained, there is also a disincentive towards doing so due to the inefficiency ...

167 | Learning in the presence of malicious errors
- Kearns, Li
- 1993
Citation Context ...the use of additive error statistical queries yielded an additional factor of 1/ε. In Corollary 3, these bounds are within logarithmic factors of both the O(ε) maximum allowable malicious error rate [15] and the Ω(1/ε) lower bound on the sample complexity for noise-free PAC learning [9]. We also note that in this malicious-error tolerant PAC simulation, the sample, time, space and hypothes...

118 | Weakly learning DNF and characterizing statistical query learning using Fourier analysis
- Blum, Furst, et al.
- 1994
Citation Context ...and "soft-Theta," is convenient for expressing bounds while ignoring lower order factors. Note that it is somewhat different than the standard soft-order notation. ...hardness results of Blum et al. [4] based on Fourier analysis. The advantages of the new model are then demonstrated by the simulations of relative error SQ algorithms in the noise-free PAC model, the malicious error PAC model and the ...

50 | An improved boosting algorithm and its implications on learning complexity
- Freund
- 1992
Citation Context ...elative Error Statistical Query Learning In this section we show general upper bounds on the complexity of relative error statistical query learning. We do so by applying accuracy boosting techniques [10, 11, 18] and specifically, these techniques as applied in the statistical query model [2]. Theorem 10 If a concept class F is SQ learnable by an algorithm A using hypothesis class H, then F is SQ learnable wi...

45 | General bounds on statistical query learning and PAC learning with noise via hypothesis boosting
- Aslam, Decatur
- 1998
Citation Context ...he complexity of relative error statistical query learning. We do so by applying accuracy boosting techniques [10, 11, 18] and specifically, these techniques as applied in the statistical query model [2]. Theorem 10 If a concept class F is SQ learnable by an algorithm A using hypothesis class H, then F is SQ learnable with O(N_0 log²(1/ε)) queries, each with threshold Ω(ν_0 θ_0 ε / log(1/ε)) ...

40 | Statistical queries and faulty PAC Oracles, in
- Decatur
- 1993
Citation Context ...les. Specifically, an SQ algorithm can be simulated in the PAC model in the presence of classification noise, malicious errors, attribute noise and even hybrid models combining these different noises [14, 6, 8, 7]. A key parameter in the complexity of the PAC algorithm generated by the simulation of SQ algorithms is the tolerance of the SQ algorithm, which quantifies the largest additive error that the SQ al...

20 | Learning from Good and Bad Data. Kluwer international series in engineering and computer science
- Laird
- 1988
Citation Context ...esting (required to determine which noise rate guess was correct) which uses fewer examples than the standard technique. This improved hypothesis testing is achieved by generalizing a result of Laird [16]. We begin by giving a new decomposition of P_χ into quantities that may be "guessed" or estimated using the classification noise oracle. Let ∧, ⊕ and ≡ be the standard Boolean operators for AND, excl...

19 | On learning from noisy and incomplete examples
- Decatur, Gennaro
- 1995
Citation Context ...les. Specifically, an SQ algorithm can be simulated in the PAC model in the presence of classification noise, malicious errors, attribute noise and even hybrid models combining these different noises [14, 6, 8, 7]. A key parameter in the complexity of the PAC algorithm generated by the simulation of SQ algorithms is the tolerance of the SQ algorithm, which quantifies the largest additive error that the SQ al...

17 | Rates of uniform almost-sure convergence for empirical processes indexed by unbounded classes of functions
- Pollard
- 1987
Citation Context ...se, we determine the complexity of the PAC algorithm as a function of the complexity of the SQ algorithm. In order to prove sample complexity bounds for these simulations, we make use of the d_ν metric [17, 13]. Haussler gives sample complexity bounds sufficient for uniform convergence, with respect to the d_ν metric, of probabilities and their estimates based on a sample. We relate this uniform convergence t...

12 | On the sample complexity of noise-tolerant learning
- Aslam, Decatur
- 1996
Citation Context ...ndard additive error SQ algorithm uses ~O(n² / (ε² (1 − 2η_b)²) · log(1/δ)) examples. Note that our sample complexity roughly matches the lower bound of Ω(n / (ε (1 − 2η)²) · log(1/δ)) [3]. This is the first non-trivial (i.e., superpolynomial sized) class to be shown learnable in the presence of classification noise using a sample complexity whose dependence on ε is o(1/ε²). The rela...

8 | Learning in hybrid noise environments using statistical queries
- Decatur
- 1996
Citation Context ...les. Specifically, an SQ algorithm can be simulated in the PAC model in the presence of classification noise, malicious errors, attribute noise and even hybrid models combining these different noises [14, 6, 8, 7]. A key parameter in the complexity of the PAC algorithm generated by the simulation of SQ algorithms is the tolerance of the SQ algorithm, which quantifies the largest additive error that the SQ al...