Results 1 - 6 of 6
Learning in Hybrid Noise Environments Using Statistical Queries
- Learning from Data: Artificial Intelligence and Statistics V, 1995
Abstract - Cited by 8 (3 self)
We consider formal models of learning from noisy data. Specifically, we focus on learning in the probably approximately correct (PAC) model as defined by Valiant. Two of the most widely studied models of noise in this setting have been classification noise and malicious errors. However, a more realistic model combining the two types of noise has not been formalized. We define a learning environment based on a natural combination of these two noise models. We first show that hypothesis testing is possible in this model. We next describe a simple technique for learning in this model, and then describe a more powerful technique based on statistical query learning. We show that the noise tolerance of this improved technique is roughly optimal with respect to the desired learning accuracy and that it provides a smooth tradeoff between the tolerable amounts of the two types of noise. Finally, we show that statistical query simulation yields learning algorithms for other combinations of noise m...
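As a rough illustration of what simulating a statistical query from noisy examples can look like, the sketch below handles classification noise only, not the hybrid model of the abstract above, and it is not the paper's algorithm. It relies on a standard identity: if labels in {-1, +1} are flipped independently with known probability eta < 1/2, then E[chi(x) * y_noisy] = (1 - 2*eta) * E[chi(x) * y]. The names `correlation`, `corrected_correlation`, `f`, `chi` and the 2-bit toy distribution are all hypothetical choices made for this example.

```python
import random

def correlation(examples, chi):
    """Empirical mean of chi(x) * y over labelled examples with y in {-1, +1}."""
    return sum(chi(x) * y for x, y in examples) / len(examples)

def corrected_correlation(noisy_examples, chi, eta):
    """Undo the (1 - 2*eta) shrinkage caused by independent label flips."""
    if not 0.0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 0.5)")
    return correlation(noisy_examples, chi) / (1.0 - 2.0 * eta)

# Hypothetical toy setup: uniform 2-bit inputs, a target f and a query function chi.
f = lambda x: 1 if x[0] == 1 else -1
chi = lambda x: 1 if (x[0] == 1 or x[1] == 1) else -1

rng = random.Random(0)
eta = 0.2  # assumed known classification-noise rate
noisy = []
for _ in range(50000):
    x = (rng.randint(0, 1), rng.randint(0, 1))
    y = f(x)
    if rng.random() < eta:  # classification noise: flip the label
        y = -y
    noisy.append((x, y))

# Under the uniform distribution the noise-free value of E[chi(x) * f(x)] is 0.5,
# so the raw noisy average concentrates near 0.3 and the corrected one near 0.5.
estimate = corrected_correlation(noisy, chi, eta)
```

The correction divides by (1 - 2*eta), so the sampling error of the estimate grows as eta approaches 1/2; malicious errors are not covered by this identity.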
On Learning Correlated Boolean Functions Using Statistical Query
Abstract - Cited by 2 (0 self)
In this paper, we study the problem of using statistical queries (SQ) to learn highly correlated boolean functions, namely, a class of functions in which any pair agrees on significantly more than a fraction 1/2 of the inputs. We give a limit on how well one can approximate all the functions without making any query, and then show that beyond this limit, the number of statistical queries the algorithm has to make increases with the "extra" advantage the algorithm gains in learning the functions. Here the advantage is defined as the probability that the algorithm agrees with the target function minus the probability that it disagrees. An interesting consequence of our results is that the class of booleanized linear functions over a finite field ($f_{\vec a}(\vec x) = 1$ iff $\phi(\vec a \cdot \vec x) = 1$, where $\phi : GF_p \mapsto \{-1, 1\}$ is an arbitrary boolean function that maps any element of $GF_p$ to $\pm 1$) is not efficiently learnable using statistical queries. This result is useful since the hardness of learning booleanized linear functions over a finite field is related to the security of certain cryptosystems ([B01]). In particular, we prove that the class of linear threshold functions over a finite field ($f_{\vec a, b}(\vec x) = 1$ iff $\vec a \cdot \vec x \ge b$) cannot be learned efficiently using statistical queries. This contrasts with Blum et al.'s result [BFK+96] that linear threshold functions over the reals (perceptrons) are learnable in the SQ model. Finally, we describe a PAC-learning algorithm that learns a class of linear threshold functions in time provably insufficient for any statistical query algorithm to learn the class. With properly chosen parameters, this class of linear threshold functions becomes an example of PAC-learnable but not SQ-learnable functions that are not parity functions.
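The notion of advantage used in the abstract above (probability of agreement minus probability of disagreement) can be made concrete on a tiny instance. The sketch below is only an illustration under assumptions: it takes the uniform distribution over GF(3)^2, a hypothetical choice of phi (phi(z) = +1 iff z = 0), and helper names `advantage` and `booleanized_linear` invented for this example. It does not reproduce any bound from the paper.

```python
from itertools import product

def advantage(h, f, inputs):
    """Pr[h(x) == f(x)] - Pr[h(x) != f(x)] under the uniform distribution on inputs."""
    inputs = list(inputs)
    agree = sum(1 for x in inputs if h(x) == f(x))
    return (2 * agree - len(inputs)) / len(inputs)

def booleanized_linear(a, phi, p):
    """Return the function x -> phi((a . x) mod p), where phi maps GF(p) to {-1, +1}."""
    def f(x):
        return phi(sum(ai * xi for ai, xi in zip(a, x)) % p)
    return f

p, n = 3, 2
phi = lambda z: 1 if z == 0 else -1          # hypothetical booleanizing function
inputs = list(product(range(p), repeat=n))   # all 9 points of GF(3)^2

f1 = booleanized_linear((1, 0), phi, p)      # +1 exactly when x[0] == 0
f2 = booleanized_linear((0, 1), phi, p)      # +1 exactly when x[1] == 0
constant = lambda x: -1                      # a guess that needs no queries

pair_advantage = advantage(f1, f2, inputs)          # agree on 5 of 9 inputs
blind_advantage = advantage(constant, f1, inputs)   # agree on 6 of 9 inputs
```

Here f1 and f2 agree on 5 of the 9 inputs, slightly more than half, and the constant guess agrees with f1 on 6 of 9, which shows how a hypothesis can have positive advantage without making any query.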
Learning with noise. Extension to regression.
2001
Abstract
Contents
1 Introduction
2 State of the art
2.1 Constant noise rate
2.2 Malicious errors
3 Which model of noise?
3.1 A new model of noise
3.2 Discussion
4 Noise tolerant algorithm - discussion about margin-based algorithms
4.1 Algorithm for robust classification noise
4.2 Margin methods for malicious errors
4.3 ε-insensitivity in classification
5 Extension to regression
5.1 Extension of the algorithm of Kearns and Li
5.2 Direct derivation of a learning algorithm: ε-insensitivity
5.2.1 Bounded noise
5.2.2 Outliers
5.3 Minimizing quantiles
6 Conclusions
1 Introduction
Extended version of: Learning
Contribution of Statistical Learning to Unsupervised Learning.
Abstract
Uniform non-asymptotic statistics have been widely used in the area of learning theory. In this paper, we study another possible area of applications: unsupervised learning, where the requirement of simple representations has the advantage of leading to small values of the VC-dimension.
Learning non-independent sequences of examples. System identification, control and stabilization
Abstract
Very extended version of papers published in the proceedings of EFTF 2001 and ICNF 2001 (J.M. Friedt/O. Teytaud/M. Planat/D. Gillet). Many recent works consider practical applications of neural networks for control. Only a few papers have been devoted to applying the theoretical part of learning to control (their main results are recalled here). This paper provides:
- Notations and definitions for system identification, stabilization and control.
- A survey, as extensive as possible, of results about ergodic, stationary or chaotic time series (learning theory with temporal dependencies).
- Historical introductions to areas of science which intersect system identification, stabilization or control: statistical physics, stochastic dynamics of deterministic systems, empirical processes and VC-theory, fuzzy logic, neural networks and related learning tools, Markov models.
- Practical illustrations and classical benchmarks, with references for practical algorithms.
- Theoretical open problems in system identification, stabilization and control.
E1 Reconceiving Machine Learning E2 Aims and Background
Abstract
Beware of the man of one method or one instrument, either experimental or theoretical. He tends to become method oriented rather than problem oriented. The method-oriented man is shackled: the problem-oriented man is at least reaching freely toward what is most important. [52]
Context
Machine Learning is a sub-discipline of Information and Communication Technology (ICT) that develops the technologies for machines to recognise and learn patterns in data. It is distinct from, although related to, statistics. It can be differentiated by its focus on creating technology rather than the human-centred analysis of data. It is the science and engineering behind Data Mining. Machine learning is pervasive: it plays a key role in all stages of the scientific process and across diverse fields including bioinformatics, engineering and finance. It is widely accepted that ICT plays an enabling role across almost all technological disciplines. Analogously, Machine Learning plays an enabling role across most parts of ICT, from embedded to enterprise systems, and consequently is a crucial enabler of the Digital Economy [16]. Vast quantities of data are now routinely collected and stored because it is affordable to do so. Machine learning makes sense of this data flood.
The Problem
The massive reduction in the cost of collecting, storing, transporting and processing

