Learning Simple Concepts Under Simple Distributions
- SIAM Journal on Computing, 1991
Cited by 52 (3 self)
We aim to develop a learning theory in which 'simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard to learn (for example, NP-hard). Relatively few concept classes have been shown to be polynomially learnable. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, we do not require that the learning algorithm learn (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distrib...
DNA Sequencing and String Learning
Cited by 3 (3 self)
In laboratories, the majority of large-scale DNA sequencing is done following the shotgun strategy, which is to randomly sequence a large number of relatively short fragments and then heuristically find a shortest common superstring of the fragments [26]. We study mathematical frameworks, under plausible assumptions, suitable for massive automated DNA sequencing and for analyzing DNA sequencing algorithms. We model the DNA sequencing problem as learning a string from its randomly drawn substrings. Under certain restrictions, this may be viewed as string learning in Valiant's distribution-free learning model, and in this case we give an efficient learning algorithm and a quantitative bound on how many examples suffice. One major obstacle to our approach turns out to be a well-known open question on how to approximate a shortest common superstring of a set of strings, raised by a number of authors in the last ten years [9, 29, 30]. We give the first provably good algorithm, which approximates a shortest superstring of length n by a superstring of length O(n log n). The algorithm works equally well even in the presence of negative examples, i.e., when merging some strings is prohibited.
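To make the superstring task in this abstract concrete, here is a minimal sketch of the classic greedy merge heuristic: repeatedly merge the two fragments with the largest suffix-prefix overlap. This is not the authors' O(n log n) approximation algorithm, which the abstract does not spell out. It carries no guarantee here and does not handle negative examples. The function names `overlap` and `greedy_superstring` are hypothetical, chosen only for illustration.

```python
from typing import List


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(fragments: List[str]) -> str:
    """Build a common superstring by greedy pairwise merging (illustrative only)."""
    # Remove duplicates, then drop fragments already contained in another one.
    unique = list(dict.fromkeys(fragments))
    frags = [f for f in unique if not any(f != g and f in g for g in unique)]

    while len(frags) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the pair, keeping the overlapping part only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)

    return frags[0] if frags else ""
```

Calling `greedy_superstring(["ACGTAC", "TACGGA", "GGATTC"])` returns one string that contains every input fragment as a substring. The result is a common superstring but not necessarily a shortest one.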
P-sufficient statistics for PAC learning k-term-DNF formulas through enumeration
Cited by 1 (1 self)
Working in the framework of PAC-learning theory, we present special statistics for accomplishing, in polynomial time, proper learning of DNF Boolean formulas having a fixed number of monomials. Our statistics turn out to be near sufficient for a large family of distribution laws, which we call butterfly distributions. We develop a theory of most powerful learning for analyzing the performance of learning algorithms, with particular reference to trade-offs between power and computational costs. Focusing attention on sample and time complexity, we prove that our algorithm works as efficiently as the best algorithms existing in the literature, while the latter handle only subclasses of our family of distributions.
Keywords. Computational complexity, Boolean function, concept learning, learning by examples, sufficient statistics, nonparametric statistics.
Abbreviated title. P-sufficient statistics for learning k-term DNF.
1. INTRODUCTION Intuitively, learning in the PAC (Probably ...

