Results 1 - 3 of 3
Learning Simple Concepts Under Simple Distributions
SIAM Journal on Computing, 1991
Abstract

Cited by 56 (3 self)
We aim at developing a learning theory where `simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard (like NP-hard) to learn, and relatively few concept classes have been shown to be polynomially learnable. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, we do not require that the learning algorithm learn (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distribution...
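The key object in this line of work is the universal enumerable distribution, which assigns weight roughly 2^(-K(x)) to a string x, where K(x) is its Kolmogorov complexity. K is uncomputable, so any concrete illustration must substitute a computable proxy; the sketch below (our own, not from the paper) uses zlib-compressed length as a crude upper bound on K to show the qualitative behavior: structured strings receive far more weight than incompressible ones.

```python
import zlib

def simple_dist_weight(x: bytes) -> float:
    # Proxy for the universal distribution m(x) ~ 2^(-K(x)).
    # Kolmogorov complexity K(x) is uncomputable, so we substitute
    # the zlib-compressed length, a computable upper bound on K(x).
    k_approx = len(zlib.compress(x))
    return 2.0 ** (-k_approx)

# A highly regular string compresses well and gets large weight;
# a sequence with no repeated substrings compresses poorly.
regular = b"ab" * 100
irregular = bytes(range(256))
assert simple_dist_weight(regular) > simple_dist_weight(irregular)
```

Under such a distribution, "simple" examples dominate the sample, which is what makes learning feasible for concept classes that are hard under arbitrary distributions.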
DNA Sequencing and String Learning
Abstract

Cited by 3 (3 self)
In laboratories, the majority of large-scale DNA sequencing is done following the shotgun strategy, which is to randomly sequence a large number of relatively short fragments and then heuristically find a shortest common superstring of the fragments [26]. We study mathematical frameworks, under plausible assumptions, suitable for massive automated DNA sequencing and for analyzing DNA sequencing algorithms. We model the DNA sequencing problem as learning a string from its randomly drawn substrings. Under certain restrictions, this may be viewed as string learning in Valiant's distribution-free learning model, and in this case we give an efficient learning algorithm and a quantitative bound on how many examples suffice. One major obstacle to our approach turns out to be a well-known open question on how to approximate a shortest common superstring of a set of strings, raised by a number of authors over the last ten years [9, 29, 30]. We give the first provably good algorithm, which approximates a shortest superstring of length n by a superstring of length O(n log n). The algorithm works equally well even in the presence of negative examples, i.e., when merging of some strings is prohibited.
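The heuristic that shotgun assembly traditionally relies on is greedy merging: repeatedly merge the pair of fragments with the largest suffix-prefix overlap. The sketch below illustrates that standard heuristic from the superstring literature, not the paper's O(n log n)-approximation algorithm; the function names are ours.

```python
def overlap(a: str, b: str) -> int:
    # Length of the longest suffix of a that is a prefix of b.
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_superstring(frags: list[str]) -> str:
    # Drop fragments contained in other fragments; they add nothing.
    frags = [f for f in frags
             if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        # Merge the pair with the largest overlap.
        _, a, b = max(((overlap(a, b), a, b)
                       for a in frags for b in frags if a is not b),
                      key=lambda t: t[0])
        k = overlap(a, b)
        frags.remove(a)
        frags.remove(b)
        frags.append(a + b[k:])
    return frags[0]

# Example: three overlapping reads reassemble into "abcde".
assert greedy_superstring(["cde", "abc", "bcd"]) == "abcde"
```

Greedy is conjectured to be a constant-factor approximation, but no such bound was known when the question was raised, which is why the O(n log n)-length guarantee above was notable.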
P-sufficient statistics for PAC learning k-term-DNF formulas through enumeration.
Abstract

Cited by 1 (1 self)
Working in the framework of PAC-learning theory, we present special statistics for accomplishing, in polynomial time, proper learning of DNF boolean formulas having a fixed number of monomials. Our statistics turn out to be nearly sufficient for a large family of distribution laws that we call butterfly distributions. We develop a theory of most powerful learning for analyzing the performance of learning algorithms, with particular reference to trade-offs between power and computational costs. Focusing attention on sample and time complexity, we prove that our algorithm works as efficiently as the best algorithms existing in the literature, while the latter only handle subclasses of our family of distributions. Keywords. Computational complexity, boolean function, concept learning, learning by examples, sufficient statistics, nonparametric statistics. Abbreviated title. P-sufficient statistics for learning k-term DNF. 1. INTRODUCTION Intuitively, learning in the PAC (Probably ...
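For intuition about what "proper learning through enumeration" means here, the brute-force version can be sketched as follows: enumerate every k-term DNF over n variables and output one consistent with the labeled sample. This sketch (ours, with toy parameters) is exponential in n and k; the paper's contribution is precisely the statistics that make the approach efficient for butterfly distributions.

```python
from itertools import combinations, product

def eval_monomial(mono, x):
    # A monomial is a tuple of literals (i, sign); it is satisfied
    # when every listed variable x[i] equals its required sign.
    return all(x[i] == sign for i, sign in mono)

def eval_dnf(terms, x):
    # A DNF is satisfied when at least one of its monomials is.
    return any(eval_monomial(t, x) for t in terms)

def enumerate_k_term_dnf(sample, n, k):
    # Proper learning by brute-force enumeration: try every k-term
    # DNF over n variables, return the first one consistent with
    # the labeled sample (exponential in n and k).
    literals = [(i, b) for i in range(n) for b in (0, 1)]
    monomials = []
    for size in range(1, n + 1):
        for mono in combinations(literals, size):
            if len({i for i, _ in mono}) == size:  # no contradictions
                monomials.append(mono)
    for terms in combinations(monomials, k):
        if all(eval_dnf(terms, x) == y for x, y in sample):
            return terms
    return None

# Toy target: (x0 AND x1) OR (NOT x2), labeled on all 8 points.
sample = [(x, int((x[0] and x[1]) or not x[2]))
          for x in product((0, 1), repeat=3)]
learned = enumerate_k_term_dnf(sample, n=3, k=2)
assert learned is not None
```

Because the learner outputs a hypothesis from the same class it is learning (a k-term DNF), this is proper learning; improper learners that output, say, k-CNF can avoid the enumeration but do not solve the problem the paper addresses.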