Results 1-10 of 169
The strength of weak learnability
Machine Learning, 1990
Cited by 667 (23 self)
Abstract. This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
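The accuracy amplification described in this abstract can be illustrated numerically. The sketch below assumes the majority-of-three construction commonly associated with this paper, in which three hypotheses with error at most α are combined into one with error at most 3α² − 2α³. That bound is not stated in the abstract, and the function names `amplified_error` and `rounds_to_reach` are hypothetical, chosen only for illustration.

```python
def amplified_error(alpha: float) -> float:
    """Assumed error bound after one majority-of-three step: 3a^2 - 2a^3."""
    return 3 * alpha**2 - 2 * alpha**3


def rounds_to_reach(alpha: float, target: float) -> int:
    """Count how many recursive applications push the bound down to target."""
    if not 0 <= alpha < 0.5:
        raise ValueError("a weak learner must have error strictly below 1/2")
    if target <= 0:
        raise ValueError("target error must be positive")
    rounds = 0
    while alpha > target:
        # For 0 < alpha < 1/2 the bound strictly decreases, so this terminates.
        alpha = amplified_error(alpha)
        rounds += 1
    return rounds
```

Under this assumption, an error of exactly 1/2 is a fixed point of the map, which is why the starting hypothesis has to be at least slightly better than random guessing.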
Computational Limitations on Learning from Examples
Journal of the ACM, 1988
Cited by 192 (10 self)
Abstract. The computational complexity of learning Boolean concepts from examples is investigated. It is shown for various classes of concept representations that these cannot be learned feasibly in a distribution-free sense unless R = NP. These classes include (a) disjunctions of two monomials, (b) Boolean threshold functions, and (c) Boolean formulas in which each variable occurs at most once. Relationships between learning of heuristics and finding approximate solutions to NP-hard optimization problems are given. Categories and Subject Descriptors: F.1.1 [Computation by Abstract Devices]: Models of Computation - relations among models; F.1.2 [Computation by Abstract Devices]: Modes of Computation - probabilistic computation; F.1.3 [Computation by Abstract Devices]: Complexity Classes - reducibility and completeness; I.2.6 [Artificial Intelligence]: Learning - concept learning; induction
Efficient Discovery of Conserved Patterns Using a Pattern Graph
Comput. Appl. Biosci, 1997
Cited by 66 (8 self)
Motivation: We have previously reported an algorithm for discovering patterns conserved in sets of related unaligned protein sequences. The algorithm was implemented in a program called Pratt. Pratt allows the user to define a class of patterns (e.g. the degree of ambiguity allowed and the length and number of gaps), and is then guaranteed to find the conserved patterns in this class scoring highest according to a defined fitness measure. In many cases, this version of Pratt was very efficient, but in other cases it was too time consuming to be applied. Hence, a more efficient algorithm was needed. Results: In this paper, we describe a new and improved searching strategy that has two main advantages over the old strategy. First, it allows for easier integration with programs for multiple sequence alignment and database search. Secondly, it makes it possible to use branch-and-bound search, and heuristics, to speed up the search. The new search strategy has been implemented in a new version of the Pratt program. Availability: The source code for the Pratt programs can be
A Guided Tour Across the Boundaries of Learning Recursive Languages
Lecture Notes in Artificial Intelligence, 1994
Cited by 56 (29 self)
The present paper deals with the learnability of indexed families of uniformly recursive languages from positive data as well as from both positive and negative data. We consider the influence of various monotonicity constraints on the learning process, and provide a thorough study concerning the influence of several parameters. In particular, we present examples pointing to typical problems and solutions in the field. Then we provide a unifying framework for learning. Furthermore, we survey results concerning learnability in dependence on the hypothesis space, and concerning order independence. Moreover, new results dealing with the efficiency of learning are provided. First, we investigate the power of iterative learning algorithms. The second measure of efficiency studied is the number of mind changes a learning algorithm is allowed to perform. In this setting we consider the problem whether or not the monotonicity constraints introduced do influence the efficiency of learning algo...
Incremental concept learning for bounded data mining
Information and Computation, 1999
Cited by 39 (29 self)
Important refinements of concept learning in the limit from positive data considerably restricting the accessibility of input data are studied. Let c be any concept; every infinite sequence of elements exhausting c is called a positive presentation of c. In all learning models considered the learning machine computes a sequence of hypotheses about the target concept from a positive presentation of it. With iterative learning, the learning machine, in making a conjecture, has access to its previous conjecture and the latest data item coming in. In k-bounded example-memory inference (k is a priori fixed) the learner is allowed to access, in making a conjecture, its previous hypothesis, its memory of up to k data items it has already seen, and the next element coming in. In the case of k-feedback identification, the learning machine, in making a conjecture, has access to its previous conjecture, the latest data item coming in, and, on the basis of this information, it can compute k items and query the database of previous data to find out, for each of the k items, whether or not it is in the database (k is again a priori fixed). In all cases, the sequence of conjectures has to converge to a hypothesis
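The iterative learning model described in this abstract can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the concept class (initial segments {0, ..., n} of the natural numbers), the encoding of a hypothesis as the integer n, and the names `iterative_step` and `conjectures` are all hypothetical and do not come from the paper. The point it shows is the access restriction: each new conjecture is computed from the previous conjecture and the latest data item alone.

```python
from typing import Iterable, Iterator, Optional


def iterative_step(previous: Optional[int], item: int) -> int:
    """Compute the next conjecture from the previous one and the newest item only."""
    return item if previous is None else max(previous, item)


def conjectures(presentation: Iterable[int]) -> Iterator[int]:
    """Yield the learner's conjecture after each element of a (finite prefix of a)
    positive presentation. No earlier data items are stored."""
    hypothesis: Optional[int] = None
    for item in presentation:
        hypothesis = iterative_step(hypothesis, item)
        yield hypothesis
```

For this toy class the conjectures stabilise as soon as the largest element of the concept has appeared in the presentation, which is the convergence behaviour the learning-in-the-limit setting asks for.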
Polynomial-time Learning of Elementary Formal Systems
Theoretical Computer Science, 2000
Cited by 30 (7 self)
An elementary formal system (EFS) is a logic program consisting of definite clauses whose arguments have patterns instead of first-order terms. We investigate EFSs for polynomial-time PAC-learnability. A definite clause of an EFS is hereditary if every pattern in the body is a subword of a pattern in the head. With this new notion, we show that HEFS(m, k, t, r) is polynomial-time learnable, which is the class of languages definable by EFSs consisting of at most m hereditary definite clauses with predicate symbols of arity at most r, where k and t bound the number of variable occurrences in the head and the number of atoms in the body, respectively. The class defined by all finite unions of EFSs in HEFS(m, k, t, r) is also polynomial-time learnable. We also show an interesting series of NC-learnable classes of EFSs. As hardness results, the class of regular pattern languages is shown not polynomial-time learnable unless RP=NP. Furthermore, the related problem of deciding whether there is a common subsequence which is consistent with given positive and negative examples is shown NP-complete.
Learning Acyclic First-order Horn Sentences From Entailment
1997
Cited by 28 (4 self)
This paper considers the problem of learning an unknown first-order Horn sentence $H_*$ from examples of Horn clauses that $H_*$ implies and does not imply. Particularly, we deal with a subclass of first-order Horn sentences ACH(k), called acyclic constrained Horn programs of constant arity k. ACH(k) allow recursions, disjunctive definitions, and the use of function symbols. We present an algorithm that exactly identifies every target Horn program $H_*$ in ACH(k) in polynomial time in p, m and n using $O(pmn^{k+1})$ entailment equivalence queries and $O(pm^2 n^{2k+1})$ request-for-hint queries, where p is the number of predicates, m is the number of clauses contained in $H_*$ and n is the size of the longest counterexample. This algorithm combines saturation and least general generalization operators to invert resolution steps. Then, we show that request-for-hint queries can be replaced by entailment membership queries for a proper subclass of ACH(k). Using this method, we have a polynomi...
Nonstandard Concepts of Similarity in Case-Based Reasoning
Information Systems and Data Analysis: Prospects - Foundations - Applications, Proceedings of the 17th Annual Conference of the GfKl, Univ. of Kaiserslautern, 1993, Studies in Classification, Data Analysis, and Knowledge Organization, 1994
Cited by 24 (6 self)
Introduction. The present paper is aimed at propagating new concepts of similarity more flexible and expressive than those underlying most case-based reasoning approaches today. So, it mainly deals with criticizing approaches in use, with motivating and introducing new notions and notations, and with first steps towards future applications. The investigations at hand originate from the author's work in learning theory. In exploring the relationship between inductive learning and case-based learning within a quite formal setting (cf. [Jan92b]), it turned out that both areas almost coincide, if sufficiently flexible similarity concepts are taken into account. This provides some formal arguments for the necessity of non-symmetric similarity measures. Encouraged by these first results, the author tried to investigate more structured learning problems from the view point of case-based reasoning. It turned out that an appropriate handling requires formalisms allowing similarity concep
Finding Minimal Generalizations for Unions of Pattern Languages and Its Application to Inductive Inference from Positive Data.
In Proc. the 11th STACS, LNCS 775, 1994
Cited by 23 (12 self)
A pattern is a string of constant symbols and variables. The language defined by a pattern p is the set of constant strings obtained from p by substituting nonempty constant strings for variables in p. In this paper we are concerned with polynomial time inference from positive data of the class of unions of a bounded number of pattern languages. We introduce a syntactic notion of minimal multiple generalizations (mmg for short) to study the inferability of classes of unions. If a pattern p is obtained from another pattern q by substituting nonempty patterns for variables in q, q is said to be more general than p. A set of patterns defines a union of their languages. A set Q of patterns is said to be more general than a set P of patterns if for any pattern p in P there exists a pattern q in Q that is more general than p. Clearly, a more general set of patterns defines a larger union. A k-minimal multiple generalization (k-mmg) of a set S of strings is a minimally general set of at most k pattern...
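The definition of a pattern language given in this abstract can be checked directly by a small backtracking search. The sketch below is an illustration only, not the paper's algorithm. Its encoding is an assumption: a pattern is a Python string in which uppercase letters are variables and every other character is a constant symbol, and the function name `matches` is hypothetical. It tests whether a constant string belongs to the language of a pattern, binding each variable to one nonempty string and reusing that binding at every occurrence.

```python
def matches(pattern: str, word: str) -> bool:
    """Return True if word is in the language of pattern.

    Assumed encoding: uppercase letters are variables, all other
    characters are constants; word contains constants only.
    """

    def go(i: int, j: int, binding: dict) -> bool:
        if i == len(pattern):
            return j == len(word)
        sym = pattern[i]
        if not sym.isupper():
            # Constant symbol: must match the next character of word.
            return j < len(word) and word[j] == sym and go(i + 1, j + 1, binding)
        if sym in binding:
            # Variable seen before: every occurrence takes the same value.
            value = binding[sym]
            return word.startswith(value, j) and go(i + 1, j + len(value), binding)
        # New variable: try every nonempty substring starting at j.
        for end in range(j + 1, len(word) + 1):
            binding[sym] = word[j:end]
            if go(i + 1, end, binding):
                return True
            del binding[sym]
        return False

    return go(0, 0, {})
```

For instance, `matches("aXbX", "accbcc")` holds with X bound to `cc`, while `matches("aXbX", "ab")` fails because substitutions must be nonempty. The search is exponential in the worst case, which is acceptable for a demonstration but not for the polynomial-time inference the abstract is concerned with.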
On Learning Visual Concepts and DNF Formulae
1993
Cited by 23 (5 self)
We consider the problem of learning DNF formulae in the mistake-bound and the PAC models. We develop a new approach, which is called polynomial explainability, that is shown to be useful for learning some new subclasses of DNF (and CNF) formulae that were not known to be learnable before. Unlike previous learnability results for DNF (and CNF) formulae, these subclasses are not limited in the number of terms or in the number of variables per term; yet, they contain the subclasses of k-DNF and k-term-DNF (and the corresponding classes of CNF) as special cases. We apply our DNF results to the problem of learning visual concepts and obtain learning algorithms for several natural subclasses of visual concepts that appear to have no natural Boolean counterpart. On the other hand, we show that learning some other natural subclasses of visual concepts is as hard as learning the class of all DNF formulae. We also consider the robustness of these results under various types of noise.