A Unifying Approach to HTML Wrapper Representation and Learning
2000
Cited by 20 (9 self)
The number, the size, and the dynamics of Internet information sources bear abundant evidence of the need for automation in information extraction. This calls for representation formalisms that match the World Wide Web reality and for learning approaches and learnability results that apply to these formalisms. The concept of elementary formal systems is appropriately generalized to allow for the representation of wrapper classes which are relevant to the description of Internet sources in HTML format. Related learning results prove that those wrappers are automatically learnable from examples. This sets the stage for information extraction from the Internet by exploitation of inductive learning techniques.
1 Motivation
Today's online access to millions or even billions of documents in the World Wide Web is a great challenge to research areas related to knowledge discovery and information extraction (IE). The general task of IE is to locate specific pieces of text i...
Ordinal Mind Change Complexity of Language Identification
Cited by 18 (6 self)
The approach of ordinal mind change complexity, introduced by Freivalds and Smith, uses (notations for) constructive ordinals to bound the number of mind changes made by a learning machine. This approach provides a measure of the extent to which a learning machine has to keep revising its estimate of the number of mind changes it will make before converging to a correct hypothesis for languages in the class being learned. Recently, this notion, which also yields a measure for the difficulty of learning a class of languages, has been used to analyze the learnability of rich concept classes. The present paper further investigates the utility of ordinal mind change complexity. It is shown that for identification from both positive and negative data and n ≥ 1, the ordinal mind change complexity of the class of languages formed by unions of up to n + 1 pattern languages is only ω ×_O notn(n) (where notn(n) is a notation for n, ω is a notation for the least limit ordinal, and ×_O represents ordinal multiplication). This result nicely extends an observation of Lange and Zeugmann
Elementary formal systems, intrinsic complexity, and procrastination
Information and Computation, 1997
Cited by 14 (6 self)
Recently, rich subclasses of elementary formal systems (EFS) have been shown to be identifiable in the limit from only positive data. Examples of these classes are Angluin’s pattern languages, unions of pattern languages by Wright and Shinohara, and classes of languages definable by length-bounded elementary formal systems studied by Shinohara. The present paper employs two distinct bodies of abstract studies in the inductive inference literature to analyze the learnability of these concrete classes. The first approach, introduced by Freivalds and Smith, uses constructive ordinals to bound the number of mind changes. ω denotes the first limit ordinal. An ordinal mind change bound of ω means that identification can be carried out by a learner that, after examining some element(s) of the language, announces an upper bound on the number of mind changes it will make before converging; a bound of ω · 2 means that the learner reserves the right to revise this upper bound once; a bound of ω · 3 means the learner reserves the right to revise this upper bound twice, and so on. A bound of ω^2 means that identification can be carried out by a learner that announces an upper bound on the number of times it may revise its conjectured upper bound on the number of mind changes. It is shown in the present paper that the ordinal mind change complexity for identification of languages formed by unions of up to n pattern languages is ω^n. It is
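The bookkeeping behind bounds such as ω and ω · 2 can be illustrated with a small counter. The following is only a sketch under stated assumptions: the names Ordinal and MindChangeCounter are hypothetical and do not come from the paper, and only ordinals below ω^2 (written ω · limit + finite) are modelled. Every update must strictly decrease the counter, which is what forces the learner to converge after finitely many mind changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Ordinal:
    """The ordinal omega * limit + finite, which lies below omega^2.

    Field order matters: dataclass ordering compares (limit, finite)
    lexicographically, which matches the ordering of these ordinals.
    """
    limit: int
    finite: int


class MindChangeCounter:
    """Hypothetical helper that tracks a learner's ordinal counter."""

    def __init__(self, start: Ordinal) -> None:
        self.value = start

    def move_to(self, new: Ordinal) -> None:
        # Every update must strictly decrease the counter; since ordinals
        # are well-ordered, only finitely many updates are possible.
        if not new < self.value:
            raise ValueError(f"{new} is not below {self.value}")
        self.value = new

    def mind_change(self) -> None:
        # A mind change spends one unit of the currently announced bound.
        if self.value.finite == 0:
            raise ValueError("announce a (new) finite bound first")
        self.move_to(Ordinal(self.value.limit, self.value.finite - 1))

    def announce(self, bound: int) -> None:
        # Announcing or revising the finite bound spends one copy of omega.
        if self.value.limit == 0:
            raise ValueError("no revisions left")
        self.move_to(Ordinal(self.value.limit - 1, bound))


counter = MindChangeCounter(Ordinal(2, 0))  # start at omega * 2
counter.announce(3)      # first announced bound: at most 3 mind changes
counter.mind_change()
counter.announce(5)      # the single revision that omega * 2 permits
counter.mind_change()
print(counter.value)     # Ordinal(limit=0, finite=4)
```

Starting the counter at Ordinal(1, 0) instead models a bound of ω: one announcement, no revision.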
On Learning Unions of Pattern Languages and Tree Patterns
1999
Cited by 11 (0 self)
We present efficient online algorithms for learning unions of a constant number of tree patterns, unions of a constant number of one-variable pattern languages, and unions of a constant number of pattern languages with fixed length substitutions. By fixed length substitutions we mean that each occurrence of variable x_i must be substituted by terminal strings of fixed length l(x_i). We prove that if arbitrary unions of pattern languages with fixed length substitutions can be learned efficiently, then DNFs are efficiently learnable in the mistake bound model. Since we use a reduction to Winnow, our algorithms are robust against attribute noise. Furthermore, they can be modified to handle concept drift. Also, our approach is quite general and may be applicable to learning other pattern-related classes. For example, we could learn a more general pattern language class in which a penalty (i.e. weight) is assigned to each violation of the rule that a terminal symbol cannot be changed ...
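To make the notion of fixed length substitutions concrete, here is a minimal membership test, not the paper's learning algorithm. It assumes a hypothetical encoding: a pattern is a list of tokens, a token is a variable exactly when it is a key of the lengths dictionary, and lengths[x] plays the role of l(x).

```python
from typing import Dict, List


def matches_fixed_length(pattern: List[str], lengths: Dict[str, int], s: str) -> bool:
    """Return True if s is generated by the pattern when every variable x
    is replaced by a string of exactly lengths[x] characters, with repeated
    occurrences of x receiving the same string."""
    bindings: Dict[str, str] = {}
    pos = 0
    for token in pattern:
        if token in lengths:
            # Variable: the substitution length is fixed, so the chunk
            # is determined by the current position alone.
            chunk = s[pos:pos + lengths[token]]
            if len(chunk) != lengths[token]:
                return False
            if bindings.setdefault(token, chunk) != chunk:
                return False
            pos += lengths[token]
        else:
            # Constant: must appear literally at the current position.
            if s[pos:pos + len(token)] != token:
                return False
            pos += len(token)
    return pos == len(s)


def in_union(patterns: List[List[str]], lengths: Dict[str, int], s: str) -> bool:
    """Membership in a union of such pattern languages."""
    return any(matches_fixed_length(p, lengths, s) for p in patterns)


print(matches_fixed_length(["a", "x", "b", "x"], {"x": 2}, "acdbcd"))  # True
print(matches_fixed_length(["a", "x", "b", "x"], {"x": 2}, "acdbce"))  # False
```

Because every substitution length is fixed, each pattern generates strings of a single length, and membership reduces to one left-to-right scan.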
Learning Elementary Formal Systems with Queries
2000
Cited by 4 (3 self)
An elementary formal system (EFS, for short) is a kind of logic program which directly manipulates character strings. A number of studies have investigated the ability of EFS as a uniform framework for language learning in various learning models including model inference, inductive inference, and PAC-learning. In this paper, we investigate the polynomial time learnability of EFS from the view of active learning allowing membership queries. Positive results include the polynomial time learnability of the class of terminating HEFS of variable-occurrence k and arity r from equivalence queries and entailment membership queries with the information on termination. We also present a lower bound result showing that the algorithm is near optimal in the query complexity. Negative results include a series of representation-independent hardness results, which fill the gap between the learnable and the non-learnable subclasses of EFS in our knowledge. Particularly, we showed th...
Advanced Elementary Formal Systems
2001
Cited by 2 (1 self)
An elementary formal system (EFS) is a logic program such as a Prolog program, for instance, that directly manipulates strings. Arikawa and his coworkers proposed elementary formal systems as a unifying framework for formal language learning.
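As an illustration of a logic program that works directly on strings, the sketch below checks derivability in a very restricted kind of EFS: one unary predicate p, with clauses of the form p(w) ← or p(u x v) ← p(x). The clause encoding and the function name are hypothetical, and variables are assumed to stand for non-empty strings. The two clauses shown define the language of strings a^n b^n with n ≥ 1.

```python
from typing import List, Tuple

# A clause is (prefix, suffix, recursive):
#   recursive=False encodes the fact  p(prefix + suffix) <-
#   recursive=True  encodes the rule  p(prefix x suffix) <- p(x)
Clause = Tuple[str, str, bool]

ANBN: List[Clause] = [
    ("ab", "", False),  # p(ab) <-
    ("a", "b", True),   # p(a x b) <- p(x)
]


def derivable(s: str, clauses: List[Clause]) -> bool:
    """Return True if p(s) can be derived from the given clauses."""
    for prefix, suffix, recursive in clauses:
        if not recursive:
            if s == prefix + suffix:
                return True
            continue
        if not prefix and not suffix:
            # p(x) <- p(x) derives nothing new; skipping it also
            # guarantees that the recursion below terminates.
            continue
        inner_len = len(s) - len(prefix) - len(suffix)
        if inner_len < 1:
            # Variables stand for non-empty strings (assumption).
            continue
        if s.startswith(prefix) and s.endswith(suffix):
            inner = s[len(prefix):len(prefix) + inner_len]
            if derivable(inner, clauses):
                return True
    return False


print(derivable("aaabbb", ANBN))  # True
print(derivable("aabbb", ANBN))   # False
```

General elementary formal systems allow several predicates, several arguments, and repeated variables; this sketch covers none of those.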
Editors
Machine Learning: An Artificial Intelligence Approach, 1983
Cited by 1 (0 self)
Polynomial Time Algorithm Solving the Refutation Tree Problem for Formal Graph Systems
1993
The refutation tree problem is to compute a refutation tree which is associated with the structure of a graph generated by a formal graph system (FGS). We present subclasses of FGSs, called simple FGSs, size-bounded simple FGSs and bounded simple FGSs. In order to show that the refutation tree problem for simple FGSs can be solved in polynomial time, we prove that the refutation tree problem for size-bounded simple FGSs is NC^2-reducible to the graph isomorphism problem. For bounded simple FGSs, we show that the refutation tree problem is in NC^2. We present an FGS 0 GI and show that the graph isomorphism problem is log-space reducible to the graph isomorphism problem for 0 GI.
1 Introduction
A formal graph system (FGS) [21] is a kind of logic program which deals with graphs just like terms [10]. By regarding terms as trees, conventional logic programs can be directly simulated by FGSs. Moreover, an elementary formal system [2, 3, 18], that is, a logic program on strings, is als...
Extending Elementary Formal Systems
An elementary formal system (EFS) is a logic program such as a Prolog program, for instance, that directly manipulates strings.
Learning elementary formal systems with queries
The elementary formal system (EFS) is a kind of logic program which directly manipulates strings, and the learnability of the subclass called hereditary EFSs (HEFSs) has been investigated in the frameworks of the PAC-learning, query-learning, and inductive inference models. The hierarchy of HEFS is expressed by HEFS(m, k, t, r), where m, k, t, and r denote the number of clauses, the occurrences of variables in the head, the number of atoms in the body, and the arity of predicate symbols. The present paper deals with the learnability of HEFS in the query learning model using equivalence queries and additional queries such as membership, predicate membership, entailment membership, and dependency queries. We show that the class HEFS(∗, k, t, r) is polynomial-time learnable with the equivalence and predicate membership queries and the class HEFS(∗, k, ∗, r) with termination property is polynomial-time learnable with the equivalence, entailment membership, and dependency queries for the unbounded parameter ∗. A lower bound on the number of queries is presented. We also show that the class HEFS(∗, k, t, r) is hard to learn with the equivalence and membership queries under cryptographic assumptions. Furthermore, the learnability of the class of unions of regular pattern languages, which is a subclass of HEFSs, is investigated. The bounded unions of regular pattern languages are polynomial-time predictable with membership queries. However, unbounded unions of regular pattern languages are not polynomial-time predictable with membership queries if DNF formulas are not. © 2002 Published by Elsevier Science B.V.
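For readers unfamiliar with regular pattern languages, the following sketch tests membership in a union of them. It relies on the standard definition that a pattern is regular when each variable occurs at most once, and it assumes non-erasing substitutions. The token-list encoding and the function names are hypothetical; the approach (translation to a regular expression) illustrates membership only, not the query-learning algorithms of the paper.

```python
import re
from typing import List, Set


def compile_regular_pattern(tokens: List[str], variables: Set[str]) -> re.Pattern:
    """Translate a regular pattern into a compiled regular expression.

    A token is a variable exactly when it is in `variables`; every
    other token is a constant string.
    """
    seen: Set[str] = set()
    parts: List[str] = []
    for token in tokens:
        if token in variables:
            if token in seen:
                raise ValueError(f"variable {token!r} repeats: not a regular pattern")
            seen.add(token)
            parts.append(".+")  # any non-empty string (non-erasing assumption)
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts), re.DOTALL)


def in_union_of_regular_patterns(s: str, compiled: List[re.Pattern]) -> bool:
    """Membership in the union: s must fully match at least one pattern."""
    return any(p.fullmatch(s) is not None for p in compiled)


union = [
    compile_regular_pattern(["a", "x", "b"], {"x"}),        # a x b
    compile_regular_pattern(["x", "cc", "y"], {"x", "y"}),  # x cc y
]
print(in_union_of_regular_patterns("aqqb", union))  # True
print(in_union_of_regular_patterns("ab", union))    # False
```

Because no variable repeats, a regular pattern never needs to compare two substrings, which is why a plain regular expression suffices here.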