On Exact Learning of Unordered Tree Patterns
 Machine Learning
, 2000
Cited by 10 (0 self)
Tree patterns are natural candidates for representing rules and hypotheses in many tasks such as information extraction and symbolic mathematics. A tree pattern is a tree with labeled nodes where some of the leaves may be labeled with variables, whereas a tree instance has no variables. A tree pattern matches an instance if there is a consistent substitution for the variables that allows a mapping of subtrees to matching subtrees of the instance. A finite union of tree patterns is called a forest. In this paper, we study the learnability of tree patterns from queries when the subtrees are unordered. The learnability is determined by the semantics of matching as defined by the types of mappings from the pattern subtrees to the instance subtrees. We first show that unordered tree patterns and forests are not exactly learnable from equivalence and subset queries when the mapping between subtrees is one-to-one onto, regardless of the computational power of the learner. Tree and forest pa...
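The matching semantics described in this abstract can be illustrated with a minimal Python sketch of unordered tree pattern matching under the one-to-one onto mapping. This is a brute-force illustration, not the algorithm studied in the paper. The tuple encoding of trees, the `?` prefix marking variables, and the names `canon`, `is_var`, `matches`, and `tree_matches` are all assumptions made for this example.

```python
from itertools import permutations

# Assumed encoding: a tree is (label, children), where children is a tuple of
# trees. A variable is a leaf whose label starts with "?" (hypothetical choice).

def canon(tree):
    """Return a canonical form so unordered trees can be compared with ==."""
    label, children = tree
    return (label, tuple(sorted(canon(c) for c in children)))

def is_var(node):
    label, children = node
    return label.startswith("?") and not children

def matches(pattern, instance, bindings):
    """Yield every consistent extension of `bindings` that matches the trees."""
    if is_var(pattern):
        name, value = pattern[0], canon(instance)
        if name not in bindings:
            yield {**bindings, name: value}
        elif bindings[name] == value:   # repeated variable: same subtree needed
            yield bindings
        return
    p_label, p_children = pattern
    i_label, i_children = instance
    # One-to-one onto: labels agree and the children are paired up exactly.
    if p_label != i_label or len(p_children) != len(i_children):
        return
    for perm in permutations(i_children):   # try every pairing (exponential)
        yield from _match_children(p_children, perm, bindings)

def _match_children(p_children, i_children, bindings):
    """Match children pairwise, backtracking over alternative bindings."""
    if not p_children:
        yield bindings
        return
    for extended in matches(p_children[0], i_children[0], bindings):
        yield from _match_children(p_children[1:], i_children[1:], extended)

def tree_matches(pattern, instance):
    """True if some consistent substitution makes the pattern match."""
    return any(True for _ in matches(pattern, instance, {}))
```

For example, with `a = ("a", ())`, the pattern `("f", (("?x", ()), ("g", (("?x", ()),))))` matches the instance `("f", (("g", (a,)), a))` because the children may be paired in either order and both occurrences of `?x` bind to `a`. The permutation search is exponential in the number of children, so this sketch is only suitable for small trees.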
Polynomial time algorithms for finding unordered tree patterns with internal variables
 Computer Vision Research on Visual-Gestural Language Data. Behavior Research Methods, Instruments, and Computers 33:3
, 2001
Cited by 8 (6 self)
Many documents such as Web documents or XML files have tree structures. A term tree is an unordered tree pattern consisting of internal variables and tree structures. In order to extract meaningful and hidden knowledge from such tree-structured documents, we consider a minimal language (MINL) problem for term trees. The MINL problem for term trees is to find a term tree t such that the language generated by t is minimal among the languages generated by term trees that contain all given tree-structured data. First, we show that the MINL problem for regular term trees is computable in polynomial time if the number of edge labels is infinite. Next, we show that the MINL problems with optimizing the size of an output term tree are NP-complete. Finally, in order to show that our polynomial time algorithm for the MINL problem can be applied to data mining from real-world Web documents, we show that regular term tree languages are polynomial time inductively inferable from positive data if the number of edge labels is infinite.
Residual Finite Tree Automata
 In Proceedings of the Seventh International Conference on Developments in Language Theory (DLT'03), number 2710 in Lecture Notes in Computer Science
, 2003
Cited by 6 (0 self)
Tree automata based algorithms are essential in many fields of computer science such as verification, specification, and program analysis. They have also become essential for databases with...
Mining Probabilistic Tree Patterns in a Medical Database
, 2002
Cited by 1 (1 self)
We propose a contribution to the PKDD 2002 discovery challenge on the hepatitis dataset. This challenge aims at discovering regularities over patients struck down by chronic hepatitis. Our approach addresses the problem of multi-relational data mining, extracting probabilistic tree patterns from a database using Grammatical Inference techniques.
Probabilistic Approach for Reduction of Irrelevant Tree-structured Data
Cited by 1 (1 self)
This article aims at pruning noisy or irrelevant subtrees in a set of trees. The originality of this approach, in comparison with classic techniques in prototype selection, is that it deletes not the whole tree but only some of its subtrees. Our method is based on the computation of confidence intervals on a set of subtrees according to a probability distribution. We propose an approach to assess these intervals on this specific type of data and show experimentally its value in the context of learning from noisy data.
Patterns
 EATCS Bulletin
, 2003
Cited by 1 (0 self)
We review topics on formal language aspects of patterns. The main results on the equivalence and inclusion problems are presented. We discuss open problems, in particular, concerning pattern language decision problems and ambiguity in patterns.
Learning a Subclass of Regular Patterns in Polynomial Time
, 2007
An algorithm for learning a subclass of erasing regular pattern languages is presented. On extended regular pattern languages generated by patterns $\pi$ of the form $x_0 \alpha_1 x_1 \dots \alpha_m x_m$, where $x_0, \dots, x_m$ are variables and $\alpha_1, \dots, \alpha_m$ are strings of terminals of length $c$ each, it runs with arbitrarily high probability of success using a number of examples polynomial in $m$ (and exponential in $c$). It is assumed that $m$ is unknown, but $c$ is known, and that samples are randomly drawn according to some distribution, for which we only require that it has certain natural and plausible properties. Aiming to improve this algorithm further, we also explore computer simulations of a heuristic.
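As a small illustration of the pattern class in this abstract, the Python sketch below tests whether a word belongs to the erasing language of a regular pattern that starts and ends with a variable and in which every variable occurs once. It covers membership only and is not the learning algorithm from the paper. The function name `in_erasing_regular_language` and the representation of the pattern as a list of its terminal blocks are assumptions made for this example.

```python
def in_erasing_regular_language(word, constants):
    """Membership test for the pattern x0 a1 x1 ... am xm (erasing case).

    `constants` is the list [a1, ..., am] of terminal strings. Each variable
    occurs once and may be replaced by any string, including the empty one,
    so the word belongs to the language exactly when the terminal blocks
    occur in order as non-overlapping substrings.
    """
    pos = 0
    for alpha in constants:
        idx = word.find(alpha, pos)   # leftmost occurrence at or after pos
        if idx == -1:
            return False
        pos = idx + len(alpha)        # the next block must start after this one
    return True
```

Taking the leftmost occurrence of each block is safe here because an earlier placement never rules out a later block. With `constants = ["ab", "cd"]`, the words `"xabycdz"` and `"abcd"` are accepted, while `"cdab"` is rejected because the blocks appear in the wrong order.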
Multi-Relational Data Mining in Medical Databases
This paper presents the application of a method for mining data in a multi-relational database that contains some information about patients struck down by chronic hepatitis. Our approach may be used on any kind of multi-relational database and aims at extracting probabilistic tree patterns from a database using Grammatical Inference techniques. We propose to use a representation of the database by trees in order to extract these patterns. Trees provide a natural way to represent structured information taking into account the statistical distribution of the data. In this work we try to show how they can be useful for interpreting knowledge in the medical domain.
Advances in Learning Formal Languages
We present an overview of the advances related to the learning of formal languages, i.e., developments in grammatical inference research. The problem of learning correct grammars for unknown languages is known as grammatical inference. It is considered a main subject of inductive inference, and grammars are important representations to be investigated in machine learning from both theoretical and practical points of view. The application area of grammatical inference is growing day by day, yet it is still necessary to find a task where grammatical inference models have done much better than other machine learning or pattern recognition programs. However, research in this area is known to involve computationally hard problems. This paper mainly explores the area, its applications, various learning paradigms, the case of context-free grammars, challenges, and recent trends, and cites the important literature on these. Index Terms: machine learning, grammatical inference, learning model, formal language, context-free grammars.
Learning Languages in a Union
In inductive inference, a machine is given words of a language (a recursively enumerable set in our setting) and the machine is said to identify the language if it correctly names the language. In this paper we study identifiability of classes of languages where the unions of up to a fixed number (say n) of languages from the class are provided as input. We distinguish between two different scenarios: in one scenario, the learner need only name the language which results from the union; in the other, the learner must individually name the languages which make up the union (we say that the unioned language is discerningly identified). We define three kinds of identification criteria based on this and, by the use of some classes of disjoint languages, demonstrate that the inferring power of each of these identification criteria decreases as we increase the number of languages allowed in the union, thus resulting in an infinite hierarchy for each identification criterion. That is, we show that for each n, there exists a class of disjoint languages where all unions of up to n languages from this class can be discerningly identified, but there is no learner which identifies every union of n+1 languages from this class. A comparison between the different identification criteria also yields similar hierarchies. We give sufficient conditions for classes of languages where the unions can be discerningly identified, and characterize such discerning learnability for the indexed families. We then give naturally occurring classes of languages that witness some of the earlier hierarchical results. Finally, we present language classes which are complete with respect to weak reduction (in terms of intrinsic complexity) for our identification criteria.