Results 1-10 of 32
Hidden Markov Model Induction by Bayesian Model Merging
Advances in Neural Information Processing Systems 5, 1993
Cited by 135 (2 self)
This paper describes a technique for learning both the number of states and the topology of Hidden Markov Models from examples. The induction process starts with the most specific model consistent with the training data and generalizes by successively merging states. Both the choice of states to merge and the stopping criterion are guided by the Bayesian posterior probability. We compare our algorithm with the Baum-Welch method of estimating fixed-size models, and find that it can induce minimal HMMs from data in cases where fixed estimation does not converge or requires redundant parameters to converge.
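As a rough illustration of the merge-then-score loop this abstract describes, the sketch below builds the most specific model (one chain of symbol-emitting states per example) and greedily merges like-labeled states while a crude description-length score improves. The score is only a stand-in for the Bayesian posterior used in the paper, and all names are ours, so treat this as a sketch of the control flow, not the authors' algorithm.

```python
def build_initial_model(examples):
    """Most specific model: one linear chain of states per example."""
    emit = {}                        # state -> emitted symbol
    trans = set()                    # (state, state) edges
    starts, finals = set(), set()
    sid = 0
    for word in examples:
        prev = None
        for ch in word:
            emit[sid] = ch
            if prev is None:
                starts.add(sid)
            else:
                trans.add((prev, sid))
            prev = sid
            sid += 1
        if prev is not None:
            finals.add(prev)
    return emit, trans, starts, finals

def merge(model, a, b):
    """Merge state b into state a (both must emit the same symbol)."""
    emit, trans, starts, finals = model
    emit = {s: c for s, c in emit.items() if s != b}
    ren = lambda s: a if s == b else s
    trans = {(ren(p), ren(q)) for (p, q) in trans}
    return emit, trans, {ren(s) for s in starts}, {ren(s) for s in finals}

def cost(model):
    """Crude description length: stand-in for -log posterior."""
    emit, trans, _, _ = model
    return len(emit) + len(trans)

def induce(examples):
    """Greedily merge like-labeled states while the score improves."""
    model = build_initial_model(examples)
    while True:
        emit = model[0]
        best, best_cost = None, cost(model)
        states = sorted(emit)
        for i, a in enumerate(states):
            for b in states[i + 1:]:
                if emit[a] != emit[b]:
                    continue         # only merge like-labeled states
                cand = merge(model, a, b)
                if cost(cand) < best_cost:
                    best, best_cost = cand, cost(cand)
        if best is None:
            return model             # no merge improves the score: stop
        model = best
```

On the sample ["ab", "abab"], the six-state initial model collapses to a two-state machine with a cycle between the a-state and the b-state, which is the kind of minimal generalization the paper aims for.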
Inductive Inference, DFAs and Computational Complexity
2nd Int. Workshop on Analogical and Inductive Inference (AII), 1989
Cited by 78 (1 self)
This paper surveys recent results concerning the inference of deterministic finite automata (DFAs). The results discussed determine the extent to which DFAs can be feasibly inferred, and highlight a number of interesting approaches in computational learning theory.
Types of Monotonic Language Learning and Their Characterization
In Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, July 27-29, Pittsburgh, 1992
Cited by 32 (26 self)
The present paper deals with strong-monotonic, monotonic, and weak-monotonic language learning from positive data as well as from positive and negative examples. The three notions of monotonicity reflect different formalizations of the requirement that the learner always has to produce better and better generalizations when fed more and more data on the concept to be learnt. We characterize strong-monotonic, monotonic, weak-monotonic, and finite language learning from positive data in terms of recursively generable finite sets, thereby solving a problem of Angluin (1980). Moreover, we study monotonic inference with iteratively working learning devices, which are of special interest in applications. In particular, it is proved that strong-monotonic inference can be performed with iteratively learning devices without limiting the inference capabilities, while monotonic and weak-monotonic inference cannot.
Automatic generation of subword units for speech recognition systems
 IEEE Transactions on Speech and Audio Processing
Cited by 32 (3 self)
Abstract—Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of the subword units and the accuracy of the dictionary. In current LVCSR systems, both these components are manually designed. While manually designed subword units generalize well, they may not be the optimal units of classification for the specific task or environment for which an LVCSR system is trained. Moreover, when human expertise is not available, it may not be possible to design good subword units manually. There is clearly a need for data-driven design of these LVCSR components. In this paper, we present a complete probabilistic formulation for the automatic design of subword units and dictionary, given only the acoustic data and their transcriptions. The proposed framework permits easy incorporation of external sources of information, such as the spellings of words in terms of a non-ideographic script. Index Terms—Learning, lexical representation, maximum-likelihood, speech recognition, subword units.
Passively Learning Finite Automata
1996
Cited by 23 (0 self)
We provide a survey of methods for inferring the structure of a finite automaton from passive observation of its behavior. We consider both deterministic automata and probabilistic automata (similar to Hidden Markov Models). While it is computationally intractable to solve the general problem exactly, we will consider heuristic algorithms, and also special cases which are tractable. Most of the algorithms we consider are based on the idea of building a tree which encodes all of the examples we have seen, and then merging equivalent nodes to produce a (near) minimal automaton.
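The tree-then-merge idea mentioned here can be sketched in a few lines: build a prefix tree acceptor (PTA) over the positive sample and merge states whose residual suffix sets coincide. Merging only truly equivalent nodes yields the minimal DFA accepting exactly the sample; the heuristic learners surveyed merge more aggressively in order to generalize. The function names below are our own.

```python
def prefix_tree(samples):
    """PTA: one state per distinct prefix of the sample (incl. the root '')."""
    states = {""}
    for w in samples:
        for i in range(1, len(w) + 1):
            states.add(w[:i])
    return states

def residual(prefix, samples):
    """The suffixes that complete this prefix into a sample word."""
    return frozenset(w[len(prefix):] for w in samples if w.startswith(prefix))

def merge_equivalent(samples):
    """Group PTA states whose residual languages coincide: each group
    becomes one state of the merged (minimal, sample-exact) automaton."""
    classes = {}
    for p in prefix_tree(samples):
        classes.setdefault(residual(p, samples), []).append(p)
    return classes
```

For the sample ["ab", "aab", "aaab"], the PTA has seven states; merging equivalent nodes collapses the three accepting leaves (all with residual {""}) into one, leaving five states.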
Grammar Inference, Automata Induction, and Language Acquisition
Handbook of Natural Language Processing, 2000
Cited by 22 (3 self)
The natural language learning problem has attracted the attention of researchers for several decades. Computational and formal models of language acquisition have provided some preliminary, yet promising insights into how children learn the language of their community. Further, these formal models also provide an operational framework for the numerous practical applications of language learning. We will survey some of the key results in formal language learning. In particular, we will discuss the prominent computational approaches for learning different classes of formal languages and discuss how these fit in the broad context of natural language learning.
Ignoring Data May be the Only Way to Learn Efficiently
1994
Cited by 19 (13 self)
In designing learning algorithms it seems quite reasonable to construct them in such a way that all the data the algorithm has obtained so far are correctly and completely reflected in the hypothesis the algorithm outputs on these data. However, this approach may totally fail: it may lead to the unsolvability of the learning problem, or it may exclude any efficient solution of it. In particular, we present a natural learning problem and prove that it can be solved in polynomial time if and only if the algorithm is allowed to ignore data.
Getting order independence in incremental learning
Proc. European Conference on Machine Learning 1993 (P.B. Brazdil, Ed.), Lecture Notes in Artificial Intelligence 667, 1993
Cited by 14 (0 self)
It is empirically known that most incremental learning systems are order dependent, i.e. they provide results that depend on the particular order of data presentation. This paper aims at uncovering the reasons behind this, and at specifying the conditions that would guarantee order independence. It is shown that both an optimality and a storage criterion are sufficient for ensuring order independence. Given that these correspond to very strong requirements, however, it is interesting to study necessary, and hopefully less stringent, conditions. The results obtained prove that these necessary conditions are equally difficult to meet in practice. Besides its main outcome, this paper provides an interesting method to transform a history-dependent bias into a history-independent one.
An Incremental Interactive Algorithm for Regular Grammar Inference
Proceedings of the Third ICGI-96, 1996
Cited by 14 (6 self)
We present provably correct interactive algorithms for learning regular grammars from positive examples and membership queries. A structurally complete set of strings from a language L(G) corresponding to a target regular grammar G implicitly specifies a lattice of finite state automata (FSA) which contains an FSA M_G corresponding to G. The lattice is compactly represented as a version space, and M_G is identified by searching the version space using membership queries. We explore the problem of regular grammar inference in a setting where positive examples are provided intermittently. We provide an incremental version of the algorithm along with a set of sufficient conditions for its convergence.
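A toy version of this query-driven lattice search might look as follows: starting from the PTA of a structurally complete positive sample, each candidate state merge is kept only if every string the merged machine accepts, up to a small length bound, passes a membership query. The bounded-length consistency check and all identifiers are our own simplifications of the version-space search described above.

```python
from itertools import product

def learn(samples, oracle, alphabet, max_len=6):
    """Greedy merge search over the lattice, pruned by membership queries."""
    # PTA states are the prefixes of the positive sample; a state is
    # accepting iff its prefix is a full sample word.
    states = sorted({w[:i] for w in samples for i in range(len(w) + 1)})
    accepting = set(samples)
    rep = {s: s for s in states}          # partition: state -> class rep

    def accepts(w):
        """Nondeterministic acceptance in the quotient automaton."""
        current = {rep[""]}
        for ch in w:
            current = {rep[s + ch] for s in states
                       if rep[s] in current and (s + ch) in rep}
        return any(rep[a] in current for a in accepting)

    def consistent():
        # Every string the merged machine accepts (up to max_len)
        # must pass a membership query against the target.
        return all(not accepts("".join(t)) or oracle("".join(t))
                   for n in range(max_len + 1)
                   for t in product(alphabet, repeat=n))

    for i, a in enumerate(states):
        for b in states[i + 1:]:
            ra, rb = rep[a], rep[b]
            if ra == rb:
                continue
            old = dict(rep)
            for s in states:              # tentatively merge b's class into a's
                if rep[s] == rb:
                    rep[s] = ra
            if not consistent():
                rep.clear()
                rep.update(old)           # oracle rejected the merge: undo
    return rep
```

With the sample ["aa"] and an oracle for the even-length strings over {a}, the search merges the root with the "aa" state but keeps "a" separate, recovering the two-state minimal machine.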