Results 1–10 of 13
Learning String Edit Distance
, 1997
Abstract

Cited by 198 (2 self)
In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one string into the other. In this report, we provide a stochastic model for string edit distance. Our stochastic model allows us to learn a string edit distance function from a corpus of examples. We illustrate the utility of our approach by applying it to the difficult problem of learning the pronunciation of words in conversational speech. In this application, we learn a string edit distance with nearly one fifth the error rate of the untrained Levenshtein distance. Our approach is applicable to any string classification problem that may be solved using a similarity function against a database of labeled prototypes.
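The untrained Levenshtein distance used as the baseline above is the classic dynamic program over insertions, deletions, and substitutions; a minimal sketch:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to transform string a into string b (classic DP)."""
    m, n = len(a), len(b)
    # d[i][j] = edit distance between the prefixes a[:i] and b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i              # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j              # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[m][n]
```

The learned variants described in this listing replace the unit costs above with probabilistic edit weights estimated from data.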
The Design Principles of a Weighted Finite-State Transducer Library
 THEORETICAL COMPUTER SCIENCE
, 2000
Abstract

Cited by 99 (22 self)
We describe the algorithmic and software design principles of an object-oriented library for weighted finite-state transducers. By taking advantage of the theory of rational power series, we were able to achieve high degrees of generality, modularity and irredundancy, while attaining competitive efficiency in demanding speech processing applications involving weighted automata of more than 10^7 states and transitions. Besides its mathematical foundation, the design also draws from important ideas in algorithm design and programming languages: dynamic programming and shortest-paths algorithms over general semirings, object-oriented programming, lazy evaluation and memoization.
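The shortest-paths-over-semirings idea mentioned above can be illustrated in miniature: over the tropical semiring (min, +), the generic shortest-distance computation reduces to Dijkstra's algorithm. A small sketch, where the arc representation is a simplification for illustration, not the library's API:

```python
import heapq

def shortest_distance(num_states, arcs, start, finals):
    """Single-source shortest distance over the tropical semiring
    (min, +) via Dijkstra's algorithm.  arcs is a list of
    (src, dst, weight) triples with non-negative weights."""
    adj = {s: [] for s in range(num_states)}
    for src, dst, w in arcs:
        adj[src].append((dst, w))
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, s = heapq.heappop(heap)
        if d > dist.get(s, float("inf")):
            continue                 # stale heap entry
        for t, w in adj[s]:
            nd = d + w               # semiring "times" is +
            if nd < dist.get(t, float("inf")):   # semiring "plus" is min
                dist[t] = nd
                heapq.heappush(heap, (nd, t))
    # combine over final states, again with the semiring "plus" (min)
    return min(dist.get(f, float("inf")) for f in finals)
```

Swapping in a different semiring (e.g. log or probability) changes the "plus" and "times" operations but leaves the overall algorithm structure intact, which is the generality the abstract refers to.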
A Rational Design for a Weighted Finite-State Transducer Library
 LECTURE NOTES IN COMPUTER SCIENCE
, 1998
A conditional random field for discriminatively-trained finite-state string edit distance
 In Conference on Uncertainty in AI (UAI)
, 2005
Abstract

Cited by 51 (7 self)
The need to measure sequence similarity arises in information extraction, object identity, data mining, biological sequence analysis, and other domains. This paper presents discriminative string-edit CRFs, a finite-state conditional random field model for edit sequences between strings. Conditional random fields have advantages over generative approaches to this problem, such as pair HMMs or the work of Ristad and Yianilos, because as conditionally-trained methods, they enable the use of complex, arbitrary actions and features of the input strings. As in generative models, the training data does not have to specify the edit sequences between the given string pairs. Unlike generative models, however, our model is trained on both positive and negative instances of string pairs. We present positive experimental results on several data sets.
Learning Stochastic Edit Distance: application in handwritten character recognition
Abstract

Cited by 18 (7 self)
Many pattern recognition algorithms are based on nearest-neighbour search and use the well-known edit distance, for which the primitive edit costs are usually fixed in advance. In this article, we aim at learning an unbiased stochastic edit distance, in the form of a finite-state transducer, from a corpus of (input, output) pairs of strings. Contrary to other standard methods, which generally use the Expectation-Maximisation algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Such an unbiased approach requires optimising the parameters of a conditional transducer instead of a joint one. We apply our new model in the context of handwritten digit recognition. We show, through a large series of experiments, that it always outperforms the standard edit distance. Key words: Stochastic Edit Distance, Finite-State Transducers, Handwritten Character Recognition.
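The probability a memoryless stochastic edit transducer of this kind assigns to a string pair sums over all edit sequences, which a forward-style dynamic program computes. A sketch in log space, with illustrative parameter names rather than the paper's notation:

```python
import math

NEG_INF = float("-inf")

def logadd(a, b):
    """Stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def edit_log_prob(x, y, logp_sub, logp_ins, logp_del, logp_end):
    """Forward algorithm for a memoryless stochastic edit transducer:
    sums the probability of every edit sequence turning x into y.
    logp_sub[(a, b)], logp_ins[b], logp_del[a] are log probabilities
    of the edit operations; logp_end is the log stopping probability.
    Returns log p(x, y)."""
    m, n = len(x), len(y)
    f = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    f[0][0] = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0:       # delete x[i-1]
                f[i][j] = logadd(f[i][j], f[i - 1][j] + logp_del[x[i - 1]])
            if j > 0:       # insert y[j-1]
                f[i][j] = logadd(f[i][j], f[i][j - 1] + logp_ins[y[j - 1]])
            if i > 0 and j > 0:   # substitute / match
                f[i][j] = logadd(f[i][j],
                                 f[i - 1][j - 1] + logp_sub[(x[i - 1], y[j - 1])])
    return f[m][n] + logp_end
```

Training, whether joint (as in Ristad and Yianilos) or conditional (as here), re-estimates the operation probabilities from expected counts over such lattices; the unbiased variant described above normalizes them conditionally on the input string.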
Partially Supervised Learning of Morphology with Stochastic Transducers
Abstract

Cited by 11 (2 self)
In this paper I present an algorithm for the unsupervised learning of morphology using stochastic finite-state transducers, in particular Pair Hidden Markov Models. The task is viewed as an alignment problem between two sets of words. A supervised model of morphology acquisition is converted to an unsupervised model by treating the alignment as a further hidden variable. The use of the Expectation-Maximisation algorithm for this task is studied, which leads to calculations involving the permanent of a matrix of probabilities.
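The permanent mentioned above is defined like the determinant but with all signs positive, which makes exact computation #P-hard; Ryser's inclusion-exclusion formula is the standard exponential-time exact method. A small sketch:

```python
from itertools import combinations

def permanent(M):
    """Permanent of a square matrix via Ryser's inclusion-exclusion
    formula, O(2^n * n^2) time: sum over non-empty column subsets S of
    (-1)^(n - |S|) * prod_i sum_{j in S} M[i][j]."""
    n = len(M)
    total = 0.0
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            prod = 1.0
            for i in range(n):
                prod *= sum(M[i][j] for j in cols)
            total += (-1) ** (n - r) * prod
    return total
```

For a matrix of alignment probabilities as in the abstract, the permanent sums the probability of every perfect matching between the two word sets, which is why it appears in the EM calculations.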
Topics In Computational Hidden State Modeling
, 1997
Abstract

Cited by 3 (3 self)
Motivated by the goal of establishing stochastic and information theoretic foundations for the study of intelligence and synthesis of intelligent machines, this thesis probes several topics relating to hidden state stochastic models. Finite Growth Models (FGM) are introduced. These are nonnegative functionals that arise from parametrically-weighted directed acyclic graphs and a tuple observation that affects these weights. Using FGMs the parameters of a highly general form of stochastic transducer can be learned from examples, and the particular case of stochastic string edit distance is developed. Experiments are described that illustrate the application of learned string edit distance to the problem of recognizing a spoken word given a phonetic transcription of the acoustic signal. With FGMs one may direct learning by criteria beyond simple maximum-likelihood. The MAP (maximum a posteriori estimate) and MDL (minimum description length) are discussed along with the application to cau...
Learning unbiased stochastic edit distance in the form of a memoryless finite-state transducer
 International Joint Conference on Machine Learning (2005). Workshop: Grammatical Inference Applications: Successes and Future Challenges
Abstract

Cited by 1 (0 self)
We aim at learning an unbiased stochastic edit distance in the form of a finite-state transducer from a corpus of (input, output) pairs of strings. Contrary to other standard methods, which generally use the Expectation-Maximization algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Such an unbiased way to proceed requires optimizing the parameters of a conditional transducer instead of a joint one. This transducer can be very useful in many domains of pattern recognition and machine learning, such as noise management or DNA alignment. Several experiments carried out with our algorithm show that it is able to correctly assess theoretical target distributions.
Learning String Edit Distance
, 1996
Abstract
 Add to MetaCart
In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one string into the other. In this report, we provide a stochastic model for string edit distance. Our stochastic model allows us to learn the optimal string edit distance function from a corpus of examples. We illustrate the utility of our approach by applying it to the difficult problem of learning the pronunciation of words in conversational speech. In this application, we learn a string edit distance function with one third the error rate of the untrained Levenshtein distance.
A General Decomposition Theorem that Extends the Baum-Welch and Expectation-Maximization Paradigm to Rational Forms
, 2001
Abstract
 Add to MetaCart
We consider the problem of maximizing certain positive rational functions of a form that includes statistical constructs such as conditional mixture densities and conditional hidden Markov models. The well-known Baum-Welch and expectation-maximization (EM) algorithms do not apply to rational functions and are therefore limited to the simpler maximum-likelihood form of such models. Our main result is a general decomposition theorem that, like Baum-Welch/EM, breaks up each iteration of the maximization task into independent subproblems that are more easily solved, but applies to rational functions as well. It extends the central inequality of Baum-Welch/EM and the associated high-level algorithms to the rational case, and reduces to the standard inequality and algorithms for simpler problems. Keywords: Baum-Welch (forward-backward algorithm), Expectation Maximization (EM), hidden Markov models (HMM), conditional mixture density estimation, discriminative training, Maximum Mutual Information (MMI) criterion.
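For contrast with the rational-form setting above, the standard maximum-likelihood EM paradigm that the paper generalizes can be shown on a toy two-coin mixture: each trial's flips come from one of two coins of unknown bias, with the coin identity hidden. This is a hypothetical illustration, unrelated to the paper's experiments:

```python
def em_two_coins(draws, iters=50):
    """Maximum-likelihood EM for a mixture of two biased coins.
    draws: list of (heads, tails) counts, one pair per trial; the
    identity of the coin used in each trial is the hidden variable.
    Returns the estimated head-probabilities [theta_A, theta_B]."""
    theta = [0.6, 0.5]  # initial bias guesses (must differ to break symmetry)
    for _ in range(iters):
        # E-step: posterior responsibility of each coin for each trial
        counts = [[0.0, 0.0], [0.0, 0.0]]  # [coin][expected heads, tails]
        for h, t in draws:
            like = [th ** h * (1 - th) ** t for th in theta]
            z = like[0] + like[1]
            for c in (0, 1):
                w = like[c] / z
                counts[c][0] += w * h
                counts[c][1] += w * t
        # M-step: closed-form maximum-likelihood re-estimate per coin
        theta = [counts[c][0] / (counts[c][0] + counts[c][1])
                 for c in (0, 1)]
    return theta
```

Each iteration decomposes into independent per-component subproblems with closed-form updates; the decomposition theorem in the abstract extends exactly this structure from likelihoods to ratios of such expressions.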