Results 1-10 of 122
A tutorial on hidden Markov models and selected applications in speech recognition
Proceedings of the IEEE, 1989
Cited by 5308 (1 self)
Although initially introduced and studied in the late 1960s and early 1970s, statistical methods of Markov source or hidden Markov modeling have become increasingly popular in the last several years. There are two strong reasons why this has occurred. First the models are very rich in mathematical structure and hence can form the theoretical basis for use in a wide range of applications. Second the models, when applied properly, work very well in practice for several important applications. In this paper we attempt to carefully and methodically review the theoretical aspects of this type of statistical modeling and show how they have been applied to selected problems in machine recognition of speech.
A maximum likelihood approach to continuous speech recognition
IEEE Trans. Pattern Anal. Machine Intell., 1983
Cited by 448 (9 self)
Abstract: Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them. Index Terms: Markov models, maximum likelihood, parameter estimation, speech recognition, statistical models.
Hidden Markov processes
IEEE Trans. Inform. Theory, 2002
Cited by 228 (5 self)
Abstract: An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms: Baum–Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
Continuous Speech Recognition by Statistical Methods
Proceedings of the IEEE 64, 1976
Cited by 213 (1 self)
This paper describes statistical methods of automatic recognition (transcription) of continuous speech that have been used successfully by the Speech Processing Group at the IBM Thomas J. Watson Research Center. The ...
The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length
Machine Learning, 1996
Cited by 208 (17 self)
We propose and analyze a distribution learning algorithm for variable memory length Markov processes. These processes can be described by a subclass of probabilistic finite automata which we name Probabilistic Suffix Automata (PSA). Though hardness results are known for learning distributions generated by general probabilistic automata, we prove that the algorithm we present can efficiently learn distributions generated by PSAs. In particular, we show that for any target PSA, the KL-divergence between the distribution generated by the target and the distribution generated by the hypothesis the learning algorithm outputs can be made small with high confidence in polynomial time and sample complexity. The learning algorithm is motivated by applications in human-machine interaction. Here we present two applications of the algorithm. In the first one we apply the algorithm in order to construct a model of the English language, and use this model to correct corrupted text. In the second ...
Fast Decoding and Optimal Decoding for Machine Translation
In Proceedings of ACL 39, 2001
"... A good decoding algorithm is critical ..."
Part-of-Speech Tagging and Partial Parsing
Corpus-Based Methods in Language and Speech, 1996
Cited by 108 (0 self)
...m we can carve off next. `Partial parsing' is a cover term for a range of different techniques for recovering some but not all of the information contained in a traditional syntactic analysis. Partial parsing techniques, like tagging techniques, aim for reliability and robustness in the face of the vagaries of natural text, by sacrificing completeness of analysis and accepting a low but nonzero error rate. 1 Tagging The earliest taggers [35, 51] had large sets of hand-constructed rules for assigning tags on the basis of words' character patterns and on the basis of the tags assigned to preceding or following words, but they had only small lexica, primarily for exceptions to the rules. TAGGIT [35] was used to generate an initial tagging of the Brown corpus, which was then hand-edited. (Thus it provided the data that has since been used to train other taggers [20].) The tagger described by Garside [56, 34], CLAWS, was a probabilistic version of TAGGIT, and the DeRose tagger improved on ...
Connectionist Probability Estimation in HMM Speech Recognition
IEEE Transactions on Speech and Audio Processing, 1992
Cited by 82 (23 self)
This report is concerned with integrating connectionist networks into a hidden Markov model (HMM) speech recognition system. This is achieved through a statistical understanding of connectionist networks as probability estimators, first elucidated by Herve Bourlard. We review the basis of HMM speech recognition, and point out the possible benefits of incorporating connectionist networks. We discuss some issues necessary to the construction of a connectionist HMM recognition system, and describe the performance of such a system, including evaluations on the DARPA database, in collaboration with Mike Cohen and Horacio Franco of SRI International. In conclusion, we show that a connectionist component improves a state-of-the-art HMM system. Part I INTRODUCTION Over the past few years, connectionist models have been widely proposed as a potentially powerful approach to speech recognition (e.g. Makino et al. (1983), Huang et al. (1988) and Waibel et al. (1989)). However, whilst connec...
A unified framework for tree search decoding: Rediscovering the sequential decoder
IEEE Transactions on Information Theory, 2006
The Candide system for machine translation
In Proceedings of the ARPA Conference on Human Language Technology, 1994
Cited by 59 (5 self)
We present an overview of Candide, a system for automatic translation of French text to English text. Candide uses methods of information theory and statistics to develop a probability model of the translation process. This model, which is made to accord as closely as possible with a large body of French and English sentence pairs, is then used to generate English translations of previously unseen French sentences. This paper provides a tutorial in these methods, discussions of the training and operation of the system, and a summary of test results.