A tutorial on hidden Markov models and selected applications in speech recognition
 PROCEEDINGS OF THE IEEE
, 1989
Cited by 5777 (1 self)
Although initially introduced and studied in the late 1960s and early 1970s, statistical methods of Markov source or hidden Markov modeling have become increasingly popular in the last several years. There are two strong reasons why this has occurred. First the models are very rich in mathematical structure and hence can form the theoretical basis for use in a wide range of applications. Second the models, when applied properly, work very well in practice for several important applications. In this paper we attempt to carefully and methodically review the theoretical aspects of this type of statistical modeling and show how they have been applied to selected problems in machine recognition of speech.
From HMM's to Segment Models: A Unified View of Stochastic Modeling for Speech Recognition
, 1996
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
Cited by 259 (5 self)
An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms: Baum–Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
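The definition in the abstract above (a finite-state homogeneous Markov chain seen through a memoryless, time-invariant channel) can be illustrated with a minimal Python sketch. The two-state model, its probabilities, and the names `sample_hmp` and `log_likelihood` are assumptions made up for this illustration and do not come from the paper. The likelihood is computed with the standard forward recursion, normalized at each step to avoid underflow.

```python
import math
import random

# Hypothetical two-state, two-symbol model; every number here is illustrative.
INITIAL = [0.6, 0.4]                   # P(first state = i)
TRANSITION = [[0.7, 0.3], [0.4, 0.6]]  # P(next state = j | state = i), same at every step
EMISSION = [[0.9, 0.1], [0.2, 0.8]]    # P(symbol = k | state = i), memoryless channel


def sample_hmp(length, rng):
    """Draw a hidden state path and the observation sequence it emits."""
    states, observations = [], []
    state = rng.choices(range(len(INITIAL)), weights=INITIAL)[0]
    for _ in range(length):
        states.append(state)
        # The channel output depends only on the current state.
        symbol = rng.choices(range(len(EMISSION[state])), weights=EMISSION[state])[0]
        observations.append(symbol)
        # The chain moves on using the same transition matrix at every step.
        state = rng.choices(range(len(TRANSITION[state])), weights=TRANSITION[state])[0]
    return states, observations


def log_likelihood(observations):
    """Return log P(observations) via the forward recursion with per-step scaling."""
    if not observations:
        return 0.0
    n = len(INITIAL)
    alpha = [INITIAL[i] * EMISSION[i][observations[0]] for i in range(n)]
    total = 0.0
    for t, symbol in enumerate(observations):
        if t > 0:
            # Propagate through the chain, then weight by the channel probability.
            alpha = [
                sum(alpha[i] * TRANSITION[i][j] for i in range(n)) * EMISSION[j][symbol]
                for j in range(n)
            ]
        scale = sum(alpha)
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total
```

For example, `sample_hmp(5, random.Random(0))` returns a state path and an observation sequence of length 5, and `log_likelihood` applied to that observation sequence gives its log-probability under the same illustrative model.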
Simultaneous Modeling Of Spectrum, Pitch And Duration In HMM-Based Speech Synthesis
, 1999
Cited by 168 (37 self)
In this paper, we describe an HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM. In the system, pitch and state duration are modeled by multi-space probability distribution HMMs and multi-dimensional Gaussian distributions, respectively. The distributions for spectral parameter, pitch parameter and the state duration are clustered independently by using a decision-tree based context clustering technique. Synthetic speech is generated by using a speech parameter generation algorithm from HMM and a mel-cepstrum based vocoding technique. Through informal listening tests, we have confirmed that the proposed system successfully synthesizes natural-sounding speech which resembles the speaker in the training database.
An overview of the SPHINX speech recognition system
, 1990
Cited by 159 (8 self)
Speaker independence, continuous speech, and large vocabularies pose three of the greatest challenges in automatic speech recognition. Previously, accurate speech recognizers avoided dealing simultaneously with all three problems. This paper describes SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMM’s) with LPC-derived parameters. To provide speaker independence, we added knowledge to these HMM’s in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, we introduce two new subword speech units: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies
The Use of Context in Large Vocabulary Speech Recognition
, 1995
Cited by 156 (0 self)
decide which contexts are similar and can share parameters. A key feature of this approach is that it allows the construction of models which are dependent upon contextual effects occurring across word boundaries. The use of cross word context dependent models presents problems for conventional decoders. The second part of the thesis therefore presents a new decoder design which is capable of using these models efficiently. The decoder is suitable for use with very large vocabularies and long span language models. It is also capable of generating a lattice of word hypotheses with little computational overhead. These lattices can be used to constrain further decoding, allowing efficient use of complex acoustic and language models. The effectiveness of these techniques has been assessed on a variety of large vocabulary continuous speech recognition tasks and results are presented which analyse performance in terms of computational complexity and recognition accuracy. The experiments dem
Connectionist Probability Estimation in HMM Speech Recognition
 IEEE Transactions on Speech and Audio Processing
, 1992
Cited by 92 (24 self)
This report is concerned with integrating connectionist networks into a hidden Markov model (HMM) speech recognition system. This is achieved through a statistical understanding of connectionist networks as probability estimators, first elucidated by Herve Bourlard. We review the basis of HMM speech recognition, and point out the possible benefits of incorporating connectionist networks. We discuss some issues necessary to the construction of a connectionist HMM recognition system, and describe the performance of such a system, including evaluations on the DARPA database, in collaboration with Mike Cohen and Horacio Franco of SRI International. In conclusion, we show that a connectionist component improves a state of the art HMM system.
Part I, Introduction: Over the past few years, connectionist models have been widely proposed as a potentially powerful approach to speech recognition (e.g. Makino et al. (1983), Huang et al. (1988) and Waibel et al. (1989)). However, whilst connec...
Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
 IEEE Trans. Audio Speech Lang. Process
, 2009
Cited by 87 (28 self)
In this paper, we analyze the effects of several factors and configuration choices encountered during training and model construction when we want to obtain better and more stable adaptation in HMM-based speech synthesis. We then propose a new adaptation algorithm called constrained structural maximum a posteriori linear regression (CSMAPLR) whose derivation is based on the knowledge obtained in this analysis and on the results of comparing several conventional adaptation algorithms. Here, we investigate six major aspects of the speaker adaptation: initial models; the amount of the training data for the initial models; the transform functions, estimation criteria, and sensitivity of several linear regression adaptation algorithms; and combination algorithms. Analyzing the effect of the initial model, we compare speaker-dependent models, gender-independent models, and the simultaneous use of the gender-dependent models to single use of the gender-dependent models. Analyzing the effect of the transform functions, we compare the transform function for only mean vectors with that for mean vectors and covariance matrices. Analyzing the effect of the estimation criteria, we compare the ML criterion with a robust estimation criterion called structural MAP. We evaluate the sensitivity of several thresholds for the piecewise linear regression algorithms and take up methods combining MAP adaptation with the linear regression algorithms. We incorporate these adaptation algorithms into our speech synthesis system and present several subjective and objective evaluation results showing the utility and effectiveness of these algorithms in speaker adaptation for HMM-based speech synthesis. Index Terms: Average voice, hidden Markov model (HMM)-based speech synthesis, speaker adaptation, speech synthesis, voice conversion.
QuantiSNP: An objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data
 Nucleic Acids Research
, 2007
Cited by 82 (1 self)
Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray™ SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel resampling framework to calibrate the
Towards a mathematical theory of cortical microcircuits
 PLoS Computational Biology 5
, 2009
Cited by 64 (0 self)
The theoretical setting of hierarchical Bayesian inference is gaining acceptance as a framework for understanding cortical computation. In this paper, we describe how Bayesian belief propagation in a spatiotemporal hierarchical model, called Hierarchical Temporal Memory (HTM), can lead to a mathematical model for cortical circuits. An HTM node is abstracted using a coincidence detector and a mixture of Markov chains. Bayesian belief propagation equations for such an HTM node define a set of functional constraints for a neuronal implementation. Anatomical data provide a contrasting set of organizational constraints. The combination of these two constraints suggests a theoretically derived interpretation for many anatomical and physiological features and predicts several others. We describe the pattern recognition capabilities of HTM networks and demonstrate the application of the derived circuits for modeling the subjective contour effect. We also discuss how the theory and the circuit can be extended to explain cortical features that are not explained by the current model and describe testable predictions that can be derived from the model.