Results 11 - 18 of 18
On The Road To Improved Lexical Confusability Metrics
- In Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology, 2002
Pronunciation modeling in automatic speech recognition systems has had mixed results in the past; one likely reason for poor performance is the increased confusability in the lexicon from adding new pronunciation variants. In this work, we propose a new framework for determining lexically confusable words based on inverted finite state transducers (FSTs); we also present experiments designed to test some of the implementation details of this framework. The method is evaluated by looking at how well the algorithm predicts the errors in an ASR system. We see from the confusions learned in a training set that we are able to generalize this information to predict errors in an unseen test set.
RECASTING THE DISCRIMINATIVE N-GRAM MODEL AS A PSEUDO-CONVENTIONAL N-GRAM MODEL FOR LVCSR
Discriminative n-gram language modeling has been used to re-rank candidate recognition hypotheses for performance improvements in large vocabulary continuous speech recognition (LVCSR). Discriminative n-gram modeling is defined in a linear framework. This work demonstrates that the linear discriminative n-gram model can be recast as a pseudo-conventional n-gram model if the order of the discriminative n-gram model is no higher than the order of the n-gram model in the baseline recognizer. Thus the power of the discriminative n-gram model can be captured by mature n-gram related techniques such as single-pass n-gram decoding or lattice rescoring. This work utilizes the pseudo-conventional n-gram model to rescore the recognition lattices that are generated during decoding. Compared to discriminative N-best re-ranking, this process of discriminative lattice rescoring (DLR) has two advantages: (1) the discriminatively top-ranked utterance hypotheses within the lattice search spaces can be efficiently identified by the A* algorithm; (2) the rescored lattices can be further enhanced with other post-processing techniques to achieve cumulative improvement conveniently. Experiments with Mandarin LVCSR show that DLR improves efficiency: the computation time for 1000-best re-ranking is reduced by more than a factor of three. The discriminatively rescored lattices are further processed by re-ranking with word-based mutual information (MI). While DLR achieves around 15% relative character error rate (CER) reduction over the recognizer baseline, the MI-based re-ranking brings a further 5% relative CER reduction over the DLR performance. Index Terms: Discriminative N-gram Modeling, LVCSR
Log-linear Model Combination with Word-dependent Scaling Factors
Log-linear model combination is the standard approach in LVCSR to combine several knowledge sources, usually an acoustic and a language model. Instead of using a single scaling factor per knowledge source, we make the scaling factor word- and pronunciation-dependent. In this work, we combine three acoustic models, a pronunciation model, and a language model for a Mandarin BN/BC task. The achieved error rate reduction of 2% relative is small but consistent for two test sets. An analysis of the results shows that the major contribution comes from the improved interdependency of language and acoustic model. Index Terms: speech recognition, model combination, system combination, log-linear modeling, minimum risk training
Extracting Protein-Protein Interaction based on Discriminative Training of the Hidden Vector State Model
The knowledge about gene clusters and protein interactions is important for biological researchers to unveil the mechanism of life. However, large quantity of the knowledge often hides in the literature,
CONTINUOUS SPACE DISCRIMINATIVE LANGUAGE MODELING
P. Xu, S. Khudanpur, M. Lehr, E. Prud’hommeaux, N. Glenn, D. Karakos, B. Roark, K. Sagae, ..., C. Callison-Burch, ...
Production knowledge in the recognition of dysarthric speech
- 2011
Millions of individuals have acquired or have been born with neuro-motor conditions that limit the control of their muscles, including those that manipulate the articulators of the vocal tract. These conditions, collectively called dysarthria, result in speech that is very difficult to understand, despite being generally syntactically and semantically correct. This difficulty is not limited to human listeners, but also adversely affects the performance of traditional automatic speech recognition (ASR) systems, which in some cases can be completely unusable by the affected individual. This dissertation describes research into improving ASR for speakers with dysarthria by means of incorporated knowledge of their speech production. The document first introduces theoretical aspects of dysarthria and of speech production and outlines related work in these combined areas within ASR. It then describes the acquisition and analysis of the TORGO database of dysarthric articulatory motion and demonstrates several consistent behaviours among speakers in this database, including predictable pronunciation errors, for example. Articulatory data are then used to train augmented ASR systems that model the statistical relationships between vocal tract configurations and their acoustic consequences. I show that dynamic
Extracting Protein-Protein Interaction based on Discriminative Training of the Hidden Vector State Model
- oro.open.ac.uk
Extracting protein-protein interaction based on discriminative training of the Hidden Vector State model (Conference Item)
Discriminative Training of the Hidden Vector State Model for Semantic Parsing
Discriminative training of the hidden vector state model for semantic parsing (Journal Article)

