Minimum Bayes risk estimation and decoding in large vocabulary continuous speech recognition
, 2006
Cited by 8 (1 self)
Minimum risk estimation and decoding strategies based on lattice segmentation techniques can be used to refine large vocabulary continuous speech recognition (LVCSR) systems in two ways: through the estimation of the parameters of the underlying hidden Markov models, and through the identification of smaller recognition tasks, which provides the opportunity to incorporate novel modeling and decoding procedures in LVCSR. These techniques are discussed in the context of going ‘beyond HMMs’.
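For readers unfamiliar with minimum Bayes risk decoding, the following is a minimal sketch over an N-best list. It is not the lattice-segmentation method of the paper. It only illustrates the basic idea of choosing the hypothesis with the lowest expected word-level edit distance instead of the most probable one. The function names and the toy posteriors are assumptions made for this illustration.

```python
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def mbr_decode(nbest: List[Tuple[str, float]]) -> str:
    """Return the hypothesis with the lowest expected edit distance."""
    total = sum(p for _, p in nbest)  # normalize the (hypothetical) posteriors
    best_hyp, best_risk = nbest[0][0], float("inf")
    for hyp, _ in nbest:
        risk = sum(
            (p / total) * edit_distance(hyp.split(), other.split())
            for other, p in nbest
        )
        if risk < best_risk:
            best_hyp, best_risk = hyp, risk
    return best_hyp


# Toy N-best list: (hypothesis, posterior). Values are invented.
nbest = [
    ("a b c", 0.4),
    ("a x d", 0.3),
    ("a x c", 0.3),
]
print(mbr_decode(nbest))
```

In this toy list the most probable hypothesis is "a b c", but "a x c" sits closer on average to the other hypotheses, so the minimum-risk choice differs from the maximum-probability one.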
Arc Minimization in Finite State Decoding Graphs with Cross-Word Acoustic Context
- In Proc. ICSLP’02
, 2002
Cited by 6 (2 self)
Recent approaches to large vocabulary decoding with finite state graphs have focused on the use of state minimization algorithms to produce relatively compact graphs. This paper extends the finite state approach by developing complementary arc-minimization techniques. The use of these techniques in concert with state minimization allows us to statically compile decoding graphs in which the acoustic models utilize a full word of cross-word context. This is in significant contrast to typical systems, which use only a single phone. We show that the particular arc-minimization problem that arises is in fact an NP-complete combinatorial optimization problem, and describe the reduction from 3-SAT. We present experimental results that illustrate the moderate sizes and runtimes of graphs for the Switchboard task.
Structured Queries, Language Modeling, and Relevance Modeling in Cross-Language Information Retrieval
- Information Processing and Management Special Issue on Cross Language Information Retrieval
, 2003
Cited by 5 (2 self)
Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries – one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
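The contrast between the two approaches can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the systems evaluated in the paper. The synonym-operator view pools the counts of all translations of a query term, while the language-modeling view weights each translation by a translation probability inside a smoothed query likelihood. The dictionary entries, the probabilities, the interpolation weight `lam`, and the function names are all invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical bilingual dictionary: query term -> {document-language term: probability}
Dictionary = Dict[str, Dict[str, float]]


def synonym_tf(term: str, doc: List[str], dictionary: Dictionary) -> int:
    """Structured-query view: every translation counts as the same term."""
    counts = Counter(doc)
    return sum(counts[f] for f in dictionary.get(term, {}))


def translation_lm_score(
    query: List[str],
    doc: List[str],
    collection: List[str],
    dictionary: Dictionary,
    lam: float = 0.5,
) -> float:
    """Language-modeling view: log query likelihood with translation probabilities."""
    doc_counts, coll_counts = Counter(doc), Counter(collection)
    score = 0.0
    for e in query:
        p_e = 0.0
        for f, t in dictionary.get(e, {}).items():
            p_doc = doc_counts[f] / len(doc)
            p_coll = coll_counts[f] / len(collection)
            # Interpolate document and collection estimates, then weight by t.
            p_e += t * (lam * p_doc + (1 - lam) * p_coll)
        score += math.log(p_e) if p_e > 0 else float("-inf")
    return score


dictionary = {"house": {"casa": 0.8, "hogar": 0.2}}
doc_a = ["la", "casa", "blanca"]
doc_b = ["el", "perro", "negro"]
collection = doc_a + doc_b

print(synonym_tf("house", doc_a, dictionary))
print(translation_lm_score(["house"], doc_a, collection, dictionary))
print(translation_lm_score(["house"], doc_b, collection, dictionary))
```

With a dictionary that carries no probabilities, the second function would have to fall back on uniform weights. That is one way to read the abstract's finding that language modeling helps only with a probabilistic dictionary.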
A Framework for Fast Incremental Interpretation during Speech Decoding
Cited by 5 (1 self)
This paper describes a framework for incorporating referential semantic information from a world model or ontology directly into a probabilistic language model of the sort commonly used in speech recognition, where it can be probabilistically weighted together with phonological and syntactic factors as an integral part of the decoding process. Introducing world model referents into the decoding search greatly increases the search space, but by using a single integrated phonological, syntactic, and referential semantic language model, the decoder is able to incrementally prune this search based on probabilities associated with these combined contexts. The result is a single unified referential semantic probability model which brings several kinds of context to bear in speech decoding, and performs accurate recognition in real time on large domains in the absence of example in-domain training sentences.
Hidden Model Sequence Models for Automatic Speech Recognition
, 2001
Cited by 4 (0 self)
Most modern automatic speech recognition systems make use of acoustic models based on hidden Markov models. To obtain reasonable recognition performance within a large vocabulary framework, the acoustic models usually include a pronunciation model, together with complex parameter tying schemes. In many cases the pronunciation model operates on a phoneme level and is derived independently of the underlying models. In contrast, this work is aimed at improving pronunciation modelling on a sub-phone level in a combined framework. The modelling of pronunciation variation is assumed to be of special importance for recognition of spontaneous speech.
On-Line Handwriting Recognition with Constrained N-Best Decoding
- In Proc. 13th ICPR, volume C
, 1996
Cited by 4 (3 self)
It is well known that N-best decoding for speech recognition coupled with post-processing can provide significant accuracy advantages. We have implemented and experimented with N-best decoding for handwriting recognition, using an N-best decoding algorithm that employs a synchronous forward pass and an asynchronous backward pass. One novel aspect of our algorithm is the use of pruning in the backward pass to constrain the search to candidates whose likelihood score is within a threshold specified using the likelihood score of the best candidate. We show that this algorithm is more efficient than traditional N-best decoding algorithms. A two-stage method is introduced in which the language model changes from a relaxed model during the N-best search to a more constrained model for rescoring in a second pass. This method reduces the computation needed for more detailed pattern matching by preselecting the N most likely candidates.
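The two ideas in this abstract, pruning relative to the best score and second-pass rescoring, can be sketched as below. This is a simplified illustration on an already-scored candidate list, not the forward and backward search of the paper. The candidate words, log-likelihood scores, threshold, and toy lexicon are assumptions invented for the example.

```python
from typing import Callable, Dict, List, Tuple


def prune_nbest(
    candidates: Dict[str, float], n: int, threshold: float
) -> List[Tuple[str, float]]:
    """Keep at most n candidates scoring within `threshold` of the best one."""
    best = max(candidates.values())
    kept = [(h, s) for h, s in candidates.items() if s >= best - threshold]
    kept.sort(key=lambda hs: hs[1], reverse=True)
    return kept[:n]


def rescore(
    nbest: List[Tuple[str, float]],
    constrained_lm: Callable[[str], float],
    weight: float = 1.0,
) -> str:
    """Second pass: add a constrained language-model score and pick the best."""
    rescored = [(h, s + weight * constrained_lm(h)) for h, s in nbest]
    return max(rescored, key=lambda hs: hs[1])[0]


# First-pass log-likelihoods under a relaxed model (invented values).
candidates = {"clear": -10.0, "dear": -10.5, "clean": -12.0, "ocean": -30.0}

# Toy constrained model: heavy penalty for words outside a small lexicon.
lexicon = {"dear", "clean"}


def lexicon_lm(word: str) -> float:
    return 0.0 if word in lexicon else -100.0


nbest = prune_nbest(candidates, n=3, threshold=5.0)
print(nbest)
print(rescore(nbest, lexicon_lm))
```

Only the three surviving candidates reach the constrained model, which is where the saving in detailed pattern matching comes from.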
A Surficial Pronunciation Model
- In: Proc. of the ESCA Workshop ‘Modeling Pronunciation Variation for Automatic Speech Recognition’ (see [87
, 1998
Cited by 3 (0 self)
We argue for a surficial pronunciation model: a model without underlying forms. The surficial model outperforms a traditional generative model by a significant margin on conversational speech (Switchboard) as well as on read speech (TIMIT). Our results suggest that the true mapping from underlying forms to surface forms is too complex to be accurately modeled using current techniques, and that we would be best served to model the surface forms directly.
Unification-Based Glossing
- In Proceedings of the International Joint Conference on Artificial Intelligence
, 1995
Cited by 3 (1 self)
We present an approach to syntax-based machine translation that combines unification-style interpretation with statistical processing. This approach enables us to translate any Japanese newspaper article into English, with quality far better than a word-for-word translation. Novel ideas include the use of feature structures to encode word lattices and the use of unification to compose and manipulate lattices. Unification also allows us to specify abstract features that delay target-language synthesis until enough source-language information is assembled. Our statistical component enables us to search efficiently among competing translations and locate those with high English fluency. 1 Background JAPANGLOSS [Knight et al., 1994; 1995] is a project whose goals are to scale up knowledge-based machine translation (KBMT) techniques to handle Japanese-English newspaper MT, to achieve higher quality output than is currently available, and to develop techniques for rapidly constructing MT ...
Robust Incremental Parsing using Human-Like Memory Constraints
Cited by 3 (3 self)
Psycholinguistic studies suggest a model of human language processing that 1) performs incremental interpretation of spoken utterances or written text, 2) preserves ambiguity by maintaining competing analyses in parallel, and 3) operates within a severely constrained short-term memory store – possibly constrained to as few as four distinct elements. The first two observations are sometimes taken to endorse a probabilistic beam-search model, similar to some existing systems. But the last observation imposes a restriction that has not until now been evaluated in a corpus study. This paper first describes a relatively simple statistical model of incremental language processing that meets all three of the above desiderata. The paper then evaluates the coverage of an implementation of this model, and the accuracy with which this implementation can analyze a large syntactically-annotated corpus of English.
A Statistical Information Extraction System for Turkish
, 2000
Cited by 3 (1 self)
This thesis presents the results of a study on information extraction from unrestricted Turkish text using statistical language processing methods. We have successfully applied statistical methods using both the lexical and morphological information to the following tasks: The Turkish Text Deasciifier task aims to convert the ASCII characters in a Turkish text into the corresponding non-ASCII Turkish characters (i.e., "ç", "ğ", "ı", "ö", "ş", "ü", and their upper cases).
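To make the deasciification task concrete, here is a minimal sketch that enumerates the possible Turkish spellings of an ASCII word and picks the one with the highest count in a word-frequency table. The thesis describes statistical methods that use lexical and morphological information. The frequency lookup below is a deliberately simpler stand-in, and the table, its counts, and the function names are assumptions made for the example. Only lowercase letters are handled.

```python
from itertools import product
from typing import Dict, List

# Each ambiguous ASCII letter and the spellings it may stand for.
TURKISH_VARIANTS = {
    "c": "cç",
    "g": "gğ",
    "i": "iı",
    "o": "oö",
    "s": "sş",
    "u": "uü",
}


def candidates(word: str) -> List[str]:
    """All spellings obtained by optionally restoring each ambiguous letter."""
    options = [TURKISH_VARIANTS.get(ch, ch) for ch in word]
    return ["".join(chars) for chars in product(*options)]


def deasciify(word: str, freq: Dict[str, int]) -> str:
    """Pick the most frequent candidate; unknown words come back unchanged."""
    # max() returns the first maximal item, and the first candidate is the
    # original ASCII spelling, so ties fall back to the input word.
    return max(candidates(word), key=lambda w: freq.get(w, 0))


# Hypothetical word counts, invented for illustration.
freq = {"çiçek": 10, "ışık": 5}

print(deasciify("cicek", freq))
print(deasciify("isik", freq))
```

The candidate set doubles with every ambiguous letter, so a practical system would score candidates incrementally rather than enumerate them all.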

