Unary Data Structures for Language Models
Language models are important components of speech recognition and machine translation systems. Trained on billions of words and consisting of billions of parameters, language models are often the single largest components of these systems. Many techniques have been proposed to reduce the storage requirements of language models. A technique based on pointer-free compact storage of ordinal trees achieves compression competitive with the best proposed systems while retaining the full finite-state structure, and without using computationally expensive block compression schemes or lossy quantization techniques. Index Terms: n-gram language models, unary data structures
SpeechForms: From Web to Speech and Back
This paper describes SpeechForms, a system that uses novel techniques to automatically identify form element semantics and form element content, and to semi-automatically generate language models that allow users to fill out each web form element by voice. Preliminary experimental results show that simple per-element language models are faster and may be more accurate than statistical n-gram language models trained on large amounts of web text data. Index Terms: language modeling, form understanding, information retrieval
Estimating Word-Stability During Incremental Speech Recognition
Many speech user interfaces can be improved by incrementally displaying or interpreting a speech recognizer’s current best path as a user speaks. This gives rise to a problem of instability, whereby the best path may change frequently, particularly with respect to the words most recently spoken. Introducing a lag between the audio most recently processed and the portion of the best path shown to the user can lead to more usable incremental results. In the ideal case, the lag introduced would vary to recover exactly the longest stable prefix of the best path. In this paper, we introduce a framework for estimating a stability statistic for each word, and explore the tradeoff between stability and lag by thresholding stability statistics estimated using a variety of features.
Voice Query Refinement
We describe a system for the refinement of spoken search queries. Given an initial query (Northern Italian restaurants in New York), instead of requiring a fully specified follow-up query (Korean restaurants in New York), a more natural, abbreviated update query (Korean instead) may be spoken. The system consists of a parsing step to identify the type and arguments of the refinement, a candidate generation step to enumerate the possible refinements, and a model classification step to select the best refinement. We present results on test query refinements given both to this system and to human judges; they show that the automated system outperforms the human judges on that data set. Index Terms: spoken dialog systems, voice search, query refinement

