Results 11 - 20 of 194
Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner
- in Proc. Eurospeech, 2003
Cited by 43 (20 self)
We study continuous speech recognition based on sub-word units found in an unsupervised fashion. For agglutinative languages like Finnish, traditional word-based n-gram language modeling does not work well due to the huge number of different word forms. We use a method based on the Minimum Description Length principle to split words statistically into subword units allowing efficient language modeling and unlimited vocabulary. The perplexity and speech recognition experiments on Finnish speech data show that the resulting model outperforms both word and syllable based trigram models. Compared to the word trigram model, the out-of-vocabulary rate is reduced from 20% to 0% and the word error rate from 56% to 32%.
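The word error rates quoted in this abstract can be read as an absolute or a relative improvement. The short sketch below works out both from the two figures given (56% and 32%); the function name `relative_reduction` is purely illustrative and does not come from the paper.

```python
def relative_reduction(baseline, improved):
    """Return the reduction from baseline to improved, as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Word error rates quoted in the abstract: 56% (word trigram), 32% (morph-based model).
wer_absolute = 56 - 32                      # reduction in percentage points
wer_relative = relative_reduction(56, 32)   # reduction relative to the baseline

print(f"absolute: {wer_absolute} points, relative: {wer_relative:.1f}%")
```

On these figures the drop is 24 percentage points, or roughly 43% relative to the word trigram baseline.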
Global inference for sentence compression: An integer linear programming approach
- Journal of Artificial Intelligence Research (JAIR), 2008
Cited by 41 (2 self)
Sentence compression holds promise for many applications ranging from summarization to subtitle generation. Our work views sentence compression as an optimization problem and uses integer linear programming (ILP) to infer globally optimal compressions in the presence of linguistically motivated constraints. We show how previous formulations of sentence compression can be recast as ILPs and extend these models with novel global constraints. Experimental results on written and spoken texts demonstrate improvements over state-of-the-art models.
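To make the idea of compression as constrained optimization concrete, here is a toy sketch. It assigns one binary keep/drop decision to each word and maximizes a sum of per-word scores subject to two constraints. Everything in it is an assumption made for illustration: the scores, the "a modifier needs its head" constraint, the minimum-length constraint, and the function name `compress` are hypothetical, and exhaustive enumeration stands in for the ILP solver the paper uses.

```python
from itertools import product


def compress(words, scores, heads, min_len):
    """Pick the best-scoring subset of words that satisfies two constraints.

    scores[i]  : hypothetical benefit of keeping word i
    heads[i]   : index of word i's syntactic head, or None for the root
    min_len    : minimum number of words the compression must keep
    """
    best_value, best_keep = float("-inf"), None
    # Enumerate every 0/1 assignment (feasible only for very short sentences).
    for keep in product((0, 1), repeat=len(words)):
        if sum(keep) < min_len:
            continue  # length constraint violated
        if any(keep[i] and heads[i] is not None and not keep[heads[i]]
               for i in range(len(words))):
            continue  # a kept word has lost its head
        value = sum(s for s, k in zip(scores, keep) if k)
        if value > best_value:
            best_value, best_keep = value, keep
    if best_keep is None:
        return []  # no assignment satisfies the constraints
    return [w for w, k in zip(words, best_keep) if k]


words = ["the", "very", "old", "man", "slept"]
scores = [0.2, -0.5, -0.1, 1.0, 1.0]   # made-up scores
heads = [3, 2, 3, 4, None]             # made-up dependency heads
print(compress(words, scores, heads, min_len=3))
```

Because the constraints are checked over the whole assignment rather than word by word, the chosen subset is optimal for the sentence as a whole, which is the sense of "global" in this toy setting.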
Lightly Supervised and Unsupervised Acoustic Model Training
- Computer Speech and Language, 2002
Cited by 34 (2 self)
The last decade has witnessed substantial progress in speech recognition technology, with today's state-of-the-art systems being able to transcribe unrestricted broadcast news audio data with a word error of about 20%.
Learning User Simulations for Information State Update Dialogue Systems
- in Eurospeech, 2005
Cited by 34 (11 self)
This paper describes and compares two methods for simulating user behaviour in spoken dialogue systems. User simulations are important for automatic dialogue strategy learning and the evaluation of competing strategies. Our methods are designed for use with "Information State Update" (ISU)-based dialogue systems. The first method is based on supervised learning using linear feature combination and a normalised exponential output function. The user is modelled as a stochastic process which selects user actions (<speech act, task> pairs) based on features of the current dialogue state, which encodes the whole history of the dialogue. The second method uses n-grams of <speech act, task> pairs, restricting the length of the history considered by the order of the n-gram. Both models were trained and evaluated on a subset of the COMMUNICATOR corpus, to which we added annotations for user actions and Information States. The model based on linear feature combination has a perplexity of 2.08 whereas the best n-gram (4-gram) has a perplexity of 3.58. Each one of the user models ran against a system policy trained on the same corpus with a method similar to the one used for our linear feature combination model. The quality of the simulated dialogues produced was then measured as a function of the filled slots, confirmed slots, and number of actions performed by the system in each dialogue. In this experiment both the linear feature combination model and the best n-grams (5-gram and 4-gram) produced similar quality simulated dialogues.
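The first method in this abstract combines features linearly and passes the result through a normalised exponential (softmax) function, and both methods are compared by perplexity. The sketch below shows one plausible reading of those two pieces. The feature vector, weights, action names, and the helper names `action_distribution` and `perplexity` are hypothetical; the paper's actual feature set and training procedure are not given in the abstract.

```python
import math


def softmax(scores):
    """Normalised exponential: turn arbitrary scores into probabilities."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def action_distribution(features, weights):
    """Score each user action as a linear combination of state features.

    weights maps an action name to one weight per feature (made-up values).
    """
    actions = sorted(weights)
    scores = [sum(w * f for w, f in zip(weights[a], features)) for a in actions]
    return dict(zip(actions, softmax(scores)))


def perplexity(probs):
    """Perplexity from the probabilities a model gave to the observed actions."""
    avg_neg_log = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(avg_neg_log)


state = [1.0, 0.0, 1.0]                  # hypothetical dialogue-state features
weights = {
    "provide_info,dest_city": [1.2, 0.1, 0.4],
    "yes_answer,confirm": [0.3, 0.9, -0.2],
}
print(action_distribution(state, weights))
```

A perplexity of 2.08 can be read as the model being, on average, about as uncertain as a uniform choice among roughly two actions; a model that always assigned probability 0.25 to the observed action would have perplexity 4.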
Collages as Dynamic Summaries for News Video
- DIGITAL VIDEO SUMMARIES, INDEXING AND RETRIEVAL, 2002
Cited by 29 (3 self)
This paper introduces the video collage, a novel, effective interface for browsing and interpreting video collections. The paper discusses how collages are automatically produced, illustrates their use, and evaluates their effectiveness as summaries across news stories. Collages are presentations of text and images derived from multiple video sources, which provide an interactive visualization for a set of video documents, summarizing their contents and providing a navigation aid for further exploration. The dynamic creation of collages is based on user context, e.g., an originating query, coupled with automatic processing to refine the candidate imagery. Named entity identification and common phrase extraction provide descriptive text. The dynamic manipulation of collages allows user-directed browsing and reveals additional detail. The utility of collages as summaries is examined with respect to other published news summaries.
An Empirical Verification of Coverage and Correctness for a General-Purpose Sentence Generator
- 1998
Cited by 29 (0 self)
This paper describes a general-purpose sentence generation system that can achieve both broad scale coverage and high quality while aiming to be suitable for a variety of generation tasks. We measure the coverage and correctness empirically using a section of the Penn Treebank corpus as a test set. We also describe novel features that help make the generator flexible and easier to use for a variety of tasks. To our knowledge, this is the first empirical measurement of coverage reported in the literature, and the highest reported measurements of correctness.
Improving Statistical MT through Morphological Analysis
- In Proc. of Empirical Methods in Natural Language Processing (EMNLP), 2005
Cited by 29 (0 self)
In statistical machine translation, estimating word-to-word alignment probabilities for the translation model can be difficult due to the problem of sparse data: most words in a given corpus occur at most a handful of times. With a highly inflected language such as Czech, this problem can be particularly severe. In addition, much of the morphological variation seen in Czech words is not reflected in either the morphology or syntax of a language like English. In this work, we show that using morphological analysis to modify the Czech input can improve a Czech-English machine translation system. We investigate several different methods of incorporating morphological information, and show that a system that combines these methods yields the best results. Our final system achieves a BLEU score of .333, as compared to .270 for the baseline word-to-word system.
Comparison Of Part-Of-Speech And Automatically Derived Category-Based Language Models For Speech Recognition
- Proc. ICASSP’98, 1998
Cited by 28 (6 self)
This paper compares various category-based language models when used in conjunction with a word-based trigram by means of linear interpolation. Categories corresponding to parts-of-speech as well as automatically clustered groupings are considered. The category-based model employs variable-length n-grams and permits each word to belong to multiple categories. Relative word error rate reductions of between 2% and 7% over the baseline are achieved in N-best rescoring experiments on the Wall Street Journal corpus. The largest improvement is obtained with a model using automatically determined categories. Perplexities continue to decrease as the number of different categories is increased, but improvements in the word error rate reach an optimum. 1. INTRODUCTION Language models based on n-grams of word-categories are intrinsically able to generalise to unseen word sequences, and hence offer improved robustness to novel or rare word combinations. In isolation, such models represent a c...
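The combination described in this abstract, a word trigram linearly interpolated with a category-based model in which a word may belong to several categories, can be sketched as follows. This is a minimal illustration under stated assumptions: the probabilities, the category names, the interpolation weight, and the function names `category_model_prob` and `interpolate` are all hypothetical, and the paper's variable-length category n-grams are reduced here to a given distribution over the next category.

```python
def category_model_prob(p_word_given_cat, p_cat_given_history):
    """P(w | h) under a category model where w may belong to several categories.

    Sums P(w | c) * P(c | h) over every category c that can emit the word.
    """
    return sum(p_w * p_cat_given_history.get(cat, 0.0)
               for cat, p_w in p_word_given_cat.items())


def interpolate(p_word_model, p_category_model, lam):
    """Linear interpolation of two probability estimates with weight lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_word_model + (1.0 - lam) * p_category_model


# Made-up numbers: the word trigram never saw this word sequence,
# but the category model still gives it some probability mass.
p_trigram = 0.0
p_category = category_model_prob(
    {"NOUN": 0.01, "VERB": 0.002},       # P(word | category)
    {"NOUN": 0.5, "VERB": 0.5},          # P(category | history)
)
print(interpolate(p_trigram, p_category, lam=0.7))
```

The example shows the robustness argument in miniature: the interpolated estimate stays non-zero for a word sequence the word-based model has never seen.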
Connectionist speech recognition of Broadcast News
- 2002
Cited by 28 (10 self)
This paper describes connectionist techniques for recognition of Broadcast News. The fundamental difference between connectionist systems and more conventional mixture-of-Gaussian systems is that connectionist models directly estimate posterior probabilities as opposed to likelihoods. Access to posterior probabilities has enabled us to develop a number of novel approaches to confidence estimation, pronunciation modelling and search. In addition we have investigated a new feature extraction technique based on the modulation-filtered spectrogram (MSG), and methods for combining multiple information sources. We have incorporated all of these techniques into a system for the transcription
Models for sentence compression: A comparison across domains, training requirements and evaluation measures
- In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, 2006
Cited by 21 (2 self)
Sentence compression is the task of producing a summary at the sentence level. This paper focuses on three aspects of this task which have not received detailed treatment in the literature: training requirements, scalability, and automatic evaluation. We provide a novel comparison between a supervised constituent-based and a weakly supervised word-based compression algorithm and examine how these models port to different domains (written vs. spoken text). To achieve this, a human-authored compression corpus has been created and our study highlights potential problems with the automatically gathered compression corpora currently used. Finally, we assess whether automatic evaluation measures can be used to determine compression quality.

