The LIMSI Broadcast News Transcription System
Speech Communication, 2002
Cited by 84 (5 self)
This paper reports on activities at LIMSI over the last few years directed at the transcription of broadcast news data. We describe our development work in moving from laboratory read speech data to real-world or 'found' speech data in preparation for the ARPA Nov96, Nov97 and Nov98 evaluations. Two main problems needed to be addressed to deal with the continuous flow of inhomogeneous data. These concern the varied acoustic nature of the signal (signal quality, environmental and transmission noise, music) and different linguistic styles (prepared and spontaneous speech on a wide range of topics, spoken by a large variety of speakers).
Improved Topic-Dependent Language Modeling Using Information Retrieval Techniques
In ICASSP, 1999
Cited by 18 (1 self)
N-gram language models are frequently used by speech recognition systems to constrain and guide the search. N-gram models use only the last N-1 words to predict the next word. Typical values of N range from 2 to 4. N-gram language models thus lack long-term context information. We show that the predictive power of N-gram language models can be improved by using long-term context information about the topic of discussion. We use information retrieval techniques to generalize the available context information for topic-dependent language modeling. We demonstrate the effectiveness of this technique by performing experiments on the Wall Street Journal text corpus, which is a relatively difficult task for topic-dependent language modeling since the text is relatively homogeneous. The proposed method can reduce the perplexity of the baseline language model by 37%, indicating the predictive power of the topic-dependent language model.
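As a rough illustration of the terms used in this abstract, the sketch below trains a bigram model (N = 2, so each word is predicted from the single preceding word) and computes its perplexity. It is not the paper's method: the toy corpus, the add-one smoothing, and all function names are assumptions chosen for brevity, and no topic adaptation is performed. The last helper only shows how a relative perplexity reduction such as the reported 37% is computed from two perplexity values.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences (toy model)."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        # "<s>" is only ever a history, never a predicted word.
        vocab.update(tokens[1:])
        for prev, cur in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts, vocab


def bigram_prob(prev, cur, history_counts, bigram_counts, vocab):
    """P(cur | prev) with add-one smoothing (an assumption of this sketch)."""
    return (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + len(vocab))


def perplexity(sentences, history_counts, bigram_counts, vocab):
    """exp of the average negative log-probability per predicted token."""
    log_prob = 0.0
    n_tokens = 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = bigram_prob(prev, cur, history_counts, bigram_counts, vocab)
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


def relative_reduction(baseline_ppl, new_ppl):
    """Fractional perplexity reduction relative to the baseline."""
    return (baseline_ppl - new_ppl) / baseline_ppl


# Hypothetical toy data, for illustration only.
train = [
    ["stocks", "rose", "on", "monday"],
    ["stocks", "fell", "on", "friday"],
    ["bonds", "rose", "on", "friday"],
]
test = [["bonds", "fell", "on", "monday"]]

counts = train_bigram(train)
print("bigram perplexity on toy test set:", perplexity(test, *counts))
```

Because the bigram history is a single word, nothing in this model can use the topic of the surrounding document, which is the limitation the abstract targets.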
Knowledge Management and Speech Recognition
Computer, April 2002
Cited by 2 (0 self)
This article is about how speech recognition technologies are related to knowledge management, and about the likely impact those technologies might have on KM. I will first describe what "knowledge management" means in this context, including listing several KM applications that might be impacted by speech recognition. I will argue that speech recognition will currently be most useful when the items being processed are not too short, and will highlight several of the open problems that remain, not the least of which is improving the quality of speech recognition for telephone conversations! I will conclude by briefly describing some KM-related technologies where speech recognition has been successful. The message I hope to convey is that speech recognition has good prospects in "KM as information technology," but that there is sufficient weakness in the recognition technology to view those prospects cautiously.

