Web-Based Language Modelling for Automatic Lecture Transcription
Cited by 7 (4 self)
Universities have long relied on written text to share knowledge. As more lectures are made available online, they must be accompanied by textual transcripts in order to provide the same access to information as textbooks. While Automatic Speech Recognition (ASR) is a cost-effective method of delivering transcriptions, its accuracy for lectures is not yet satisfactory. One approach to improving lecture ASR is to build smaller, topic-dependent Language Models (LMs) and combine them (through LM interpolation or hypothesis space combination) with general-purpose, large-vocabulary LMs. In this paper, we propose a simple solution for lecture ASR that achieves Word Error Rate reductions (as well as topic-specific keyword identification accuracies) similar to or better than those of combination-based approaches. Our method eliminates the need for two types of LMs by exploiting the lecture slides to collect a web corpus appropriate for modelling both the conversational and the topic-specific styles of lectures.
Index Terms: speech recognition, language modelling, corpus building, topic dependent, lecture transcription.
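For readers unfamiliar with the LM interpolation mentioned in this abstract, the following is a minimal sketch of linear interpolation between a topic-dependent and a general-purpose model. It illustrates the combination-based baseline, not the method the paper proposes. The unigram dictionaries, the function names and the weight `lam` are assumptions made for illustration; real systems interpolate full n-gram models and tune the weight on held-out data.

```python
def interpolate(p_topic, p_general, lam):
    """Return lam * p_topic + (1 - lam) * p_general for one word probability."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_topic + (1.0 - lam) * p_general


def interpolate_unigrams(topic_lm, general_lm, lam):
    """Interpolate two unigram distributions given as word -> probability dicts.

    Words missing from one model contribute probability 0.0 from that model.
    """
    vocab = set(topic_lm) | set(general_lm)
    return {
        w: interpolate(topic_lm.get(w, 0.0), general_lm.get(w, 0.0), lam)
        for w in vocab
    }


# Toy distributions (hypothetical numbers, chosen only so each sums to 1).
topic = {"eigenvalue": 0.5, "the": 0.5}
general = {"the": 0.9, "cat": 0.1}
mixed = interpolate_unigrams(topic, general, lam=0.25)
```

Because both inputs are proper distributions and the weights sum to one, the mixed distribution also sums to one.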
RESAMPLING AUXILIARY DATA FOR LANGUAGE MODEL ADAPTATION IN MACHINE TRANSLATION FOR SPEECH
Performance of n-gram language models depends to a large extent on the amount of training text material available for building the models and the degree to which this text matches the domain of interest. The language modeling community is showing a growing interest in using large collections of auxiliary textual material to supplement sparse in-domain resources. One of the problems in using such auxiliary corpora is that they may differ significantly from the specific nature of the domain of interest. In this paper, we propose three different methods for adapting language models for a Speech-to-Speech (S2S) translation system when auxiliary corpora are of a different genre and domain. The proposed methods are based on centroid similarity, n-gram ratios and resampled language models. We show how these methods can be used to select out-of-domain textual data such as newswire text to improve an S2S system. We were able to achieve an overall relative improvement of 3.8% in BLEU score over a baseline system that uses only in-domain conversational data.
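The abstract names centroid similarity as one selection criterion but does not spell it out. The sketch below shows one generic way such a selection could work, assuming bag-of-words term-frequency vectors, cosine similarity and a fixed threshold. All function names, the example sentences and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased whitespace tokens as a term-frequency Counter."""
    return Counter(text.lower().split())


def centroid(docs):
    """Average term-frequency vector of a non-empty list of documents."""
    if not docs:
        raise ValueError("need at least one in-domain document")
    total = Counter()
    for d in docs:
        total.update(bow(d))
    return {w: c / len(docs) for w, c in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors (word -> weight)."""
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def select(in_domain, auxiliary, threshold):
    """Keep auxiliary sentences whose similarity to the in-domain centroid meets the threshold."""
    c = centroid(in_domain)
    return [s for s in auxiliary if cosine(bow(s), c) >= threshold]


in_domain = ["where is the hospital", "where is the checkpoint"]
auxiliary = ["where is the station", "stocks fell sharply today"]
kept = select(in_domain, auxiliary, threshold=0.3)
```

With these toy inputs, only the auxiliary sentence that shares vocabulary with the in-domain data passes the threshold.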
Focusing on Novelty: A Crawling Strategy to Build Diverse Language Models
Word prediction performed by language models plays an important role in many tasks, such as word sense disambiguation, speech recognition, handwriting recognition, query spelling and query segmentation. Recent research has exploited the textual content of the Web to create language models. In this paper, we propose a new focused crawling strategy for collecting Web pages that emphasizes novelty in order to create diverse language models. In each crawling cycle, the crawler tries to fill the gaps present in the current language model built from previous cycles by avoiding pages whose vocabulary is already well represented in the model. It relies on an information-theoretic measure to identify these gaps and then learns link patterns to pages in these regions in order to guide its visitation policy. To handle constantly evolving domains, a key feature of our crawling approach is its ability to adjust its focus as the crawl progresses. We evaluate our approach in two different scenarios in which our solution can be useful. First, we demonstrate that our approach produces more effective language models than the ones created by a baseline crawler in the context of a broadcast news speech recognition task. In fact, in some cases, our crawler was able to obtain results similar to the baseline by crawling only 12.5% of the pages collected by the latter. Second, since avoiding well-represented content in the news domain might lead to novelty, i.e. up-to-date pages, we show that our diversity-based strategy can also help guide the crawler toward the most recent news content. The results show that our approach was able to obtain on average 50% more up-to-date pages than the baseline crawler.
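The abstract refers to an information-theoretic measure for finding gaps in the current model without naming it. As a rough illustration only, the sketch below scores a page by its average negative log-probability (cross-entropy, in bits per token) under an add-one-smoothed unigram model of previously crawled text. The choice of measure, the function name `novelty` and the `vocab_size` value are assumptions, and the link-pattern learning step is not modelled.

```python
import math
from collections import Counter


def novelty(page_tokens, model_counts, vocab_size):
    """Average bits per token of the page under an add-one-smoothed unigram model.

    Higher values mean the page's vocabulary is less well represented in the model.
    """
    if not page_tokens:
        return 0.0
    total = sum(model_counts.values())
    bits = 0.0
    for tok in page_tokens:
        p = (model_counts.get(tok, 0) + 1) / (total + vocab_size)
        bits -= math.log2(p)
    return bits / len(page_tokens)


# Counts from text gathered in earlier crawl cycles (toy data).
model = Counter("the election results the vote".split())
familiar = novelty(["the", "election"], model, vocab_size=1000)
unseen = novelty(["quantum", "sensor"], model, vocab_size=1000)
```

A crawler following this idea would prefer pages with higher scores, since their vocabulary is poorly covered by the current model.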
SpeechForms: From Web to Speech and Back
This paper describes SpeechForms, a system that uses novel techniques to automatically identify form element semantics and form element content, and to semi-automatically generate language models that allow users to fill out each web form element by voice. Preliminary experimental results show that simple per-element language models are faster and may be more accurate than statistical n-gram language models trained on large amounts of web text data.
Index Terms: language modeling, form understanding, information retrieval

