Towards language independent acoustic modeling
- In Proc. ICASSP, 2000
Cited by 22 (8 self)
We describe procedures and experimental results using speech from diverse source languages to build an ASR system for a single target language. This work is intended to improve ASR in languages for which large amounts of training data are not available. We have developed both knowledge-based and automatic methods to map phonetic units from the source languages to the target language. We employed HMM adaptation techniques and Discriminative Model Combination to combine acoustic models from the individual source languages for recognition of speech in the target language. Experiments are described in which Czech Broadcast News is transcribed using acoustic models trained from small amounts of Czech read speech augmented by English, Spanish, Russian, and Mandarin acoustic models.
Applying Semantic Classes in Event Detection and Tracking
- In: Proc. International Conference on Natural Language Processing (ICON'02), 2002
Cited by 12 (2 self)
Event detection and tracking is a somewhat recent area of information retrieval research. Detection is about spotting new, previously unreported real-life events in an online news feed, while tracking assigns documents to previously spotted events. We propose a new vector model consisting of four semantic classes from the documents: locations, proper names, temporal expressions and normal terms, which are stored in designated subvectors. We also propose a new similarity measure based on utilizing semantic classes. Moreover, due to the vagueness of the concept of event, we run our experiments with several different definitions.
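The abstract does not spell out the similarity measure, so the following is only a minimal sketch of how a document split into four class subvectors could be compared. The weighted sum of per-class cosine similarities, the class keys, and the names `cosine` and `class_similarity` are assumptions made for illustration, not the authors' method.

```python
import math
from typing import Dict

# Hypothetical keys for the four semantic classes named in the abstract.
CLASSES = ("locations", "names", "temporals", "terms")

# A document is a mapping: class name -> {term: weight} subvector.
Doc = Dict[str, Dict[str, float]]


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def class_similarity(a: Doc, b: Doc, weights: Dict[str, float]) -> float:
    """Assumed measure: weighted average of the per-class cosine scores."""
    total = sum(weights[c] for c in CLASSES)
    score = sum(weights[c] * cosine(a.get(c, {}), b.get(c, {})) for c in CLASSES)
    return score / total
```

Keeping the classes in separate subvectors lets a match on a location or a proper name be weighted differently from a match on an ordinary term, which a single bag-of-words vector cannot express.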
Phonetic-Distance-Based Hypothesis Driven Lexical Adaptation for Transcribing Multilingual Broadcast News
- In ICSLP’98
Cited by 8 (3 self)
High out-of-vocabulary (OOV) rates are one of the most prevalent problems for languages with rapid vocabulary growth due to a large number of inflections. In particular, when transcribing Serbo-Croatian and German broadcast news, the OOV rate is between 8.7% and 4.5%. Hypothesis Driven Lexical Adaptation (HDLA) has already been shown to decrease high OOV rates significantly by using morphology-based linguistic knowledge. This paper introduces another approach to dynamically adapt a recognition lexicon to the utterance to be recognized. Instead of morphological knowledge about word stems and inflection endings, distance measures based on Levenshtein distance are used. Results based on phoneme and grapheme distances are presented. Compared to the use of morphological knowledge, our distance-based approach offers the distinct advantage that no expert knowledge about a specific language is required and no definition of complex grammar rules is necessary. Instead, grapheme sequences or the phoneme representations of words are sufficient to apply our HDLA algorithm easily to any new language. With our proposed technique we were able to decrease OOV rates by more than half, from 8.7% to 4%, thereby also improving recognition performance by an absolute 4.1%, from 29.5% to 25.4% word error rate.
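As a rough illustration of the distance-based idea, the sketch below computes a unit-cost edit distance over grapheme or phoneme sequences and picks the vocabulary entries closest to a hypothesized word. The helper `nearest_entries`, the choice of unit costs, and the tie-breaking rule are assumptions for illustration; the abstract does not describe how HDLA selects or ranks candidates.

```python
from typing import List, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance with unit costs for insertion, deletion, substitution.

    Works on strings (grapheme distance) and on lists of phoneme symbols
    (phoneme distance) alike, since both are sequences of symbols.
    """
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_entries(hyp_word: str, vocabulary: List[str], k: int) -> List[str]:
    """Hypothetical helper: the k vocabulary words closest to hyp_word.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(vocabulary, key=lambda w: (levenshtein(hyp_word, w), w))
    return ranked[:k]
```

For example, `nearest_entries("haus", ["baum", "hause", "hauses", "maus"], 2)` would return the two entries at distance 1, which could then be added to the recognition lexicon for a second pass.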
Multi-Lingual Informedia: A Demonstration Of Speech Recognition And Information Retrieval Across Multiple Languages
- BNTUW-98 Proc. of DARPA Workshop on Broadcast News Understanding Systems, 1998
Cited by 6 (2 self)
The Multilingual Informedia Project demonstrates a seamless extension of the Informedia approach to search and discovery across video documents in multiple languages. Previously, we successfully demonstrated that current speech recognizers allow accurate information retrieval for automatically processed English news TV broadcasts. The new system performs speech recognition on foreign language news broadcasts, segments them into stories and indexes the foreign data together with existing English news data. This first multi-lingual prototype could easily be extended to other languages.
1. INTRODUCTION
The Informedia [5] project's goal is to allow search and retrieval in the video medium, similar to what is available today for text only. To enable this access to video, speech recognition is used to provide a text transcript for the audio track, and image processing determines scene boundaries, recognizes faces and allows for image similarity comparisons. Everything is indexed into a searchable...
New Directions in Video Information Extraction and Summarization
- In Proceedings of the 10th DELOS Workshop, 1999
Cited by 6 (0 self)
The Informedia Digital Video Library project provided a technological foundation for full content indexing and retrieval of video and audio media. New directions for this research extend to: (1) search and retrieval in multilingual video corpora, (2) analysis and indexing of continuously captured, unstructured and unedited field-collected video, and (3) summarization of video-based content across multiple stories based on the user's perspective.
Informedia Digital Video Library Foundation Work
The Informedia Digital Video Library focused on the development and integration of technologies for information extraction from video and audio content to enable its full content search and retrieval. Over a terabyte (1600 hours, 4,000 segments) of online data was collected, with automatically generated metadata and indices for retrieving video segments from this library. Informedia successfully pioneered the automatic creation of multimedia abstractions, demonstrated empirical proofs of their rel...

