Results 1 - 4 of 4
A generalized composition algorithm for weighted finite-state transducers
- in Proc. of Interspeech, 2009
Cited by 6 (4 self)
This paper describes a weighted finite-state transducer composition algorithm that generalizes the concept of the composition filter and presents filters that remove useless epsilon paths and push forward labels and weights along epsilon paths. This filtering permits the composition of large speech recognition context-dependent lexicons and language models much more efficiently in time and space than previously possible. We present experiments on Broadcast News and a spoken query task that demonstrate an overhead of approximately 5% to 10% for dynamic, runtime composition compared to a static, offline composition of the recognition transducer. To our knowledge, this is the first such system with so little overhead.
AUTOMATIC VS. MANUAL TOPIC SEGMENTATION AND INDEXATION IN BROADCAST NEWS
Cited by 1 (1 self)
This paper describes the latest progress in our work on Broadcast News for European Portuguese. The central modules of our media watch system, which matches the topic of each news story against the user preferences registered in the system, are audio pre-processing, speech recognition, and topic segmentation and indexation. The main focus of the paper is the impact of errors made by the earlier modules on the later ones. In our opinion, this impact is an essential diagnostic tool for improving the overall pipeline system.
Potential scope of a fully-integrated architecture for speech translation
The classical approach to speech translation places a text-to-text translation system after a speech recogniser, yielding the so-called decoupled architecture. Two issues should be borne in mind here: first, what is translated in the decoupled architecture is the most likely transcription of the spoken utterance; second, translation systems are sensitive to errors in the source string, and speech recognition systems are still far from flawless. In this paper we promote an architecture for speech translation that builds the most likely translation by relying on both acoustic and translation models in a cooperative manner, the so-called integrated architecture. The integrated architecture is implemented in the finite-state framework by composing finite-state acoustic models of the source language within a stochastic finite-state transducer that would encompass source and target languages. The potential performance of the integrated architecture is assessed quantitatively in relation to the decoupled one. We conclude that while the single-best approach shows similar performance for both the decoupled and integrated architectures, an oracle evaluation reveals that the potential scope of the integrated architecture would offer statistically significant differences.
© 2010 European Association for Machine Translation.
1 Statistical speech translation
The goal of statistical speech translation is to seek the most likely string in the target language, $\hat{t}$, given the acoustic representation of a speech signal in the source language, $x$:
\[ \hat{t} = \arg\max_{t} P(t \mid x) \qquad (1) \]
The source string, $s$, that is, the transcription of the speech utterance $x$, can be introduced as a hidden variable and the Bayes' decision rule applied (Ney,
Filters for Efficient Composition of Weighted
Abstract. This paper describes a weighted finite-state transducer composition algorithm that generalizes the concept of the composition filter and presents various filters that process epsilon transitions, look ahead along paths, and push forward labels along epsilon paths. These filters, either individually or in combination, make it possible to compose some transducers much more efficiently in time and space than otherwise possible. We present examples of this drawn, in part, from demanding speech-processing applications. The generalized composition algorithm and many of these filters have been included in OpenFst, an open-source weighted transducer library.
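For readers unfamiliar with the operation these abstracts refer to, the following is a minimal sketch of weighted transducer composition. It is not the generalized algorithm from the papers and not the OpenFst API. It assumes epsilon-free transducers and the tropical semiring (weights add along a path), so it needs no composition filter at all; handling epsilon transitions is what the filters described above address. The dictionary layout and the names `compose`, `A`, `B`, and `C` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def compose(a, b):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Each transducer is a dict with:
      'start':  start state id
      'finals': {state: final_weight}
      'arcs':   {state: [(in_label, out_label, weight, next_state), ...]}
    Result states are pairs (state_of_a, state_of_b).
    """
    start = (a['start'], b['start'])
    arcs = defaultdict(list)
    finals = {}
    queue = deque([start])
    seen = {start}
    while queue:
        qa, qb = queue.popleft()
        # A pair is final only if both components are final.
        if qa in a['finals'] and qb in b['finals']:
            finals[(qa, qb)] = a['finals'][qa] + b['finals'][qb]
        for in_label, mid_a, w1, next_a in a['arcs'].get(qa, []):
            for mid_b, out_label, w2, next_b in b['arcs'].get(qb, []):
                # Match the output label of a with the input label of b.
                if mid_a != mid_b:
                    continue
                nxt = (next_a, next_b)
                arcs[(qa, qb)].append((in_label, out_label, w1 + w2, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {'start': start, 'finals': finals, 'arcs': dict(arcs)}


# Toy example: A rewrites "a" to "x" with weight 1.0,
# B rewrites "x" to "p" with weight 2.0 and has final weight 0.5.
A = {'start': 0, 'finals': {1: 0.0}, 'arcs': {0: [('a', 'x', 1.0, 1)]}}
B = {'start': 0, 'finals': {1: 0.5}, 'arcs': {0: [('x', 'p', 2.0, 1)]}}
C = compose(A, B)
```

Only state pairs reachable from the start pair are built, which is also the basis of the dynamic, runtime composition mentioned in the first abstract. This sketch expands everything eagerly rather than on demand.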

