The Design Principles of a Weighted Finite-State Transducer Library
- Theoretical Computer Science
, 2000
Cited by 82 (19 self)
We describe the algorithmic and software design principles of an object-oriented library for weighted finite-state transducers. By taking advantage of the theory of rational power series, we were able to achieve high degrees of generality, modularity and irredundancy, while attaining competitive efficiency in demanding speech processing applications involving weighted automata of more than 10^7 states and transitions. Besides its mathematical foundation, the design also draws from important ideas in algorithm design and programming languages: dynamic programming and shortest-paths algorithms over general semirings, object-oriented programming, lazy evaluation and memoization.
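The idea of running one shortest-paths routine over "general semirings" can be illustrated with a minimal sketch. It is not the library's API: the names `Semiring`, `TROPICAL`, `REAL`, and `shortest_distance` are hypothetical, and the sketch assumes an acyclic automaton whose states are already numbered in topological order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the two operations and two identities."""
    plus: Callable[[float, float], float]   # combines alternative paths
    times: Callable[[float, float], float]  # extends a path by one arc
    zero: float                             # identity for plus
    one: float                              # identity for times


# Tropical semiring (min, +): distances are lowest path costs.
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)
# Real semiring (+, *): distances are summed path probabilities.
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)


def shortest_distance(
    arcs: Dict[int, List[Tuple[int, float]]],
    num_states: int,
    start: int,
    sr: Semiring,
) -> List[float]:
    """Single-source distances, assuming states are in topological order."""
    dist = [sr.zero] * num_states
    dist[start] = sr.one
    for state in range(num_states):
        for next_state, weight in arcs.get(state, []):
            # Relaxation written only in terms of the semiring operations.
            extended = sr.times(dist[state], weight)
            dist[next_state] = sr.plus(dist[next_state], extended)
    return dist


costs = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 2.0)]}
print(shortest_distance(costs, 3, 0, TROPICAL))  # [0.0, 1.0, 3.0]

probs = {0: [(1, 0.5), (2, 0.5)], 1: [(2, 1.0)]}
print(shortest_distance(probs, 3, 0, REAL))      # [1.0, 0.5, 1.0]
```

The same loop yields lowest path costs or total path probabilities depending only on which semiring is passed in, which is the kind of generality the abstract refers to.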
A Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation
, 2003
Cited by 26 (3 self)
We present a derivation of the alignment template model for statistical machine translation and an implementation of the model using weighted finite state transducers. The approach we describe allows us to implement each constituent distribution of the model as a weighted finite state transducer or acceptor. We show that bitext word alignment and translation under the model can be performed with standard FSM operations involving these transducers.
Segmental minimum Bayes-risk decoding for automatic speech recognition
- IEEE Transactions on Speech and Audio Processing
, 2003
Cited by 20 (6 self)
Minimum Bayes-Risk (MBR) speech recognizers have been shown to yield improvements over the conventional maximum a-posteriori probability (MAP) decoders through N-best list rescoring and search over word lattices. We present a Segmental Minimum Bayes-Risk decoding (SMBR) framework that simplifies the implementation of MBR recognizers through the segmentation of the N-best lists or lattices over which the recognition is to be performed. This paper presents lattice cutting procedures that underlie SMBR decoding. Two of these procedures are based on a risk minimization criterion while a third one is guided by word-level confidence scores. In conjunction with SMBR decoding, these lattice segmentation procedures give consistent improvements in recognition word error rate (WER) on the Switchboard corpus. We also discuss an application of risk-based lattice cutting to multiple-system SMBR decoding and show that it is related to other system combination techniques such as ROVER. This strategy combines lattices produced from multiple ASR systems and is found to give WER improvements in a Switchboard evaluation system. Index Terms: ASR system combination, extended-ROVER, lattice cutting, minimum Bayes-risk decoding, segmental minimum
Lattice Segmentation and Minimum Bayes Risk Discriminative Training for Large . . .
- In Proc. Eurospeech
, 2005
Cited by 11 (4 self)
Lattice segmentation techniques developed for Minimum Bayes Risk decoding in large vocabulary speech recognition tasks are used to compute the statistics for discriminative training algorithms that estimate HMM parameters so as to reduce the overall risk over the training data. New estimation procedures are developed and evaluated for small vocabulary and large vocabulary recognition tasks, and additive performance improvements are shown relative to maximum mutual information estimation. These relative gains are explained through a detailed analysis of individual word recognition errors.
Statistical Machine Translation Using Coercive Two-Level Syntactic Transduction
- In Proceedings of EMNLP
, 2003
Cited by 10 (0 self)
We define, implement and evaluate a novel model for statistical machine translation, which is based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. It is able to model long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, suggesting and exploring a concept of translation coercion across parts of speech, as well as a transfer model based on lemma-to-lemma translation probabilities, which holds promise for improving machine translation of low-density languages. Experiments are performed in both Arabic-to-English and French-to-English translation demonstrating the efficacy of the proposed techniques. Performance is automatically evaluated via the Bleu score metric.
Risk Based Lattice Cutting For Segmental Minimum Bayes-Risk Decoding
- In ICSLP
, 2002
Cited by 7 (5 self)
Minimum Bayes-Risk (MBR) speech recognizers have been shown to give improvements over the conventional maximum a-posteriori probability (MAP) decoders through N-best list rescoring and search over word lattices. Segmental MBR (SMBR) decoders simplify the implementation of MBR recognizers by segmenting the N-best lists or lattices over which the recognition is performed. We present a lattice cutting procedure that attempts to minimize the total Bayes-Risk of all word strings in the segmented lattice. We provide experimental results on the Switchboard conversational speech corpus showing that this segmentation procedure, in conjunction with SMBR decoding, gives modest but significant improvements over MAP decoders as well as MBR decoders on unsegmented lattices.
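The contrast between MAP and MBR decoding over an N-best list can be sketched as follows. This is an illustrative toy, not the authors' implementation: it assumes the loss is word-level edit distance, that the N-best posteriors are already normalized, and the function names and example sentences are hypothetical. Lattice cutting and segmentation are not shown.

```python
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def map_decode(nbest: List[Tuple[str, float]]) -> str:
    """MAP: return the single most probable hypothesis."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_decode(nbest: List[Tuple[str, float]]) -> str:
    """MBR: return the hypothesis with the lowest expected loss."""
    def expected_loss(hyp: str) -> float:
        return sum(p * edit_distance(hyp.split(), other.split())
                   for other, p in nbest)
    return min((hyp for hyp, _ in nbest), key=expected_loss)


# Hypothetical 3-best list with posterior probabilities.
nbest = [("the cat sat", 0.4), ("the hat sad", 0.3), ("the hat sat", 0.3)]
print(map_decode(nbest))  # the cat sat
print(mbr_decode(nbest))  # the hat sat
```

In this toy list the MAP choice has the highest posterior, but "the hat sat" is closer on average to the other hypotheses (expected loss 0.7 versus 0.9), so the two decoders disagree.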
The Johns Hopkins University 2003 Chinese-English machine translation system
- In Proceedings of the MT Summit IX
, 2003
Cited by 5 (1 self)
We describe a Chinese-to-English machine translation system developed at the Johns Hopkins University for the NIST 2003 MT evaluation. The system is based on a Weighted Finite State Transducer implementation of the alignment template translation model for statistical machine translation. The baseline MT system was trained using 100,000 sentence pairs selected from a static bitext training collection. Information retrieval techniques were then used to create specific training collections for each document to be translated. This document-specific training set included bitext and named entities that were then added to the baseline system by augmenting the library of alignment templates. We report translation performance of baseline and IR-based systems on two NIST MT evaluation test sets.
Models for Inuktitut-English Word Alignment
, 2005
Cited by 3 (0 self)
This paper presents a set of techniques for bitext word alignment, optimized for a language pair with the characteristics of Inuktitut-English. The resulting systems exploit cross-lingual affinities at the sublexical level of syllables and substrings, as well as regular patterns of transliteration and the tendency towards monotonicity of alignment. Our most successful systems were based on classifier combination, and we found different combination methods performed best under the target evaluation metrics of F-measure and alignment error rate.
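The two evaluation metrics named above can differ in which system they favor. A minimal sketch, assuming the commonly used definitions over "sure" and "possible" gold links (precision against possible links, recall against sure links); the abstract does not spell these out, and the function name and example links are hypothetical:

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def alignment_scores(predicted: Set[Link],
                     sure: Set[Link],
                     possible: Set[Link]) -> Dict[str, float]:
    """Precision, recall, F-measure and AER; assumes non-empty inputs."""
    possible = possible | sure            # sure links are also possible
    hits_sure = len(predicted & sure)
    hits_possible = len(predicted & possible)
    precision = hits_possible / len(predicted)
    recall = hits_sure / len(sure)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    aer = 1.0 - (hits_sure + hits_possible) / (len(predicted) + len(sure))
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "aer": aer}


sure = {(0, 0), (1, 1)}
possible = {(2, 1)}
predicted = {(0, 0), (2, 1), (2, 2)}
print(alignment_scores(predicted, sure, possible))
```

For this example, precision is 2/3, recall is 1/2, F-measure is 4/7 and AER is 0.4. Because AER credits a possible link without requiring it for recall, a combination method tuned for one metric need not be the best under the other.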
Minimum Bayes-Risk Word Alignments of Bilingual Texts
- In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP-02)
, 2002
Cited by 3 (1 self)
We present Minimum Bayes-Risk word alignment for machine translation. This statistical, model-based approach attempts to minimize the expected risk of alignment errors under loss functions that measure alignment quality. We describe various loss functions, including some that incorporate linguistic analysis as can be obtained from parse trees, and show that these approaches can improve alignments of the English-French Hansards.
Juicer: A Weighted Finite-State Transducer speech decoder
Cited by 3 (2 self)
A major component in the development of any speech recognition system is the decoder. As task complexities and, consequently, system complexities have continued to increase, the decoding problem has become an increasingly significant component in the overall speech recognition system development effort, with efficient decoder design contributing to significantly improving the trade-off between decoding time and search errors. In this paper we present the "Juicer" (from transducer) large vocabulary continuous speech recognition (LVCSR) decoder based on weighted finite-state transducers (WFSTs). We begin with a discussion of the need for open source, state-of-the-art decoding software in LVCSR research and how this led to the development of Juicer, followed by a brief overview of decoding techniques and major issues in decoder design. We present Juicer and its major features, emphasising its potential not only as a critical component in the development of LVCSR systems, but also as an important research tool in itself, being based around the flexible WFST paradigm. We also provide results of benchmarking tests that have been carried out to date, demonstrating that in many respects Juicer, while still in its early development, is already achieving state-of-the-art performance. These benchmarking tests not only demonstrate the utility of Juicer in its present state, but are also being used to guide future development; hence, we conclude with a brief discussion of some of the extensions that are currently under way or being considered for Juicer.

