## Machine Translation with Inferred Stochastic Finite-State Transducers (2004)

Venue: Computational Linguistics

Citations: 58 (14 self)

### BibTeX

@ARTICLE{Casacuberta04machinetranslation,

author = {Francisco Casacuberta and Enrique Vidal},

title = {Machine Translation with Inferred Stochastic Finite-State Transducers},

journal = {COMPUTATIONAL LINGUISTICS},

year = {2004},

volume = {30},

pages = {205--225}

}

### Abstract

Finite-state transducers are models that are being used in different areas of pattern recognition and computational linguistics. One of these areas is machine translation, in which the approaches that are based on building models automatically from training examples are becoming more and more attractive. Finite-state transducers are very adequate for use in constrained tasks in which training samples of pairs of sentences are available. A technique for inferring finite-state transducers is proposed in this article. This technique is based on formal relations between finite-state transducers and rational grammars. Given a training corpus of source-target pairs of sentences, the proposed approach uses statistical alignment methods to produce a set of conventional strings from which a stochastic rational grammar (e.g., an n-gram) is inferred. This grammar is finally converted into a finite-state transducer. The proposed methods are assessed through a series of machine translation experiments within the framework of the EuTrans project.
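The pipeline described in the abstract (aligned sentence pairs, then conventional strings, then an n-gram, then a transducer) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sentence pairs, the alignment lists, the function names `extended_symbols` and `bigram_transducer`, and the rule that attaches a target word to a later source position when order would otherwise be violated are all assumptions made for illustration. The bigram is unsmoothed, and end-of-sentence is modelled as one extra symbol for simplicity.

```python
from collections import Counter, defaultdict

END = ("</s>", "")  # hypothetical end-of-sentence symbol


def extended_symbols(source, target, alignment):
    """Turn one aligned sentence pair into a string of (source word, target words) symbols.

    alignment[j] is the 1-based source position aligned to target word j.
    A target word is attached to its aligned source position, or to a later
    one if that is needed to keep the target words in their original order.
    """
    outputs = [[] for _ in source]
    last = 1
    for word, pos in zip(target, alignment):
        last = max(last, pos)  # never move back to an earlier source word
        outputs[last - 1].append(word)
    return [(s, " ".join(out)) for s, out in zip(source, outputs)]


def bigram_transducer(corpus):
    """Estimate an unsmoothed bigram over extended symbols and read it as a transducer.

    Returns a dict mapping (state, input word, output string, next state)
    to a relative-frequency probability. States are bigram histories.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        history = "<s>"
        for symbol in sentence + [END]:
            counts[history][symbol] += 1
            history = symbol
    edges = {}
    for history, followers in counts.items():
        total = sum(followers.values())
        for (src, out), c in followers.items():
            edges[(history, src, out, (src, out))] = c / total
    return edges


# Toy training data (illustrative only, not taken from the EuTrans corpus).
corpus = [
    extended_symbols(["una", "habitación", "doble"], ["a", "double", "room"], [1, 3, 2]),
    extended_symbols(["una", "habitación"], ["a", "room"], [1, 2]),
]
edges = bigram_transducer(corpus)
```

In this toy run the first pair becomes `("una", "a")`, `("habitación", "")`, `("doble", "double room")`: the word "room" is delayed until "doble" so that the output order is preserved. Each transducer edge then reads one source word and emits the attached target string. A practical system would add smoothing for unseen n-grams, which this sketch leaves out.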

### Citations

1477 | Bleu: a method for automatic evaluation of machine translation
- Papineni, Roukos, et al.
- 2001
Citation Context ...f a direct comparison between the hypothesized and reference word strings as a whole. The BLEU metric is based on the n-grams of the hypothesized translation that occur in the reference translations (Papineni et al. 2001). The BLEU metric ranges from 0.0 (worst score) to 1.0 (best score). 5.1 The Spanish-English Translation Tasks A Spanish-English corpus was semi-automatically generated in the first phase of the EuTr... |

451 | Improved statistical alignment models
- Och, Ney
- 2000
Citation Context ...wn as models 1 through 5. Adequate software packages are publicly available for training these statistical models and for obtaining good alignments between pairs of sentences (Al-Onaizan et al. 1999; Och and Ney 2000). An example of Spanish-English sentence alignment is given below: Example 1 ¿Cuánto cuesta una habitación individual por semana ? how (2) much (2) does (3) a (4) single (6) room (5) cost (3) per (7)... |

303 | Finite-state transducers in language and speech processing
- Mohri
- 1997
Citation Context ...the EuTrans project. 1. Introduction Formal transducers give rise to an important framework in syntactic-pattern recognition (Fu 1982; Vidal, Casacuberta, and García 1995) and in language processing (Mohri 1997). Many tasks in automatic speech recognition can be viewed as simple translations from acoustic sequences to sublexical or lexical sequences (acoustic-phonetic decoding) or from acoustic or lexical s... |

204 | Syntactic Pattern Recognition and Applications
- Fu
- 1982
Citation Context ...rough a series of machine translation experiments within the framework of the EuTrans project. 1. Introduction Formal transducers give rise to an important framework in syntactic-pattern recognition (Fu 1982; Vidal, Casacuberta, and García 1995) and in language processing (Mohri 1997). Many tasks in automatic speech recognition can be viewed as simple translations from acoustic sequences to sublexical or... |

99 | Learning subsequential transducers for pattern recognition interpretation tasks - Oncina, García, et al. - 1993 |

63 | The CMU Statistical Language Modeling Toolkit, and its use
- Rosenfeld
- 1995
Citation Context ...−1) where c(·) is the number of times that an event occurs in the training set. To deal with unseen n-grams, the back-off smoothing technique from the CMU Statistical Language Modeling (SLM) Toolkit (Rosenfeld 1995) has been used. The (smoothed) n-gram model obtained from the set of extended symbols is represented as a stochastic finite-state automaton (Llorens, Vilar, and Casacuberta 2002). The states of the a... |

63 | Finite-state speech-to-speech translation
- Vidal
- 1997
Citation Context ...quences (acoustic-phonetic decoding) or from acoustic or lexical sequences to query strings (for database access) or (robot control) commands (semantic decoding) (Vidal, Casacuberta, and García 1995; Vidal 1997; Bangalore and Ricardi 2000a, 2000b; Hazen, Hetherington, and Park 2001; Mou, Seneff, and Zue 2001; Segarra et al. 2001; Seward 2001). Another similar application is the recognition of continuous han... |

49 | Statistical Language Modeling Using Leaving-One-Out - Ney, Martin, et al. - 1997 |

48 | The mathematics of statistical machine translation: Parameter estimation - Mercer - 1993 |

28 | Transductions and Context-Free
- Berstel
- 1979
Citation Context ...nslation, in which input and output can be text, speech, (continuous) handwritten text, etc. (Mohri 1997; Vidal 1997; Bangalore and Ricardi 2000b, 2001; Amengual et al. 2000). Rational transductions (Berstel 1979) constitute an important class within the field of formal translation. These transductions are realized by the so-called finite-state transducers. Even though other, more powerful transduction models... |

22 | The EUTRANS-I Speech Translation System
- Amengual, Benedí, et al.
- 2000
Citation Context ...plication of formal transducers is language translation, in which input and output can be text, speech, (continuous) handwritten text, etc. (Mohri 1997; Vidal 1997; Bangalore and Ricardi 2000b, 2001; Amengual et al. 2000). Rational transductions (Berstel 1979) constitute an important class within the field of formal translation. These transductions are realized by the so-called finite-state transducers. Even though o... |

19 | Computational complexity of problems on probabilistic grammars and transducers - Casacuberta, Higuera, et al. - 2000 |

15 | Speech-to-speech translation based on finite-state transducers
- Casacuberta, Llorens, et al.
- 2001
Citation Context ... 1997). The domain of the corpus involved typical human-to-human communication situations at a reception desk of a hotel. A summary of this corpus (EuTrans-0) is given in Table 1 (Amengual et al. 2000; Casacuberta et al. 2001). From this (large) corpus, a small subset of ten thousand training sentence pairs (EuTrans-I) was randomly selected in order to approach more realistic training conditions (see also Table 1). From t... |

11 | Maximum mutual information and conditional maximum likelihood estimations of stochastic syntax-directed translation schemes
- Casacuberta
- 1996
Citation Context ...$(s, t) = \max_{\phi \in d(s,t)} P_{TP}(\phi)$ (6). An approximate translation can now be computed as $\tilde{t} = \operatorname{argmax}_{t \in \Delta^\star} V_{TP}(s, t) = \operatorname{argmax}_{t \in \Delta^\star} \max_{\phi \in d(s,t)} P_{TP}(\phi)$ (7). This computation can be carried out efficiently (Casacuberta 1996) by solving the following recurrence by means of dynamic programming: $\max_{t \in \Delta^\star} V_{TP}(s, t) = \max_{q \in Q} V(|s|, q) \cdot f(q)$ (8), $V(i, q) = \max_{q' \in Q,\, w \in \Delta^\star} V(i-1, q') \cdot p(q', s_i, w, q)$ if $i \neq$... |

10 | Inference of finite-state transducers from regular languages - Casacuberta, Vidal, et al. - 2005 |

9 | FST-based recognition techniques for multi-lingual and multi-domain spontaneous speech - Hazen - 2001 |

9 | Grammatical Inference and Automatic Speech Recognition - Vidal, Casacuberta, et al. - 1994 |

8 | Extracting semantic information through automatic learning techniques
- Segarra, Sanchis, et al.
- 2002
Citation Context ...s) or (robot control) commands (semantic decoding) (Vidal, Casacuberta, and García 1995; Vidal 1997; Bangalore and Ricardi 2000a, 2000b; Hazen, Hetherington, and Park 2001; Mou, Seneff, and Zue 2001; Segarra et al. 2001; Seward 2001). Another similar application is the recognition of continuous hand-written characters (González et al. 2000). Yet a more complex application of formal transducers is language translatio... |

5 | Context-dependent Probabilistic Hierarchical Sub-lexical Modelling Using Finite State Transducers - Mou, Seneff, et al. - 2001 |

4 | Finite state language models smoothed using n-grams - Llorens, Vilar, et al. - 2002 |

4 | Inferring Finite Transducers
- Mäkinen
- 1999
Citation Context ...ológico de Informática, 46071 Valencia, Spain. E-mail: {fcn, evidal}@iti.upv.es. © 2004 Association for Computational Linguistics. Computational Linguistics, Volume 30, Number 2. García, and Vidal 1993; Mäkinen 1999; Knight and Al-Onaizan 1998; Bangalore and Ricardi 2000b; Casacuberta 2000; Vilar 2000). Nevertheless, there are many techniques for inferring regular grammars from finite sets of learning strings whi... |

4 | Inductive Learning of Finite-State Transducers for the Interpretation of Unidimensional Objects - Vidal, García - 1990 |

3 | Off-line recognition of syntax-constrained cursive handwritten text
- González, Salvador, et al.
- 2000
Citation Context ...i 2000a, 2000b; Hazen, Hetherington, and Park 2001; Mou, Seneff, and Zue 2001; Segarra et al. 2001; Seward 2001). Another similar application is the recognition of continuous hand-written characters (González et al. 2000). Yet a more complex application of formal transducers is language translation, in which input and output can be text, speech, (continuous) handwritten text, etc. (Mohri 1997; Vidal 1997; Bangalore a... |

1 | Finite-state models for lexical reordering in spoken language translation - Bangalore, Ricardi - 2000a |

1 | Stochastic finite-state models for spoken language machine translation - Bangalore, Ricardi - 2000b |

1 | Computational Linguistics Volume 30, Number 2 - García, Vidal, et al. - 1987 |

1 | Statistical modeling techniques and results and search techniques and results - Aachen, ITI - 1999 |

1 | Transducer optimizations for tight-coupled decoding
- Seward
- 2001
Citation Context ... commands (semantic decoding) (Vidal, Casacuberta, and García 1995; Vidal 1997; Bangalore and Ricardi 2000a, 2000b; Hazen, Hetherington, and Park 2001; Mou, Seneff, and Zue 2001; Segarra et al. 2001; Seward 2001). Another similar application is the recognition of continuous hand-written characters (González et al. 2000). Yet a more complex application of formal transducers is language translation, in which i... |
