## An Iterative, DP-based Search Algorithm for Statistical Machine Translation (1998)

Venue: | In Proceedings of the International Conference on Spoken Language Processing (ICSLP’98) |

Citations: | 14 - 4 self |

### BibTeX

@INPROCEEDINGS{García-varea98aniterative,

author = {Ismael García-Varea and Francisco Casacuberta and Hermann Ney},

title = {An Iterative, DP-based Search Algorithm for Statistical Machine Translation},

booktitle = {In Proceedings of the International Conference on Spoken Language Processing (ICSLP’98)},

year = {1998},

pages = {1135--1138}

}

### Abstract

The increasing interest in the statistical approach to Machine Translation is due to the development of effective algorithms for training the probabilistic models proposed so far. However, one of the open problems with Statistical Machine Translation is the design of efficient algorithms for translating a given input string. For some interesting models, only (good) approximate solutions can be found. Recently, a Dynamic Programming-like algorithm has been introduced which computes approximate solutions for some models. These solutions can be improved by using an iterative algorithm that refines the successive solutions and uses a smoothing technique for some probabilistic distributions of the models, based on an interpolation of different distributions. The technique resulting from this combination has been tested on the “Tourist Task” corpus, which was generated in a semi-automated way. The best results achieved were a word-error rate of 9.3% and a sentence-error rate of 44.4%.
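The abstract does not say which distributions are interpolated or how the interpolation weights are chosen. As a minimal sketch of the general idea only, the Python below linearly interpolates a sparse, detailed estimate with a coarser one so that unseen events receive non-zero probability. The function name `interpolate`, the toy distributions, and the weights are all hypothetical and are not taken from the paper.

```python
def interpolate(distributions, weights):
    """Linearly interpolate several distributions over the same events.

    `distributions` is a list of dicts mapping event -> probability.
    `weights` is a list of non-negative weights that must sum to 1.
    """
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    # Collect every event mentioned by any of the distributions.
    events = set().union(*distributions)
    return {
        e: sum(w * d.get(e, 0.0) for w, d in zip(weights, distributions))
        for e in events
    }


# Hypothetical toy data: a detailed estimate with an unseen event "c",
# and a uniform distribution used as the coarser estimate.
detailed = {"a": 0.75, "b": 0.25, "c": 0.0}
uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

smoothed = interpolate([detailed, uniform], [0.9, 0.1])
```

Because the weights sum to 1 and each input sums to 1, the result is again a probability distribution, and the event unseen in the detailed estimate now has a small positive probability.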

### Citations

1327 | The mathematics of statistical machine translation: Parameter estimation - Brown, Pietra, et al. - 1993 |

658 | A statistical approach to machine translation - Brown, Cocke, et al. - 1990 |

Citation Context: ...re a word-error rate of 9.3% and a sentence-error rate of 44.4%. 1. INTRODUCTION The statistical approach is an adequate framework for introducing automatic learning techniques in Machine Translation [3, 14, 5, 15]. Under this framework, given an input string s from $S^*$ (S is a finite input alphabet and $S^*$ is the set of finite length strings over S), the probabilistic translation of s is an output string, ...

362 | Interpolated Estimation of Markov Source Parameters from Sparse Data - Jelinek, Mercer - 1980 |

Citation Context: ...atively small amount of training data. This is a typical problem in language modeling, above all when they are modeled with n-grams. To solve this problem, a lot of well-known techniques was proposed [7, 8, 9]. One of these techniques has been used for smoothing the distribution of alignment probabilities shown in equation (5). The training data for these distributions are quite sparse d...

236 | HMM-based word alignment in statistical translation - Vogel, Ney, et al. - 1996 |

83 | A Word-to-Word Model of Translational Equivalence - Melamed - 1997 |

72 | Finite-state speech-to-speech translation - Vidal - 1997 |

Citation Context: ... of each iteration is $O(|s| \cdot I_{\max} \cdot n_I \cdot \ldots)$, where $I_{\max}$ is the maximum output length allowed and $n_I$ is the number of output lengths tested. 5. EXPERIMENTS AND RESULTS We selected the “Traveller Task” [13] to experiment with the search algorithm proposed here. The general domain of the task was a visit by a tourist to a foreign country. This domain included a great variety of different scenarios, from ...

59 | Accelerated DP Based Search for Statistical Translation - Tillmann, Vogel, et al. - 1997 |

Citation Context: ...using on the well-formed output strings. Interesting Translation Models were proposed in [4] and in [14]. With the model proposed in [14], a Dynamic Programming algorithm can be designed to solve (2) [10, 11]. However, the corresponding algorithms for the models 1 to 5 in [4] are based on a certain type of the $A^*$ algorithm [3, 15]. The computational cost of this type of algorithms depends on the heuristi...

49 | A DP-based search using monotone alignments in statistical translation - Tillmann, Vogel, et al. - 1997 |

44 | Decoding algorithm in statistical machine translation - Wang, Waibel - 1997 |

37 | Estimation of probabilities in the language model of the IBM speech recognition system - Nadas - 1984 |

13 | A fast algorithm for deleted interpolation - Bahl, Brown, et al. - 1991 |

Citation Context: ...sed in [6]. One of the problems of inferring statistical distribution from finite data is the problem of “unseen” events. So far, different techniques have been proposed for dealing with this problem [2] in language and in acoustic modeling. We have chosen an interpolation of a probabilistic distribution with different degrees of precision. 2. A STATISTICAL MODEL FOR MACHINE TRANSLATION The Translati...

13 | Learning language models through the ECGI method. Speech Communication - Prieto, Vidal - 1992 |

Citation Context: ...r names, dates, hours and numbers. In both cases, the algorithm was proven using the smoothed translation model. The output language model was a Stochastic Regular Grammar built by the ECGI algorithm [12]. The output test-set perplexity of the inferred ECGI grammar was 3.53. We tested the number of iterations for the proposed algorithm. There was no improvement in the word error rate when the number o...

7 | Definition of a machine translation task and generation of corpora - Amengual, Benedí, et al. - 1996 |

3 | A Search Procedure for Statistical Translation - García-Varea, Casacuberta |

Citation Context: ...1 to 5 in [4] are based on a certain type of the $A^*$ algorithm [3, 15]. The computational cost of this type of algorithms depends on the heuristics introduced. To overcome this problem we proposed in [6] a linear time approach on the total size of training data, that was based on a single Dynamic Programming-like algorithm which computes approximate solutions when the known IBM-Model2 from [4] is use...

1 | Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer - Katz - 1987 |