Using Syntax to Improve Word Alignment Precision for Syntax-Based Machine Translation
Cited by 12 (0 self)
Word alignments that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntax-based machine translation. We present an algorithm for identifying and deleting incorrect word alignment links, using features of the extracted rules. We obtain gains in both alignment quality and translation quality in Chinese-English and Arabic-English translation experiments relative to a GIZA++ union baseline.
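The "GIZA++ union baseline" named in the abstract is commonly understood as the set union of two directional alignments (source-to-target and target-to-source). A minimal sketch of that symmetrization follows, assuming each alignment is a set of index pairs; the function name and the sample links are illustrative and are not taken from the paper.

```python
def union_alignment(src_to_tgt, tgt_to_src):
    """Symmetrize two directional alignments by set union.

    src_to_tgt holds (source_index, target_index) links.
    tgt_to_src holds (target_index, source_index) links and is flipped
    so that both sets use the same orientation before the union.
    """
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return set(src_to_tgt) | flipped


# Hypothetical links for a short sentence pair.
forward = {(0, 0), (1, 2)}
backward = {(0, 0), (1, 1), (2, 1)}  # (target, source) pairs
links = sorted(union_alignment(forward, backward))
print(links)
```

A union keeps every link proposed by either direction, so it tends to favor recall over precision, which is consistent with the abstract's focus on deleting incorrect links.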
Faster beam-search decoding for phrasal statistical machine translation
In Proceedings of MT Summit XI, 2007
Cited by 10 (4 self)
Pharaoh is a widely used state-of-the-art decoder for phrasal statistical machine translation. In this paper, we present two modifications to the algorithm used by Pharaoh that together permit much faster decoding without losing translation quality as measured by BLEU score. The first modification improves the estimated translation model score used by Pharaoh to evaluate partial hypotheses, by incorporating an estimate of the distortion penalty to be incurred in translating the rest of the sentence. The second modification uses early pruning of possible next-phrase translations to cut down the overall size of the search space. These modifications enable decoding speed-ups of an order of magnitude or more, with no reduction in the BLEU score of the resulting translations.
Discriminative Word Alignment via Alignment Matrix Modeling
Cited by 9 (6 self)
In this paper a new discriminative word alignment method is presented. This approach directly models the alignment matrix with a conditional random field (CRF), so no restrictions on the alignments have to be made. Furthermore, features are easy to add, so all available information can be used. Since the structure of the CRF can become complex, inference can only be done approximately, and the standard algorithms had to be adapted. In addition, different methods to train the model have been developed. Using this approach, the alignment quality could be improved by up to 23 percent for 3 different language pairs compared to a combination of both IBM4 alignments. Furthermore, the word alignment was used to generate new phrase tables, which improved the translation quality significantly.
Using Word Dependent Transition Models in HMM based Word Alignment for Statistical Machine Translation
Cited by 4 (1 self)
In this paper, we present a Bayesian learning-based method to train word-dependent transition models for HMM-based word alignment. We present word alignment results on the Canadian Hansards corpus as compared to the conventional HMM and IBM Model 4. We show that this method gives consistent and significant alignment error rate (AER) reduction. We also conducted machine translation (MT) experiments on the Europarl corpus. MT results show that word alignment based on this method can be used in a phrase-based machine translation system to yield up to 1% absolute improvement in BLEU score compared to a conventional HMM, and 0.8% compared to an IBM Model 4 based word alignment.
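The abstract reports AER without defining it. The sketch below assumes the widely used definition, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the hypothesized link set, S the sure gold links, and P the possible gold links (with S contained in P). The function name and the sample links are hypothetical.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Compute AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    hypothesis: links proposed by the aligner (A)
    sure:       gold links annotated as sure (S)
    possible:   gold links annotated as possible (P); sure links are
                added to it here so that S is a subset of P
    """
    a, s = set(hypothesis), set(sure)
    p = set(possible) | s
    if not a and not s:
        return 0.0  # nothing proposed and nothing expected
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical gold standard and aligner output.
sure = {(0, 0), (1, 1)}
possible = {(2, 1)}
hyp = {(0, 0), (2, 1), (2, 2)}
aer = alignment_error_rate(hyp, sure, possible)
print(round(aer, 3))
```

Lower values are better: a hypothesis that contains every sure link and nothing outside the possible set scores 0.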
Yawat: Yet Another Word Alignment Tool
Cited by 3 (0 self)
Yawat is a tool for the visualization and manipulation of word- and phrase-level alignments of parallel text. Unlike most other tools for manual word alignment, it relies on dynamic markup to visualize alignment relations, that is, markup is shown and hidden depending on the current mouse position. This reduces the visual complexity of the visualization and allows the annotator to focus on one item at a time. For a bird’s-eye view of alignment patterns within a sentence, the tool is also able to display alignments as alignment matrices. In addition, it allows for manual labeling of alignment relations with customizable tag sets. Different text colors are used to indicate which words in a given sentence pair have already been aligned, and which ones still need to be aligned. Tag sets and color schemes can easily be adapted to the needs of specific annotation projects through configuration files. The tool is implemented in JavaScript and designed to run as a web application.
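To illustrate what an alignment matrix view conveys, here is a small text-only sketch. It is not Yawat's code (the abstract says the tool is written in JavaScript and runs as a web application); the function name, the sample sentence pair, and the `x`/`.` rendering are assumptions made for illustration.

```python
def render_alignment_matrix(src_words, tgt_words, links):
    """Return a text grid with one row per source word and one column
    per target word; 'x' marks an aligned pair, '.' an unaligned one."""
    rows = []
    for i, word in enumerate(src_words):
        cells = "".join(
            "x" if (i, j) in links else "." for j in range(len(tgt_words))
        )
        rows.append(f"{cells} {word}")
    return "\n".join(rows)


# Hypothetical sentence pair and links as (source_index, target_index).
grid = render_alignment_matrix(
    ["das", "Haus"], ["the", "house"], {(0, 0), (1, 1)}
)
print(grid)
```

A monotone word-for-word alignment shows up as a diagonal of marks, while reordering and many-to-one links appear as off-diagonal or repeated marks in a row or column.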
Improving Statistical Word Alignment with Various Clues
Cited by 2 (0 self)
This paper proposes a method to improve word alignment by combining various clues. Our method first trains a baseline statistical IBM word alignment model. Then we improve it with various clues, which are mainly based on features such as lemmatization, translation dictionary, named entities, and chunks. We incorporate these features into a unified framework. Experimental results show that our method improves word alignment quality by achieving a relative error rate reduction of 39.8%. We also conduct phrase-based machine translation based on the word alignment results. Using BLEU as an evaluation metric, our method achieves an absolute improvement of about 0.02 (about 18% relative) over a baseline method.
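The abstract mixes relative and absolute figures, so a short arithmetic sketch may help in reading them. The error rates passed to `relative_reduction` are hypothetical (the abstract gives only the 39.8% ratio), and the baseline BLEU computed below is merely what the two rounded figures "about 0.02" and "about 18%" would imply, not a value reported by the paper.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of an error rate: (baseline - improved) / baseline."""
    return (baseline - improved) / baseline


def implied_baseline(absolute_gain, relative_gain):
    """Baseline score implied by an absolute gain and its relative size."""
    return absolute_gain / relative_gain


# Hypothetical error rates chosen so the ratio matches the reported 39.8%.
reduction = relative_reduction(0.20, 0.1204)

# Baseline BLEU implied by "about 0.02 (about 18% relative)".
bleu_baseline = implied_baseline(0.02, 0.18)
print(round(reduction, 3), round(bleu_baseline, 3))
```

Because both reported BLEU figures are approximate, the implied baseline of roughly 0.11 should be read as an order-of-magnitude check only.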
Two Tools for Creating and Visualizing Sub-sentential Alignments of Parallel Text
We present two web-based, interactive tools for creating and visualizing sub-sentential alignments of parallel text. Yawat is a tool to support distributed, manual word- and phrase-alignment of parallel text through an intuitive, web-based interface. Kwipc is an interface for displaying words or bilingual word pairs in parallel, word-aligned context. A key element of the tools presented here is the interactive visualization: alignment information is shown only for one pair of aligned words or phrases at a time. This allows users to explore the alignment space interactively without being overwhelmed by the amount of information available.
Discriminative Word Alignment with Syntactic Features
This report presents a study of syntactic features used in a discriminative word alignment model. The features are implemented in a state-of-the-art discriminative word alignment system and are extracted from parse trees. Three types of syntactic features are explored in this work: one global tree path feature and two first-order tree features. Experimental results show that the syntactic features help improve word alignment accuracy on Chinese-English parallel sentences.
Selective Phrase Pair Extraction for Improved Statistical Machine Translation
Phrase-based statistical machine translation systems depend heavily on the knowledge represented in their phrase translation tables. However, the phrase pairs included in these tables are typically selected using simple heuristics that potentially leave much room for improvement. In this paper, we present a technique for selecting the phrase pairs to include in phrase translation tables based on their estimated quality according to a translation model. This method not only reduces the size of the phrase translation table, but also improves translation quality as measured by the BLEU metric.

