A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
In this work we propose methods to label probabilistic synchronous context-free grammar (PSCFG) rules using only word tags, generated by either part-of-speech analysis or unsupervised word class induction. The proposals range from simple tag-combination schemes to a phrase clustering model that can incorporate an arbitrary number of features. Our models improve translation quality over the single generic label approach of Chiang (2005) and perform on par with the syntactically motivated approach from Zollmann and Venugopal (2006) on the NIST large Chinese-to-English translation task. These results persist when using automatically learned word tags, suggesting broad applicability of our technique across diverse language pairs for which syntactic resources are not available.
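The abstract does not spell out its tag-combination schemes. As a purely illustrative sketch, one simple scheme of this kind could label a phrase by joining the tags of its first and last words. The function name `boundary_tag_label`, the `+` separator, and the choice of boundary tags are assumptions made here, not details taken from the paper.

```python
from typing import List


def boundary_tag_label(tags: List[str], start: int, end: int) -> str:
    """Label the span tags[start:end] by joining its first and last word tags.

    Hypothetical scheme for illustration only; the paper's actual
    tag-combination schemes are not described in the abstract.
    """
    if not 0 <= start < end <= len(tags):
        raise ValueError("span must be non-empty and lie inside the sentence")
    first, last = tags[start], tags[end - 1]
    # A one-word phrase is labeled with its own tag rather than "TAG+TAG".
    return first if end - start == 1 else f"{first}+{last}"


if __name__ == "__main__":
    sentence_tags = ["DT", "JJ", "NN", "VBZ"]  # e.g. "the red car stops"
    print(boundary_tag_label(sentence_tags, 0, 3))  # label for "the red car"
    print(boundary_tag_label(sentence_tags, 3, 4))  # label for "stops"
```

Because the label depends only on word tags, the same function works whether the tags come from a part-of-speech tagger or from induced word classes.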
MaTrEx: The DCU Machine Translation System for ICON 2008
2008
In this paper, we give a description of the machine translation system developed at DCU that was used for our participation in the NLP Tools Contest of the International Conference on Natural Language Processing (ICON 2008). This was our first ever attempt at working on any Indian language. In this participation, we focus on various techniques for word and phrase alignment to improve system quality. For the English–Hindi translation task we exploit source-language reordering. We also carried out experiments combining both in-domain and out-of-domain data to improve the system performance and, as a post-processing step, we transliterate out-of-vocabulary items.
HPSG Supertagging: A Sequence Labeling View
Supertagging is a widely used speed-up technique for deep parsing. Supertagging has also been exploited in NLP tasks other than parsing, to make use of the rich syntactic information given by the supertags. However, the performance of the supertagger is still a bottleneck for such applications. In this paper, we investigated the relationship between supertagging and parsing beyond merely speeding up the deep parser. We started from a sequence labeling view of HPSG supertagging, examining how well a supertagger can do when separated from parsing. A comparison of two types of supertagging model, a point-wise model and a sequential model, showed that the former works competitively well despite its simplicity, which indicates that the true dependency among supertag assignments is far more complex than the crude first-order approximation made in the sequential model. We then analyzed the limitations of separated supertagging by using a CFG-filter. The results showed that big gains could be obtained by resorting to a light-weight parser.
Fine-grained Tree-to-String Translation Rule Extraction
Tree-to-string translation rules are widely used in linguistically syntax-based statistical machine translation systems. In this paper, we propose to use deep syntactic information for obtaining fine-grained translation rules. A head-driven phrase structure grammar (HPSG) parser is used to obtain the deep syntactic information, which includes a fine-grained description of the syntactic properties and a semantic representation of a sentence. We extract fine-grained rules from aligned HPSG tree/forest-string pairs and use them in our tree-to-string and string-to-tree systems. Extensive experiments on large-scale bidirectional Japanese-English translation confirmed the effectiveness of our approach.
Shallow-Syntax Phrase-Based Translation: Joint versus Factored String-to-Chunk Models
This work extends phrase-based statistical MT (SMT) with shallow syntax dependencies. Two string-to-chunks translation models are proposed: a factored model, which augments phrase-based SMT with layered dependencies, and a joint model, which extends the phrase translation table with microtags, i.e. per-word projections of chunk labels. Both rely on n-gram models of target sequences with different granularity: single words, microtags, chunks. In particular, n-grams defined over syntactic chunks should model syntactic constraints coping with word-group movements. Experimental analysis and evaluation conducted on two popular Chinese-English tasks suggest that the shallow-syntax joint translation model has potential to outperform state-of-the-art phrase-based translation, with a reasonable computational overhead.
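The idea of microtags as per-word projections of chunk labels can be illustrated with a small sketch. The encoding used here (chunk label plus a position marker `B`, `I`, `E`, or `S`) and the name `project_microtags` are assumptions chosen for illustration; the abstract does not say how the paper itself encodes microtags.

```python
from typing import List, Tuple

# A chunk is a (label, words) pair, e.g. ("NP", ["the", "red", "car"]).
Chunk = Tuple[str, List[str]]


def project_microtags(chunks: List[Chunk]) -> List[Tuple[str, str]]:
    """Project each chunk label onto its words, one tag per word.

    Hypothetical encoding: LABEL-B (first word), LABEL-I (inside),
    LABEL-E (last word), LABEL-S (single-word chunk).
    """
    pairs: List[Tuple[str, str]] = []
    for label, words in chunks:
        last = len(words) - 1
        for i, word in enumerate(words):
            if last == 0:
                position = "S"
            elif i == 0:
                position = "B"
            elif i == last:
                position = "E"
            else:
                position = "I"
            pairs.append((word, f"{label}-{position}"))
    return pairs


if __name__ == "__main__":
    chunked = [("NP", ["the", "red", "car"]), ("VP", ["stops"])]
    for word, tag in project_microtags(chunked):
        print(word, tag)
```

The output is one tag per word, which is the granularity at which a word-level n-gram model or a phrase table entry could consume chunk information.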
Candidacy Examination
What empirical evidence is there that adding syntactic constraints to MT decoding, in particular PMT decoding, will lead to improvements in translation quality? Your proposal claims that your method for adding syntactic constraints will result not only in a more complete search of the space of string permutations involved in PMT but also in an improved ability to discriminate between good and bad translations. In Section 3 you claim that the ability to account for syntactically governed re-ordering patterns is an advantage and in Section 4 you claim, on the basis of a constructed example, that your proposed method will improve quality by removing ungrammatical but high scoring distractor analyses, and that the completeness of the search will be improved by reducing the need for aggressive heuristics about re-ordering. Do you anticipate that separate constraints on re-ordering will still be required? If not, say why not. If so, briefly sketch how these constraints will be implemented and the means by which they will interact with the new syntactic constraints. Statistical MT (SMT) systems are based on the source-channel model of communication (Weaver, 1949; Brown et al., 1993, 1990) whereby an output string is modelled as being
Forest-guided Supertagger Training
Supertagging is an important technique for deep syntactic analysis. A supertagger is usually trained independently of the parser using a sequence labeling method. This presents an inconsistent training objective between the supertagger and the parser. In this paper, we propose a forest-guided supertagger training method to alleviate this problem by incorporating global grammar constraints into the supertagging process using a CFG-filter. It also provides an approach to make the supertagger and the parser more tightly integrated. The experiment shows that, using the forest-guided trained supertagger, the parser achieved an absolute 0.68% improvement over the baseline in F-score for predicate-argument relation recognition accuracy and achieved a competitive result of 89.31% with a faster parsing speed, compared to a state-of-the-art HPSG parser.
Using Categorial Grammar to Label Translation Rules
Adding syntactic labels to synchronous context-free translation rules can improve performance, but labeling with phrase structure constituents, as in GHKM (Galley et al., 2004), excludes potentially useful translation rules. SAMT (Zollmann and Venugopal, 2006) introduces heuristics to create new non-constituent labels, but these heuristics introduce many complex labels and tend to add rarely-applicable rules to the translation grammar. We introduce a labeling scheme based on categorial grammar, which allows syntactic labeling of many rules with a minimal, well-motivated label set. We show that our labeling scheme performs comparably to SAMT on an Urdu–English translation task, yet the label set is an order of magnitude smaller, and translation is twice as fast.

