A new string-to-dependency machine translation algorithm with a target dependency language model (2008)

by Libin Shen, Jinxi Xu, Ralph Weischedel
Venue: In Proc. of ACL

Results 1 - 10 of 135

Posterior regularization for structured latent variable models

by Kuzman Ganchev, João Graça, Jennifer Gillenwater, Ben Taskar - Journal of Machine Learning Research, 2010
Cited by 138 (8 self)
We present posterior regularization, a probabilistic framework for structured, weakly supervised learning. Our framework efficiently incorporates indirect supervision via constraints on posterior distributions of probabilistic models with latent variables. Posterior regularization separates model complexity from the complexity of structural constraints it is desired to satisfy. By directly imposing decomposable regularization on the posterior moments of latent variables during learning, we retain the computational efficiency of the unconstrained model while ensuring desired constraints hold in expectation. We present an efficient algorithm for learning with posterior regularization and illustrate its versatility on a diverse set of structural constraints such as bijectivity, symmetry and group sparsity in several large scale experiments, including multi-view learning, cross-lingual dependency grammar induction, unsupervised part-of-speech induction,

Citation Context

...lel text. Dependency grammars are one such resource. They are useful for language modeling, textual entailment and machine translation (Haghighi et al., 2005; Chelba et al., 1997; Quirk et al., 2005; Shen et al., 2008), to name a few tasks. Dependency grammars are arguably more robust to transfer than constituent grammars, since syntactic relations between aligned words of parallel sentences are better conserved i...

Hierarchical phrase-based translation with weighted finite state transducers and . . .

by Adrià de Gispert, Gonzalo Iglesias, Graeme Blackwood, Eduardo R. Banga, William Byrne - In Proceedings of HLT/NAACL, 2010
Cited by 48 (20 self)
In this article we describe HiFST, a lattice-based decoder for hierarchical phrase-based translation and alignment. The decoder is implemented with standard Weighted Finite-State Transducer (WFST) operations as an alternative to the well-known cube pruning procedure. We find that the use of WFSTs rather than k-best lists requires less pruning in translation search, resulting in fewer search errors, better parameter optimization, and improved translation performance. The direct generation of translation lattices in the target language can improve subsequent rescoring procedures, yielding further gains when applying long-span language models and Minimum Bayes Risk decoding. We also provide insights as to how to control the size of the search space defined by hierarchical rules. We show that shallow-n grammars, low-level rule catenation, and other search constraints can help to match the power of the translation system to specific language pairs.

Citation Context

...wing Chappelier et al. (1999). Extensions to Hiero Several authors describe extensions to Hiero, to incorporate additional syntactic information (Zollmann and Venugopal, 2006; Zhang and Gildea, 2006; Shen et al., 2008; Marton and Resnik, 2008), or to combine it with discriminative latent models (Blunsom et al., 2008). Analysis and Contrastive Experiments Zollman et al. (2008) compare phrase-based, hierarchical and...

Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems

by Xiaodong He, Mei Yang, Jianfeng Gao, Patrick Nguyen, Robert Moore
Cited by 38 (3 self)
This paper presents a new hypothesis alignment method for combining outputs of multiple machine translation (MT) systems. An indirect hidden Markov model (IHMM) is proposed to address the synonym matching and word ordering issues in hypothesis alignment. Unlike traditional HMMs whose parameters are trained via maximum likelihood estimation (MLE), the parameters of the IHMM are estimated indirectly from a variety of sources including word semantic similarity, word surface similarity, and a distance-based distortion penalty. The IHMM-based method significantly outperforms the state-of-the-art TER-based alignment model in our experiments on NIST benchmark datasets. Our combined SMT system using the

Dependency grammar induction via bitext projection constraints

by Kuzman Ganchev, Jennifer Gillenwater, Ben Taskar - In ACL-IJCNLP, 2009
Cited by 35 (5 self)
Broad-coverage annotated treebanks necessary to train parsers do not exist for many resource-poor languages. The wide availability of parallel text and accurate parsers in English has opened up the possibility of grammar induction through partial transfer across bitext. We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source language parser (English) to constrain the space of possible target trees. Unlike previous approaches, our framework does not require full projected parses, allowing partial, approximate transfer through linear expectation constraints on the space of distributions over trees. We consider several types of constraints that range from generic dependency conservation to language-specific annotation rules for auxiliary verb analysis. We evaluate our approach on Bulgarian and Spanish CoNLL shared task data and show that we consistently outperform unsupervised methods and can outperform supervised learning for limited training data.

Citation Context

...04; Smith and Eisner, 2006). Dependency representation has been used for language modeling, textual entailment and machine translation (Haghighi et al., 2005; Chelba et al., 1997; Quirk et al., 2005; Shen et al., 2008), to name a few tasks. Dependency grammars are arguably more robust to transfer since syntactic relations between aligned words of parallel sentences are better conserved in translation than phrase s...

Jane: Open Source Hierarchical Translation, Extended with Reordering and Lexicon Models

by David Vilar, Daniel Stein, Matthias Huck, Hermann Ney
Cited by 31 (25 self)
We present Jane, RWTH’s hierarchical phrase-based translation system, which has been open sourced for the scientific community. This system has been in development at RWTH for the last two years and has been successfully applied in different machine translation evaluations. It includes extensions to the hierarchical approach developed by RWTH as well as other research institutions. In this paper we give an overview of its main features. We also introduce a novel reordering model for the hierarchical phrase-based approach which further enhances translation performance, and analyze the effect some recent extended lexicon models have on the performance of the system.

Citation Context

...nted in (Venugopal et al., 2009), where the information about the new non-terminals is included as an additional feature in the log-linear model. In addition, dependency information in the spirit of (Shen et al., 2008) is included. Jane features models for string-to-dependency language models and computes various scores based on the well-formedness of the resulting dependency tree. Jane supports the Stanford parsi...

Capturing Practical Natural Language Transformations

by Kevin Knight
Cited by 29 (0 self)
We study automata for capturing transformations employed by practical natural language processing systems, such as those that translate between human languages. For several variations of finite-state string and tree transducers, we ask formal questions about expressiveness, modularity, teachability, and generalization.

Synchronous Tree Adjoining Machine Translation

by Steve DeNeefe, Kevin Knight - In Proceedings of EMNLP, 2009
Cited by 27 (2 self)
Tree Adjoining Grammars have well-known advantages, but are typically considered too difficult for practical systems. We demonstrate that, when done right, adjoining improves translation quality without becoming computationally intractable. Using adjoining to model optionality allows general translation patterns to be learned without the clutter of endless variations of optional material. The appropriate modifiers can later be spliced in as needed. In this paper, we describe a novel method for learning a type of Synchronous Tree Adjoining Grammar and associated probabilities from aligned tree/string training data. We introduce a method of converting these grammars to a weakly equivalent tree transducer for decoding. Finally, we show that adjoining results in an end-to-end improvement of +0.8 BLEU over a baseline statistical syntax-based MT model on a large-scale Arabic/English MT task.

Rule filtering by pattern for efficient hierarchical translation

by Gonzalo Iglesias, Adrià de Gispert - In Proceedings of the EACL, 2009
Cited by 25 (4 self)
We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of non-terminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-to-English evaluation task.

Citation Context

... of translations with further time reductions at no cost in translation scores. This is in direct contrast to recent reported results in which other filtering strategies lead to degraded performance (Shen et al., 2008; Zollmann et al., 2008). We find that certain patterns are of much greater value in translation than others and that separate minimum count filters should be applied accordingly. Some patterns were f...

Improving Tree-to-Tree Translation with Packed Forests

by Yang Liu, Yajuan Lü, Qun Liu
Cited by 24 (7 self)
Current tree-to-tree models suffer from parsing errors as they usually use only 1-best parses for rule extraction and decoding. We instead propose a forest-based tree-to-tree model that uses packed forests. The model is based on a probabilistic synchronous tree substitution grammar (STSG), which can be learned from aligned forest pairs automatically. The decoder finds ways of decomposing trees in the source forest into elementary trees using the source projection of STSG while building target forest in parallel. Comparable to the state-of-the-art phrase-based system Moses, using packed forests in tree-to-tree translation results in a significant absolute improvement of 3.6 BLEU points over using 1-best trees.

Citation Context

...nnotations, either in the form of phrase structure trees or dependency trees. They can be roughly divided into three categories: string-to-tree models (e.g., (Galley et al., 2006; Marcu et al., 2006; Shen et al., 2008)), tree-to-string models (e.g., (Liu et al., 2006; Huang et al., 2006)), and tree-to-tree models (e.g., (Eisner, 2003; Ding and Palmer, 2005; Cowan et al., 2006; Zhang et al., 2008)). By modeling the ...

Effective use of linguistic and contextual information for statistical machine translation

by Libin Shen, Jinxi Xu, Bing Zhang, Spyros Matsoukas, Ralph Weischedel - In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009
Cited by 23 (2 self)
Current methods of using lexical features in machine translation have difficulty in scaling up to realistic MT tasks due to a prohibitively large number of parameters involved. In this paper, we propose methods of using new linguistic and contextual features that do not suffer from this problem and apply them in a state-of-the-art hierarchical MT system. The features used in this work are non-terminal labels, non-terminal length distribution, source string context and source dependency LM scores. The effectiveness of our techniques is demonstrated by significant improvements over a strong baseline. On Arabic-to-English translation, improvements in lower-cased BLEU are