Wider Pipelines: N-Best Alignments and Parses in MT Training
Cited by 7 (1 self)
State-of-the-art statistical machine translation systems use hypotheses from several maximum a posteriori inference steps, including word alignments and parse trees, to identify translational structure and estimate the parameters of translation models. While this approach leads to a modular pipeline of independently developed components, errors made in these “single-best” hypotheses can propagate to downstream estimation steps that treat these inputs as clean, trustworthy training data. In this work we integrate N-best alignments and parses by using a probability distribution over these alternatives to generate posterior fractional counts for use in downstream estimation. Using these fractional counts in a DOP-inspired syntax-based translation system, we show significant improvements in translation quality over a single-best trained baseline.
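The idea of posterior fractional counts can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy word-pair "events", and the choice to normalize N-best log scores with a softmax are all assumptions made for illustration. Each hypothesis in the N-best list contributes its posterior weight, rather than a full count of 1, to every event it contains.

```python
import math
from collections import defaultdict


def posterior_weights(log_scores):
    """Normalize N-best log scores into a probability distribution (softmax)."""
    m = max(log_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


def fractional_counts(nbest):
    """Accumulate posterior-weighted counts over an N-best list.

    nbest: list of (log_score, events) pairs, where `events` is an iterable
    of hashable items (hypothetical here: aligned word pairs).
    """
    weights = posterior_weights([score for score, _ in nbest])
    counts = defaultdict(float)
    for weight, (_, events) in zip(weights, nbest):
        for event in set(events):  # count each event once per hypothesis
            counts[event] += weight
    return dict(counts)


# Toy 3-best list of alignments for one sentence pair (made-up scores).
nbest = [
    (math.log(0.6), [("la", "the"), ("casa", "house")]),
    (math.log(0.3), [("la", "the"), ("casa", "home")]),
    (math.log(0.1), [("la", "house"), ("casa", "the")]),
]
counts = fractional_counts(nbest)
```

With a single-best pipeline, only the first hypothesis would be counted, each of its events with weight 1. Here an event shared by several hypotheses accumulates their combined posterior mass, and events from lower-ranked hypotheses still receive a small count.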
One Decade of Statistical Machine Translation: 1996-2005
- In Proceedings of the 10th MT Summit, 2005
Cited by 6 (0 self)
In the last decade, the statistical approach has found widespread use in machine translation for both written and spoken language and has had a major impact on translation accuracy. This paper covers the principles of statistical machine translation and summarizes the progress made so far.
Grammar based statistical MT on Hadoop
, 2009
"... An end-to-end toolkit for large scale PSCFG based MT ..."
Preference Grammars and Decoding Algorithms for Probabilistic Synchronous Context Free Grammar Based Translation.
Probabilistic Synchronous Context-free Grammars (PSCFGs) [Aho and Ullman, 1969, Wu, 1996] define weighted transduction rules to represent translation and reordering operations. When translation models use features that are defined locally, on each rule, there are efficient dynamic programming algorithms to perform translation with these grammars [Kasami, 1965]. In general, the integration of non-local features into the translation model can make translation NP-hard, requiring decoding approximations that limit the impact of these features. In this thesis, we consider the impact of, and interaction between, two non-local features: the n-gram language model (LM) and labels on rule nonterminal symbols in the Syntax-Augmented MT (SAMT) grammar [Zollmann and Venugopal, 2006]. While these features do not result in NP-hard search, they would lead to serious increases in wall-clock runtime if naïve dynamic programming methods were applied. We develop novel two-pass algorithms that make strong decoding approximations during a first-pass search, generating a hypergraph of sentence-spanning translation derivations. In a second pass, we use knowledge about non-local features to explore
UPM system for the translation task
Verónica López-Ludeña, Grupo
This paper describes the UPM system for the translation task at the EMNLP 2011 Workshop on Statistical Machine Translation.
Source Language Categorization for improving a Speech into Sign Language Translation System
This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. The module replaces Spanish words with associated tags. In implementing it, several alternatives for dealing with non-relevant words, that is, Spanish words that play no role in the translation process, were studied. The categorization module has been incorporated into a phrase-based system and a Statistical Finite State Transducer (SFST). The evaluation results show that BLEU increased from 69.11% to 78.79% for the phrase-based system and from 69.84% to 75.59% for the SFST.
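A word-to-tag categorization step of this kind might look like the following sketch. The abstract does not say which alternatives for non-relevant words were compared, so the three strategies shown (`"drop"`, `"tag"`, `"keep"`), the `NON_RELEVANT` placeholder, the function name, and the tiny lexicon are all hypothetical and serve only to illustrate the preprocessing idea.

```python
def categorize(sentence, categories, non_relevant, strategy="drop"):
    """Replace source words with tags before translation.

    categories:   dict mapping a source word to its tag (hypothetical lexicon)
    non_relevant: set of source words treated as irrelevant for translation
    strategy:     how to treat non-relevant words (assumed options):
                  "drop" removes them, "tag" replaces them with a
                  placeholder, "keep" leaves them unchanged
    """
    output = []
    for word in sentence.lower().split():
        if word in non_relevant:
            if strategy == "drop":
                continue
            output.append("NON_RELEVANT" if strategy == "tag" else word)
        else:
            # Words without a category pass through unchanged.
            output.append(categories.get(word, word))
    return output


# Made-up example lexicon and sentence.
categories = {"quiero": "QUERER", "renovar": "RENOVAR", "carnet": "DNI"}
non_relevant = {"el"}

dropped = categorize("Quiero renovar el carnet", categories, non_relevant)
tagged = categorize("Quiero renovar el carnet", categories, non_relevant, "tag")
```

The categorized token sequence would then replace the raw Spanish sentence as input to the translation system, for both training and decoding.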
Video Audio Interface for Recognizing Gestures of Indian Sign Language
, 2011
We propose a system that automatically recognizes sign language gestures from a video stream of the signer. The system converts words and sentences of Indian Sign Language into English voice and text, combining image processing and artificial intelligence techniques. To segment shapes in the videos, we use frame-differencing-based tracking, edge detection, wavelet transforms, and image fusion. The system also uses elliptical Fourier descriptors for shape feature extraction and principal component analysis for feature set optimization and reduction. A database of extracted features is compared with the input video of the signer using a trained fuzzy inference system. The proposed system converts gestures into a text and voice message with 91 percent accuracy. Training and testing use gestures from Indian Sign Language (INSL): around 80 gestures from 10 different signers. The entire system was developed in MATLAB with a graphical user interface. The system is robust and can be trained on new gestures through the GUI.

