Results 1–5 of 5
Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation
In Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT
Abstract

Cited by 8 (1 self)
Language models play an important role in large-vocabulary speech recognition and statistical machine translation systems. The dominant approach for several decades has been backoff language models. Some years ago, there was a clear tendency to build huge language models trained on hundreds of billions of words. Lately, this tendency has changed, and recent works concentrate on data selection. Continuous space methods are a very competitive approach, but they have a high computational complexity and are not yet in widespread use. This paper presents an experimental comparison of all these approaches on a large statistical machine translation task. We also describe an open-source implementation to train and use continuous space language models (CSLM) for such large tasks, including an efficient implementation of the CSLM using graphical processing units from Nvidia. By these means, we are able to train a CSLM on more than 500 million words in 20 hours. This CSLM provides an improvement of up to 1.8 BLEU points with respect to the best backoff language model that we were able to build.
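The continuous space language model described above scores an n-gram by embedding the context words, passing them through a hidden layer, and taking a softmax over the vocabulary. The following is a minimal pure-Python sketch of that feedforward structure; the vocabulary, dimensions, and randomly initialized weights are all hypothetical, chosen only to illustrate the forward pass (a real CSLM is trained, and the paper's GPU implementation batches these computations).

```python
import math
import random

random.seed(0)

# Hypothetical toy setup (not from the paper).
VOCAB = ["<s>", "the", "cat", "sat", "</s>"]
V, EMB, HID, CTX = len(VOCAB), 4, 8, 3  # context = n-1 = 3 words

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

E = rand_matrix(V, EMB)            # continuous word representations
W_h = rand_matrix(CTX * EMB, HID)  # projection layer -> hidden layer
W_o = rand_matrix(HID, V)          # hidden layer -> output logits

def cslm_prob(context_ids, next_id):
    """P(next | context) from a feedforward continuous-space LM."""
    # 1. Concatenate the embeddings of the n-1 context words.
    x = [v for wid in context_ids for v in E[wid]]
    # 2. Hidden layer with tanh non-linearity.
    h = [math.tanh(sum(x[i] * W_h[i][j] for i in range(len(x))))
         for j in range(HID)]
    # 3. Softmax over the output vocabulary.
    logits = [sum(h[j] * W_o[j][k] for j in range(HID)) for k in range(V)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return exps[next_id] / sum(exps)

# Distribution over all next words given the context "<s> the cat".
probs = [cslm_prob([0, 1, 2], k) for k in range(V)]
```

With untrained weights the distribution is near-uniform; training would sharpen it, but the softmax always yields a proper probability distribution over the vocabulary.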
LSTM Neural Networks for Language Modeling
Abstract

Cited by 2 (1 self)
Neural networks have become increasingly popular for the task of language modeling. Whereas feedforward networks only exploit a fixed context length to predict the next word of a sequence, standard recurrent neural networks can, conceptually, take into account all of the predecessor words. On the other hand, it is well known that recurrent networks are difficult to train and are therefore unlikely to show the full potential of recurrent models. These problems are addressed by the Long Short-Term Memory (LSTM) neural network architecture. In this work, we analyze this type of network on an English and a large French language modeling task. Experiments show improvements of about 8% relative in perplexity over standard recurrent neural network LMs. In addition, we gain considerable improvements in WER on top of a state-of-the-art speech recognition system. Index Terms: language modeling, recurrent neural networks, LSTM neural networks
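The training difficulty the abstract mentions is eased by the LSTM's gated, additive cell-state update. A minimal scalar sketch of one step of a standard LSTM cell, using the usual input/forget/output gate equations, might look like the following; the weights are arbitrary illustrative values, not from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell (standard gate equations).
    w maps each gate name to (input weight, recurrent weight, bias)."""
    gate = lambda name, squash: squash(
        w[name][0] * x + w[name][1] * h_prev + w[name][2])
    i = gate("i", sigmoid)    # input gate: how much new content to write
    f = gate("f", sigmoid)    # forget gate: how much old state to keep
    o = gate("o", sigmoid)    # output gate: how much state to expose
    g = gate("g", math.tanh)  # candidate cell update
    c = f * c_prev + i * g    # additive update eases gradient flow
    h = o * math.tanh(c)      # new hidden state
    return h, c

# Hypothetical weights, chosen arbitrarily for illustration.
w = {"i": (0.5, 0.1, 0.0), "f": (0.5, 0.1, 1.0),
     "o": (0.5, 0.1, 0.0), "g": (0.5, 0.1, 0.0)}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:   # a short scalar input sequence
    h, c = lstm_step(x, h, c, w)
```

The additive form of the cell update (`c = f * c_prev + i * g`) is what lets gradients flow across many time steps, in contrast to the purely multiplicative recurrence of a standard RNN.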
Converting Continuous-Space Language Models into N-gram Language Models for Statistical Machine Translation
Abstract

Cited by 1 (1 self)
Neural network language models, or continuous-space language models (CSLMs), have been shown to improve the performance of statistical machine translation (SMT) when they are used for reranking n-best translations. However, CSLMs have not been used in the first-pass decoding of SMT, because using CSLMs in decoding is too time-consuming. Instead, we propose a method for converting CSLMs into backoff n-gram language models (BNLMs) so that the converted CSLMs can be used in decoding. We show that they outperform the original BNLMs and are comparable with the traditional use of CSLMs in reranking.
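The target format of the conversion above is a backoff n-gram model: if an n-gram was observed, use its probability directly; otherwise fall back to a scaled lower-order estimate. A minimal sketch of that backoff lookup, using a fixed "stupid backoff" weight rather than the properly discounted weights a real BNLM (or the paper's conversion) would compute:

```python
from collections import Counter

# Toy corpus; these counts stand in for the probabilities that the
# paper's method would instead extract from a trained CSLM.
corpus = "the cat sat on the mat the cat ran".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = sum(unigrams.values())

ALPHA = 0.4  # fixed backoff weight ("stupid backoff"); a real BNLM
             # computes a proper discounted weight per history

def score(prev, word):
    """Backoff bigram score: use the observed bigram if present,
    otherwise back off to the scaled unigram estimate."""
    if (prev, word) in bigrams:
        return bigrams[(prev, word)] / unigrams[prev]
    return ALPHA * unigrams[word] / total

s_seen = score("the", "cat")  # observed bigram path
s_back = score("mat", "ran")  # unseen bigram -> backoff path
```

Because every query reduces to table lookups, a decoder can call such a model millions of times per sentence, which is exactly why converting a CSLM into this form makes first-pass decoding feasible.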
Continuous Space Translation Models for Phrase-Based Statistical Machine Translation
Abstract
This paper presents a new approach to estimating the translation model probabilities of a phrase-based statistical machine translation system. We use neural networks to directly learn the translation probability of phrase pairs using continuous representations. The system can be easily trained on the same data used to build standard phrase-based systems. We provide experimental evidence that the approach seems able to infer meaningful translation probabilities for phrase pairs not seen in the training data, and even to predict a list of the most likely translations given a source phrase. The approach can be used to rescore n-best lists, but we also discuss an integration into the Moses decoder. A preliminary evaluation on the English/French IWSLT task achieved improvements in the BLEU score, and a human analysis showed that the new model often chooses semantically better translations. Several extensions of this work are discussed.
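N-best rescoring, the usage mode mentioned in this abstract and the first one, amounts to adding the new model's score to each hypothesis and re-ranking. A minimal sketch with a hypothetical three-hypothesis list and an arbitrary interpolation weight (a real SMT system would tune such weights, e.g. with MERT):

```python
# Hypothetical n-best list: (translation, decoder score, new model score).
# All scores are log-probabilities, so higher (less negative) is better.
nbest = [
    ("the cat sits",   -10.2, -4.0),
    ("a cat sits",     -10.0, -6.5),
    ("the cat is sat", -10.5, -3.0),
]

WEIGHT = 1.0  # interpolation weight for the new model (illustrative)

def rescore(hyps, weight):
    """Add the weighted new-model score to each hypothesis and re-rank."""
    return sorted(hyps,
                  key=lambda h: h[1] + weight * h[2],
                  reverse=True)

best = rescore(nbest, WEIGHT)[0][0]
```

Rescoring only reorders what the decoder already produced; integrating the model into the decoder itself, as the paper discusses for Moses, lets it influence which hypotheses are built in the first place.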