Finite-State Transducers For Speech-Input Translation
- IEEE Automatic Speech Recognition and Understanding Workshop, ASRU’01, 2001
Cited by 9 (3 self)
Nowadays, hidden Markov models (HMMs) and n-grams are the basic components of the most successful speech recognition systems. In such systems, HMMs (the acoustic models) are integrated into an n-gram or a stochastic finite-state grammar (the language model). Similar models can be used for speech translation: HMMs (the acoustic models) can be integrated into a finite-state transducer (the translation model). The translation process can then be performed by searching for an optimal path of states in the integrated network, and the output of this search is the target word sequence associated with the optimal path. In speech translation, HMMs can be trained from a source speech corpus, and the translation model can be learned automatically from a parallel training corpus.
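The path search described in this abstract can be illustrated with a minimal sketch. It assumes a toy stochastic transducer over source words rather than the integrated acoustic network of the paper; the transition table, the state numbering, and the function name `translate` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy transducer (not from the paper):
# state -> list of (source word, target words, next state, probability).
TRANSITIONS = {
    0: [("una", ["a"], 1, 1.0)],
    1: [("cama", [], 2, 1.0)],
    2: [("doble", ["double", "bed"], 3, 0.7),
        ("individual", ["single", "bed"], 3, 0.3)],
}
FINAL_STATES = {3}


def translate(source_words, transitions=TRANSITIONS, finals=FINAL_STATES, start=0):
    """Return (log-probability, target words) of the best accepting path, or None."""
    # best[state] holds the best-scoring partial path that reaches that state.
    best = {start: (0.0, [])}
    for word in source_words:
        nxt = {}
        for state, (logp, output) in best.items():
            for src, tgt, dest, prob in transitions.get(state, []):
                if src != word:
                    continue
                cand = (logp + math.log(prob), output + tgt)
                # Viterbi step: keep only the best path into each state.
                if dest not in nxt or cand[0] > nxt[dest][0]:
                    nxt[dest] = cand
        best = nxt
    accepting = [v for s, v in best.items() if s in finals]
    return max(accepting, key=lambda v: v[0]) if accepting else None
```

Calling `translate(["una", "cama", "doble"])` follows the only accepting path and returns its target words together with the path log-probability; an input the transducer cannot accept yields `None`.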
Probabilistic Finite-State Machines - Part I
Cited by 9 (1 self)
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition and in fields to which pattern recognition is linked, including computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translation. In Part I of this paper we survey these generative objects and study their definitions and properties. In Part II, we study the relation of probabilistic finite-state automata to other well-known devices that generate strings, such as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent the current state of the art for these objects.
Translation memories enrichment by statistical bilingual segmentation
- In Proceedings of LREC-04, 2004
Cited by 4 (2 self)
A majority of Machine Aided Translation systems are based on comparisons between a source sentence and reference sentences stored in Translation Memories (TMs). The translation search is done by looking in a database for sentences that are similar to the source sentence. TMs have two basic limitations: the dependency on the repetition of complete sentences and the high cost of building a TM. Because human translators not only remember sentences from their previous translations but also decompose the sentence to be translated and work with smaller units, it would be desirable to enrich the TM database with smaller translation units. This enrichment should also be automatic so as not to increase the cost of building a TM. We propose the application of two automatic bilingual segmentation techniques based on statistical translation methods in order to create new, shorter bilingual segments to be included in a TM database. An evaluation of the two techniques is carried out for a bilingual Basque-Spanish task.
Combining Phrase-Based and Template-Based alignment models in Statistical Translation
Cited by 1 (0 self)
In statistical machine translation, single-word based models have an important deficiency: they do not take contextual information into account for the translation decision. One possible solution, the phrase-based approach, consists of translating a sequence of words instead of a single word. We show how this approach obtains interesting results on some corpora. One shortcoming of phrase-based alignment models is that they lack generalization capability in word reordering. A possible solution is the template-based approach, which uses sequences of word classes instead of sequences of words. We present a template-based alignment model that uses a Part Of Speech tagger for word classes. We also propose an improved model that combines both models. The basic idea is that if a sequence of words has been seen in training, the phrase-based model can be used; otherwise, the template-based model can be used. We present the results from different tasks.
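The selection rule in the last part of this abstract (use the phrase-based model for seen word sequences, otherwise fall back to templates over word classes) might be sketched as follows. This is only a deterministic lookup illustrating the back-off idea, not the statistical alignment models of the paper; all tables and the function name `translate_segment` are hypothetical.

```python
# Hypothetical toy tables; real models would be estimated from a parallel corpus.
PHRASE_TABLE = {("red", "car"): ["coche", "rojo"]}
# A template maps a sequence of word classes (here POS tags)
# to the order in which source positions are emitted.
TEMPLATE_TABLE = {("ADJ", "NOUN"): [1, 0]}
POS_TAGS = {"red": "ADJ", "blue": "ADJ", "car": "NOUN", "house": "NOUN"}
LEXICON = {"red": "rojo", "blue": "azul", "car": "coche", "house": "casa"}


def translate_segment(words):
    """Return (target words, name of the model used) for one source segment."""
    key = tuple(words)
    # Word sequence seen in training: use the phrase-based entry directly.
    if key in PHRASE_TABLE:
        return list(PHRASE_TABLE[key]), "phrase"
    # Otherwise generalize through word classes and reorder with a template.
    tags = tuple(POS_TAGS.get(w) for w in words)
    if tags in TEMPLATE_TABLE:
        order = TEMPLATE_TABLE[tags]
        return [LEXICON[words[i]] for i in order], "template"
    return None, "none"
```

Here `["red", "car"]` is covered by the phrase table, while the unseen `["blue", "house"]` is still reordered correctly because it matches the adjective-noun template.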
A finite-state framework for log-linear models in Machine Translation
Cited by 1 (0 self)
Log-linear models currently represent the state of the art in statistical machine translation. In these models, several component models are combined into a single statistical approach. Finite-state transducers constitute a special type of statistical translation model whose usefulness has been demonstrated in different translation tasks. The goal of this work is to introduce a finite-state framework for a log-linear modelling approach in statistical machine translation. Results for a French-English technical translation task show the suitability of the proposed methods.
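For reference, the log-linear combination mentioned in this abstract is usually written as below. The notation is an assumption and is not taken from the abstract: $s$ is the source sentence, $t$ a candidate target sentence, $h_m$ are the $M$ feature functions (the component models) and $\lambda_m$ their weights.

```latex
\Pr(t \mid s) =
  \frac{\exp\left(\sum_{m=1}^{M} \lambda_m h_m(s, t)\right)}
       {\sum_{t'} \exp\left(\sum_{m=1}^{M} \lambda_m h_m(s, t')\right)},
\qquad
\hat{t} = \operatorname*{arg\,max}_{t} \sum_{m=1}^{M} \lambda_m h_m(s, t)
```

The denominator does not depend on $t$, so the search for the best translation only needs the weighted sum of feature values.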
Speech-To-Speech Translation Based On Finite-State Transducers
- In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, 2001
Nowadays, the most successful speech recognition systems are based on stochastic finite-state networks (hidden Markov models and n-grams). Speech translation can be accomplished in a way similar to speech recognition. Stochastic finite-state transducers, which are specific stochastic finite-state networks, have proved well suited to translation modeling. In this work a speech-to-speech translation system, the EUTRANS system, is presented. The acoustic, language and translation models are finite-state networks that are automatically learnt from training samples. This system was assessed in a series of translation experiments from Spanish to English and from Italian to English in an application involving the interaction (by telephone) of a customer with a receptionist at the front desk of a hotel.
Bilingual Corpora Segmentation Using Bilingual Recursive Alignments
A bilingual recursive alignment is a set of nested phrase-based alignments. This set is represented through a binary tree where the inner nodes can model a direct or an inverse translation. In this work, we compute recursive alignments using a greedy algorithm based on a statistical translation dictionary. Different bilingual segmentations can be obtained from a bilingual recursive alignment of a pair of sentences, depending on the degree of granularity to be achieved. Bilingual segmentation results are reported for a Spanish-English task and a Spanish-Basque task.
A Spectral Learning Algorithm for Finite State
Finite-State Transducers (FSTs) are a popular tool for modeling paired input-output sequences, and have numerous applications in real-world problems. Most training algorithms for learning FSTs rely on gradient-based or EM optimizations, which can be computationally expensive and suffer from local optima. Recently, Hsu et al. [13] proposed a spectral method for learning Hidden Markov Models (HMMs) which is based on an Observable Operator Model (OOM) view of HMMs. Following this line of work, we present a spectral algorithm to learn FSTs with strong PAC-style guarantees. To the best of our knowledge, ours is the first result of this type for FST learning. At its core, the algorithm is simple and scales to large data sets. We present experiments that validate the effectiveness of the algorithm on synthetic and real data.

