FSA: an efficient and flexible C++ toolkit for finite state automata using on-demand computation

by S. Kanthak, H. Ney
Venue: Proc. ACL (2004), 510–517

Results 1 - 10 of 10

OpenFst: a general and efficient weighted finite-state transducer library

by Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, Mehryar Mohri - in Proceedings of the Ninth International Conference on Implementation and Application of Automata (CIAA 2007), 2007
Cited by 36 (4 self)
Abstract. We describe OpenFst, an open-source library for weighted finite-state transducers (WFSTs). OpenFst consists of a C++ template library with efficient WFST representations and over twenty-five operations for constructing, combining, optimizing, and searching them. At the shell-command level, there are corresponding transducer file representations and programs that operate on them. OpenFst is designed to be both very efficient in time and space and to scale to very large problems. This library has key applications in speech, image, and natural language processing, pattern and string matching, and machine learning. We give an overview of the library, examples of its use, details of its design that allow customizing the labels, states, and weights and the lazy evaluation of many of its operations. Further information and a download of the OpenFst library can be obtained from

On the integration of speech recognition and statistical machine translation

by E. Matusov, S. Kanthak, H. Ney - Proc. European Conf. on Speech Communication and Technology, 2005
Cited by 9 (0 self)
This paper focuses on the interface between speech recognition and machine translation in a speech translation system. Based on a thorough theoretical framework, we exploit word lattices of automatic speech recognition hypotheses as input to our translation system, which is based on weighted finite-state transducers. We show that acoustic recognition scores of the recognized words in the lattices positively and significantly affect the translation quality. In experiments, we have found consistent improvements on three different corpora in comparison with translations of single best recognized results. In addition, we build and evaluate a fully integrated speech translation model.

Integrated Chinese Word Segmentation in Statistical Machine Translation

by Jia Xu, Evgeny Matusov, Richard Zens, Hermann Ney
Cited by 7 (1 self)
A Chinese sentence is represented as a sequence of characters, and words are not separated from each other. In statistical machine translation, the conventional approach is to segment the Chinese character sequence into words during pre-processing. The training and translation are performed afterwards. However, this method is not optimal for two reasons: 1. The segmentations may be erroneous. 2. For a given character sequence, the best segmentation depends on its context and translation. In order to minimize the translation errors, we take different segmentation alternatives instead of a single segmentation into account and integrate the segmentation process with the search for the best translation. The segmentation decision is only taken during the generation of the translation. With this method we are able to translate Chinese text at the character level. The experiments on the IWSLT 2005 task showed improvements in the translation performance using two translation systems: a phrase-based system and a finite state transducer based system. For the phrase-based system, the improvement of the BLEU score is 1.5% absolute.

Modified MPE/MMI in a transducer-based framework

by G. Heigold, R. Schlüter, H. Ney - in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2009
Cited by 4 (4 self)
In this paper we show how common training criteria such as MPE or MMI can be extended to incorporate a margin term. In addition, a transducer-based training implementation is presented, which covers a large variety of discriminative training criteria for ASR, including the standard MMI, MPE, and MCE criteria, as well as the modifications to these criteria presented here. The modified criteria are directly related to the conventional large margin formulation of SVMs. In the proposed approach, we can take advantage of the generalization guarantees of large margin classifiers while keeping the existing framework for the discriminative training, including the efficient algorithms for conventional MPE or MMI. On the conceptual side, this allows for a direct evaluation of the margin term. Finally, experimental results are presented for different large vocabulary continuous speech recognition tasks (one of which is trained on a very large amount of training data) using these modified criteria. Index Terms: training criteria, large margin, weighted finite state transducer, speech recognition

Juicer: A Weighted Finite-State Transducer speech decoder

by Darren Moore, John Dines, Mathew Magimai Doss, Jithendra Vepa, Octavian Cheng, Thomas Hain
Cited by 3 (2 self)
Abstract. A major component in the development of any speech recognition system is the decoder. As task complexities and, consequently, system complexities have continued to increase, the decoding problem has become an increasingly significant component in the overall speech recognition system development effort, with efficient decoder design contributing to significantly improve the trade-off between decoding time and search errors. In this paper we present the "Juicer" (from transducer) large vocabulary continuous speech recognition (LVCSR) decoder based on the weighted finite-state transducer (WFST). We begin with a discussion of the need for open source, state-of-the-art decoding software in LVCSR research and how this led to the development of Juicer, followed by a brief overview of decoding techniques and major issues in decoder design. We present Juicer and its major features, emphasising its potential not only as a critical component in the development of LVCSR systems, but also as an important research tool in itself, being based around the flexible WFST paradigm. We also provide results of benchmarking tests that have been carried out to date, demonstrating that in many respects Juicer, while still in its early development, is already achieving state-of-the-art. These benchmarking tests serve not only to demonstrate the utility of Juicer in its present state, but are also being used to guide future development; hence, we conclude with a brief discussion of some of the extensions that are currently under way or being considered for Juicer.

The RWTH Aachen University Open Source Speech Recognition System

by David Rybach, Christian Gollan, Georg Heigold, Björn Hoffmeister, Jonas Lööf, Ralf Schlüter, Hermann Ney
Cited by 3 (1 self)
We announce the public availability of the RWTH Aachen University speech recognition toolkit. The toolkit includes state-of-the-art speech recognition technology for acoustic model training and decoding. Speaker adaptation, speaker adaptive training, unsupervised training, a finite state automata library, and an efficient tree search decoder are notable components. Comprehensive documentation, example setups for training and recognition, and a tutorial are provided to support newcomers. Index Terms: speech recognition, LVCSR, software

Automatic Sentence Structure Annotation for Spoken Language Processing

by Dustin Lundring Hillard, 2008
Cited by 1 (0 self)
Increasing amounts of easily available electronic data are precipitating a need for automatic processing that can aid humans in digesting large amounts of data. Speech and video are becoming an increasingly significant portion of on-line information, from news and television broadcasts, to oral histories, on-line lectures, or user generated content. Automatic processing of audio and video sources requires automatic speech recognition (ASR) in order to provide transcripts. Typical ASR generates only words, without punctuation, capitalization, or further structure. Many techniques available from natural language processing therefore suffer when applied to speech recognition output, because they assume the presence of reliable punctuation and structure. In addition, errors from automatic transcription also degrade the performance of downstream processing such as machine translation, name detection, or information retrieval. We develop approaches for automatically annotating structure in speech, including sentence and sub-sentence segmentation, and then turn towards optimizing ASR and annotation for downstream applications.

Compositions of Top-down Tree Transducers with ε-rules

by Andreas Maletti, Heiko Vogler
Cited by 1 (1 self)
Abstract. Top-down tree transducers with ε-rules (εtdtt) are a restricted version of extended top-down tree transducers. They are implemented in the framework Tiburon and fulfill some criteria desirable in a machine translation model. However, they compute a class of transformations that is not closed under composition (not even for linear and nondeleting εtdtt). A composition construction that composes εtdtt M and N is presented. It is correct whenever (i) M has at most one output symbol in each rule, (ii) M is deterministic or N is linear, and (iii) M is total or N is nondeleting. This corresponds nicely to a classical composition result by Baker.

unknown title

by unknown authors, 2007
Strengths and weaknesses of finite-state technology: a case study in morphological grammar development

Signal Speech

by David Rybach, Lehrstuhl für Informatik
  • Further development by several PhD students at i6
  • Today: standard system for all ASR research topics and projects
  • Very flexible and extendable
  • Framework also used for machine translation, video / image processing