Finite-State Transducers in Language and Speech Processing
 Computational Linguistics
, 1997
Cited by 308 (41 self)
Finite-state machines have been used in various domains of natural language processing. We consider here the use of a type of transducers that supports very efficient programs: sequential transducers. We recall classical theorems and give new ones characterizing sequential string-to-string transducers. Transducers that output weights also play an important role in language and speech processing. We give a specific study of string-to-weight transducers, including algorithms for determinizing and minimizing these transducers very efficiently, and characterizations of the transducers admitting determinization and the corresponding algorithms. Some applications of these algorithms in speech recognition are described and illustrated.
Speech Recognition by Composition of Weighted Finite Automata
 FINITE-STATE LANGUAGE PROCESSING
, 1996
Cited by 124 (12 self)
We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.
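The composition operation mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes epsilon-free transducers, weights in the tropical semiring (weights add along a path), and a hypothetical dict-based representation chosen only for this example.

```python
from collections import defaultdict

def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Assumed, illustrative representation of a transducer:
      {"start": state,
       "finals": {state: final_weight},
       "arcs": {state: [(in_label, out_label, weight, next_state), ...]}}
    """
    start = (t1["start"], t2["start"])
    arcs = defaultdict(list)
    finals = {}
    stack, seen = [start], {start}
    while stack:
        q = stack.pop()
        q1, q2 = q
        # A pair state is final when both components are final.
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[q] = t1["finals"][q1] + t2["finals"][q2]
        # Match the output label of t1 against the input label of t2.
        for a, b, w1, n1 in t1["arcs"].get(q1, []):
            for b2, c, w2, n2 in t2["arcs"].get(q2, []):
                if b == b2:
                    nxt = (n1, n2)
                    arcs[q].append((a, c, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}

# Toy example: t1 maps "a" to "x" with weight 1.0, t2 maps "x" to "Y" with weight 2.0.
t1 = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("a", "x", 1.0, 1)]}}
t2 = {"start": 0, "finals": {1: 0.5}, "arcs": {0: [("x", "Y", 2.0, 1)]}}
composed = compose(t1, t2)
```

Only pair states reachable from the start pair are built, which is the same idea that allows composition to be carried out on demand during recognition.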
The Design Principles of a Weighted Finite-State Transducer Library
 THEORETICAL COMPUTER SCIENCE
, 2000
Cited by 99 (23 self)
We describe the algorithmic and software design principles of an object-oriented library for weighted finite-state transducers. By taking advantage of the theory of rational power series, we were able to achieve high degrees of generality, modularity and irredundancy, while attaining competitive efficiency in demanding speech processing applications involving weighted automata of more than 10^7 states and transitions. Besides its mathematical foundation, the design also draws from important ideas in algorithm design and programming languages: dynamic programming and shortest-paths algorithms over general semirings, object-oriented programming, lazy evaluation and memoization.
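The idea of shortest-paths algorithms over general semirings can be sketched as follows. This is an illustrative example, not the library's API: the `Semiring` class, the arc-list format, and the restriction to acyclic automata whose states are numbered in topological order are all assumptions made for brevity.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[float, float], float]   # combines alternative paths
    times: Callable[[float, float], float]  # extends a path by one arc
    zero: float                             # identity for plus
    one: float                              # identity for times

# Tropical semiring: (min, +) gives the cost of the cheapest path.
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)
# Real semiring: (+, *) gives the total probability mass of all paths.
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)

def shortest_distance(num_states, arcs, source, sr):
    """Single-source distances in an acyclic weighted automaton.

    Assumption: arcs are (src, dst, weight) triples with src < dst,
    so state numbers already form a topological order.
    """
    d = [sr.zero] * num_states
    d[source] = sr.one
    # Sorting by src guarantees d[src] is final before its arcs are relaxed.
    for src, dst, w in sorted(arcs):
        d[dst] = sr.plus(d[dst], sr.times(d[src], w))
    return d

arcs = [(0, 1, 0.5), (0, 2, 0.25), (1, 2, 0.5)]
tropical_d = shortest_distance(3, arcs, 0, TROPICAL)
real_d = shortest_distance(3, arcs, 0, REAL)
```

The same function computes two different quantities depending only on the semiring passed in, which is the kind of generality the abstract attributes to the library's design.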
A Rational Design for a Weighted Finite-State Transducer Library
 LECTURE NOTES IN COMPUTER SCIENCE
, 1998
Rational Transductions for Phonetic Conversion and Phonology
 Finite State Language Processing (MIT
, 1997
Cited by 9 (1 self)
Phonetic conversion, and other conversion problems related to phonetics, can be performed by finite-state tools. We present a finite-state conversion system, BiPho, based on transducers and bimachines, two mathematical notions borrowed from the theory of rational transductions. The linguistic data used by this system are described in a readable format and actual computation is efficient. With adequate data, BiPho constitutes the first comprehensive spelling-to-phonetics conversion system for French to take the form of transducers or bimachines.
A General Weighted Grammar Library
 IN PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON AUTOMATA (CIAA 2004
, 2004
Cited by 8 (4 self)
We present a general weighted grammar software library, the GRM Library, that can be used in a variety of applications in text, speech, and biosequence processing. The underlying algorithms were designed to support a wide variety of semirings and the representation and use of very large grammars and automata of several hundred million rules or transitions. We describe several algorithms and utilities of this library and point out in each case their application to several text and speech processing tasks.
String-Matching with Automata
, 1997
Cited by 8 (5 self)
We present an algorithm to search in a text for the patterns of a regular set. Unlike many classical algorithms, we assume that the input of the algorithm is a deterministic automaton and not a regular expression. Our algorithm is based on the notion of failure function and mainly consists of efficiently constructing a new deterministic automaton. This construction is shown to be efficient. In particular, its space complexity is linear in the size of the obtained automaton. Key words: Finite automata, pattern-matching, strings. CR Classification: F.1.1, F.2.0, F.2.2, F.4.3. 1. Introduction. Pattern-matching consists of finding the occurrences of a set of strings in a text. Two general approaches have been used to perform this task given a regular expression r describing the patterns. Both require a preprocessing stage, which consists of constructing an automaton representing the set described by the regular expression A*r, where A is the alphabet of the text. This automaton is th...
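The notion of failure function that this abstract relies on can be illustrated in its simplest setting. The sketch below is the classical single-pattern (Knuth-Morris-Pratt) case, swapped in for illustration only; it is not the paper's construction, which takes a deterministic automaton as input. The function names are hypothetical.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter border.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text,
    including overlapping ones."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep scanning for overlapping matches
    return hits
```

The failure links let the scan move through the text without ever re-reading a character, which is the property that automaton-based generalizations aim to preserve.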
Use of weighted finite state transducers in part of speech tagging
, 1997
Cited by 3 (1 self)
This paper addresses issues in part-of-speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part-of-speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques (linguistic and statistical) for word disambiguation, compounded with the notion of word classes.
Joint work with
, 1996
Text and speech processing: hard problems Theory of automata Appropriate level of abstraction