Results 1-10 of 97
The Design Principles of a Weighted Finite-State Transducer Library
THEORETICAL COMPUTER SCIENCE, 2000
Cited by 99 (23 self)
We describe the algorithmic and software design principles of an object-oriented library for weighted finite-state transducers. By taking advantage of the theory of rational power series, we were able to achieve high degrees of generality, modularity and irredundancy, while attaining competitive efficiency in demanding speech processing applications involving weighted automata of more than 10^7 states and transitions. Besides its mathematical foundation, the design also draws from important ideas in algorithm design and programming languages: dynamic programming and shortest-paths algorithms over general semirings, object-oriented programming, lazy evaluation and memoization.
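As a rough illustration of "shortest-paths algorithms over general semirings" combined with memoization, the sketch below computes a generic shortest distance on a small acyclic graph. It is not the library's API: the names `Semiring`, `shortest_distance`, `TROPICAL`, `REAL`, and the example graphs are all hypothetical, and the code assumes an acyclic graph so that plain memoized recursion terminates.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two operations and their identities."""
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float


# Tropical semiring (min, +): the generic distance is the usual shortest path.
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)
# Real semiring (+, *): the generic distance is the total weight of all paths.
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)

Arcs = Dict[int, List[Tuple[int, float]]]


def shortest_distance(arcs: Arcs, source: int, target: int, sr: Semiring) -> float:
    """Semiring sum, over all source-to-target paths, of the semiring
    product of the arc weights. Assumes the graph is acyclic."""

    @lru_cache(maxsize=None)  # memoization: each state is evaluated once
    def dist(state: int) -> float:
        if state == target:
            return sr.one
        total = sr.zero
        for nxt, weight in arcs.get(state, []):
            total = sr.plus(total, sr.times(weight, dist(nxt)))
        return total

    return dist(source)


# Hypothetical example graphs: state -> list of (next state, arc weight).
COST_ARCS: Arcs = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 2.0)]}
PROB_ARCS: Arcs = {0: [(1, 0.5), (2, 0.5)], 1: [(2, 0.5)]}

print(shortest_distance(COST_ARCS, 0, 2, TROPICAL))  # 3.0
print(shortest_distance(PROB_ARCS, 0, 2, REAL))      # 0.75
```

The same function body serves both cases; only the semiring argument changes, which is the kind of generality the abstract refers to.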
A Rational Design for a Weighted Finite-State Transducer Library
LECTURE NOTES IN COMPUTER SCIENCE, 1998
A Spectral Algorithm for Learning Hidden Markov Models
Cited by 56 (3 self)
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard; practitioners typically resort to search heuristics (such as the Baum-Welch / EM algorithm) which suffer from the usual local optima issues. We prove that under a natural separation condition (roughly analogous to those considered for learning mixture models), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations; it implicitly depends on this number through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing where the space of observation is sometimes the words in a language. The algorithm is also simple: it employs only a singular value decomposition and matrix multiplications.
Minor Identities For Quasi-Determinants And Quantum Determinants
COMM. MATH. PHYS, 1994
Cited by 40 (4 self)
We present several identities involving quasi-minors of noncommutative generic matrices. These identities are specialized to quantum matrices, yielding q-analogues of various classical determinantal formulas.
The Ring of k-Regular Sequences
1992
Cited by 39 (7 self)
The automatic sequence is the central concept at the intersection of formal language theory and number theory. It was introduced by Cobham, and has been extensively studied by Christol, Kamae, Mendes France and Rauzy, and other writers. Since the range of automatic sequences is finite, however, their descriptive power is severely limited.
Weighted automata and weighted logics
In Automata, Languages and Programming – 32nd International Colloquium, ICALP 2005, 2005
Cited by 39 (7 self)
Weighted automata are used to describe quantitative properties in various areas such as probabilistic systems, image compression, and speech-to-text processing. The behaviour of such an automaton is a mapping, called a formal power series, assigning to each word a weight in some semiring. We generalize Büchi’s and Elgot’s fundamental theorems to this quantitative setting. We introduce a weighted version of MSO logic and prove that, for commutative semirings, the behaviours of weighted automata are precisely the formal power series definable with our weighted logic. We also consider weighted first-order logic and show that aperiodic series coincide with the first-order definable ones, if the semiring is locally finite, commutative and has some aperiodicity property.
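To make the notion of "behaviour" concrete, here is a minimal sketch of how a weighted automaton assigns a semiring weight to a word: the weights along each accepting run are multiplied, and the results are summed over all runs. The function `series_weight` and the two-state automaton are hypothetical illustrations, not taken from the paper. The example uses the semiring of natural numbers with ordinary addition and multiplication, where the automaton's behaviour maps each word to its number of occurrences of `a`.

```python
import operator
from typing import Callable, Dict, Hashable, List, Tuple

State = Hashable
Arcs = Dict[Tuple[State, str], List[Tuple[State, int]]]


def series_weight(
    word: str,
    initial: Dict[State, int],
    delta: Arcs,
    final: Dict[State, int],
    plus: Callable[[int, int], int] = operator.add,
    times: Callable[[int, int], int] = operator.mul,
    zero: int = 0,
) -> int:
    """Weight of `word`: semiring sum over all runs of the semiring
    product of initial weight, arc weights and final weight."""
    # forward[q] = total weight of all runs on the prefix read so far ending in q
    forward = dict(initial)
    for letter in word:
        nxt: Dict[State, int] = {}
        for state, weight in forward.items():
            for target, arc_weight in delta.get((state, letter), []):
                nxt[target] = plus(nxt.get(target, zero), times(weight, arc_weight))
        forward = nxt
    total = zero
    for state, weight in forward.items():
        if state in final:
            total = plus(total, times(weight, final[state]))
    return total


# Hypothetical automaton over (N, +, *): each accepting run picks one 'a'
# at which to move from state 0 to state 1, so the weight is the count of 'a'.
INITIAL = {0: 1}
FINAL = {1: 1}
DELTA: Arcs = {
    (0, "a"): [(0, 1), (1, 1)],
    (0, "b"): [(0, 1)],
    (1, "a"): [(1, 1)],
    (1, "b"): [(1, 1)],
}

print(series_weight("abab", INITIAL, DELTA, FINAL))  # 2
print(series_weight("bbb", INITIAL, DELTA, FINAL))   # 0
```

Passing `min`, `operator.add` and `float("inf")` for `plus`, `times` and `zero` would evaluate the same automaton over the tropical semiring instead.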
Quantitative languages
Cited by 36 (14 self)
Quantitative generalizations of classical languages, which assign to each word a real number instead of a boolean value, have applications in modeling resource-constrained computation. We use weighted automata (finite automata with transition weights) to define several natural classes of quantitative languages over finite and infinite words; in particular, the real value of an infinite run is computed as the maximum, limsup, liminf, limit average, or discounted sum of the transition weights. We define the classical decision problems of automata theory (emptiness, universality, language inclusion, and language equivalence) in the quantitative setting and study their computational complexity. As the decidability of the language-inclusion problem remains open for some classes of weighted automata, we introduce a notion of quantitative simulation that is decidable and implies language inclusion. We also give a complete characterization of the expressive power of the various classes of weighted automata. In particular, we show that most classes of weighted
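The five value functions named in this abstract can be evaluated exactly when the infinite run is ultimately periodic, that is, a finite prefix followed by a cycle repeated forever. The sketch below does this under stated assumptions: integer weights, a discount factor strictly between 0 and 1, and the discounted sum defined as the sum of discount^i times the i-th weight. The function name `lasso_values` and the example run are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List


def lasso_values(prefix: List[int], cycle: List[int], discount: Fraction) -> Dict[str, Fraction]:
    """Values of the infinite weight sequence prefix . cycle . cycle . ..."""
    if not cycle:
        raise ValueError("cycle must contain at least one weight")
    if not 0 < discount < 1:
        raise ValueError("discount must lie strictly between 0 and 1")
    lam = Fraction(discount)
    # Discounted sum of the prefix part.
    head = sum((lam ** i * w for i, w in enumerate(prefix)), Fraction(0))
    # Discounted sum of one pass through the cycle, starting at exponent 0.
    one_cycle = sum((lam ** i * w for i, w in enumerate(cycle)), Fraction(0))
    # Geometric series over the infinitely many repetitions of the cycle.
    tail = lam ** len(prefix) * one_cycle / (1 - lam ** len(cycle))
    return {
        "max": Fraction(max(prefix + cycle)),   # largest weight anywhere on the run
        "limsup": Fraction(max(cycle)),         # largest weight seen infinitely often
        "liminf": Fraction(min(cycle)),         # smallest weight seen infinitely often
        "limavg": Fraction(sum(cycle), len(cycle)),  # long-run mean is the cycle mean
        "discounted": head + tail,
    }


# Hypothetical run with weights 5, 1, 3, 1, 3, ... and discount factor 1/2.
values = lasso_values([5], [1, 3], Fraction(1, 2))
print(values["discounted"])  # 20/3
```

Only the cycle matters for limsup, liminf and limit average, while the maximum and the discounted sum also depend on the prefix.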
Generalized Algorithms for Constructing Statistical Language Models
IN PROC. OF THE 41ST MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2003
Cited by 32 (3 self)
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in detail several new and efficient algorithms to address these more general problems and report experimental results demonstrating their usefulness. We give an algorithm for computing efficiently the expected counts of any sequence in a word lattice output by a speech recognizer or any arbitrary weighted automaton; describe a new technique for creating exact representations of n-gram language models by weighted automata whose size is practical for offline use even for a vocabulary size of about 500,000 words and an n-gram order n = 6; and present a simple and more general technique for constructing class-based language models that allows each class to represent an arbitrary weighted automaton. An efficient implementation of our algorithms and techniques has been incorporated in a general software library for language modeling, the GRM Library, that includes many other text and grammar processing functionalities.
Compositional Analysis of Expected Delays in Networks of Probabilistic I/O Automata
1998
Cited by 17 (8 self)
Probabilistic I/O automata (PIOA) constitute a model for distributed or concurrent systems that incorporates a notion of probabilistic choice. The PIOA model provides a notion of composition, for constructing a PIOA for a composite system from a collection of PIOAs representing the components. We present a method for computing completion probability and expected completion time for PIOAs. Our method is compositional, in the sense that it can be applied to a system of PIOAs, one component at a time, without ever calculating the global state space of the system (i.e. the composite PIOA). The method is based on symbolic calculations with vectors and matrices of rational functions, and it draws upon a theory of observables, which are mappings from delayed traces to real numbers that generalize the classical "formal power series" from algebra and combinatorics. Central to the theory is a notion of representation for an observable, which generalizes the classical notion of "linear representation" for formal power series. As in the classical case, the representable observables coincide with an abstractly defined class of "rational" observables; this fact forms the foundation of our method.
A Kleene theorem for weighted tree automata
Theory of Computing Systems, 2002
Cited by 17 (8 self)
In this paper we prove Kleene's result for tree series over a commutative and idempotent semiring A (which is not necessarily complete or continuous), i.e., the class of recognizable tree series over A and the class of rational tree series over A are equal. We show the result by direct automata-theoretic constructions and prove their correctness.