Results 1-10 of 30
Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Cited by 15 (4 self)
Many tasks in text and speech processing and computational biology require estimating functions mapping strings to real numbers. A broad class of such functions can be defined by weighted automata. Spectral methods based on the singular value decomposition of a Hankel matrix have recently been proposed for learning a probability distribution represented by a weighted automaton from a training sample drawn according to this same target distribution. In this paper, we show how spectral methods can be extended to the problem of learning a general weighted automaton from a sample generated by an arbitrary distribution. The main obstruction to this approach is that, in general, some entries of the Hankel matrix may be missing. We present a solution to this problem based on solving a constrained matrix completion problem. Combining these two ingredients, matrix completion and the spectral method, yields a whole new family of algorithms for learning general weighted automata. We present generalization bounds for a particular algorithm in this family. The proofs rely on a joint stability analysis of matrix completion and spectral learning.
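To make the Hankel matrix mentioned in this abstract concrete, the following is a minimal sketch, not the paper's algorithm. It builds a finite block of the Hankel matrix H[p][s] = f(ps) for a toy function f that counts occurrences of "a", and checks that the block has rank 2, which matches the two states a weighted automaton needs to compute this f. The helper names (strings_up_to, hankel_block, rank, count_a) are illustrative assumptions, and exact Gaussian elimination stands in for the singular value decomposition to keep the sketch dependency-free. Matrix completion for missing entries is not shown.

```python
from fractions import Fraction
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len, shortest first."""
    return ["".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]


def hankel_block(f, prefixes, suffixes):
    """Finite block of the Hankel matrix of f: H[p][s] = f(p + s)."""
    return [[Fraction(f(p + s)) for s in suffixes] for p in prefixes]


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def count_a(x):
    """Toy target function (an assumption for illustration): number of 'a's."""
    return x.count("a")


# Prefixes and suffixes: all strings over {a, b} of length at most 2.
basis = strings_up_to("ab", 2)
H = hankel_block(count_a, basis, basis)
hankel_rank = rank(H)
```

Every row of this block is the suffix-count vector plus a constant, so all rows lie in a two-dimensional space; a larger basis would not increase the rank for this particular f.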
Why Synchronous Tree Substitution Grammars?
Cited by 12 (8 self)
Synchronous tree substitution grammars are a translation model that is used in syntax-based machine translation. They are investigated in a formal setting and compared to a competitor that is at least as expressive. The competitor is the extended multi bottom-up tree transducer, which is the bottom-up analogue with one essential additional feature. This model has been investigated in theoretical computer science, but seems widely unknown in natural language processing. The two models are compared with respect to standard algorithms (binarization, regular restriction, composition, application). Particular attention is paid to the complexity of the algorithms.
Hierarchical Phrase-Based Translation Representations
Cited by 12 (4 self)
This paper compares several translation representations for a synchronous context-free grammar parse including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of target LM intersection and shortest-path algorithms that follow. Intersection, shortest path, FSA expansion and RTN replacement algorithms are presented for PDAs. Chinese-to-English translation experiments using HiFST and HiPDT, FSA and PDA-based decoders, are presented using admissible (or exact) search, possible for HiFST with compact SCFG rulesets and HiPDT with compact LMs. For large rulesets with large LMs, we introduce a two-pass search strategy which we then analyze in terms of search errors and translation performance.
Efficient Inference Through Cascades of Weighted Tree Transducers
, 2010
Cited by 11 (1 self)
Weighted tree transducers have been proposed as useful formal models for representing syntactic natural language processing applications, but there has been little description of inference algorithms for these automata beyond formal foundations. We give a detailed description of algorithms for application of cascades of weighted tree transducers to weighted tree acceptors, connecting formal theory with actual practice. Additionally, we present novel on-the-fly variants of these algorithms, and compare their performance on a syntax machine translation cascade based on (Yamada and Knight, 2001).
Distributed Optimal Planning: an Approach by Weighted Automata Calculus
Cited by 10 (5 self)
We consider a distributed system modeled as a possibly large network of automata. Planning in this system consists in selecting and organizing actions in order to reach a goal state in an optimal manner, assuming actions have a cost. To cope with the complexity of the system, we propose a distributed/modular planning approach. In each automaton or component, an agent explores local action plans that reach the local goal. The agents have to coordinate their search in order to select local plans that (1) can be assembled into a valid global plan and (2) ensure the optimality of this global plan. The proposed solution takes the form of a message passing algorithm, of peer-to-peer nature: no coordinator is needed. We show that local plan selections can be performed by combining operations on weighted languages, and then propose a more practical implementation in terms of weighted automata calculus.
Index Terms: factored planning, distributed planning, optimal planning, discrete event system, distributed constraint solving, distributed optimization, weighted automaton, K-automaton, string-to-weight transducer, formal language theory
More Than Words: Using Token Context to Improve Canonicalization Of Historical German
 JOURNAL FOR LANGUAGE TECHNOLOGY AND COMPUTATIONAL LINGUISTICS
, 2010
Cited by 6 (0 self)
Historical text presents numerous challenges for contemporary natural language processing techniques. In particular, the absence of consistent orthographic conventions in historical text presents difficulties for any system requiring reference to a static lexicon indexed by orthographic form. Canonicalization approaches seek to address these issues by associating one or more extant "canonical cognates" with each word of the input text and deferring application analysis to these canonical forms. Type-wise conflation techniques treating each input word in isolation often suffer from a pronounced precision-recall trade-off pattern: high-precision techniques such as conservative transliteration have comparatively poor recall, whereas high-recall techniques such as phonetic conflation tend to be disappointingly imprecise. In this paper, I present a technique for disambiguation of type conflation sets at the token level using a Hidden Markov Model whose lexical probability matrix is dynamically computed from the candidate conflations, and evaluate its performance on a manually annotated corpus of historical German.
Formatting Time-Aligned ASR Transcripts for Readability
 In Proceedings of the Human Language Technology: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics
, 2010
Cited by 5 (0 self)
We address the problem of formatting the output of an automatic speech recognition (ASR) system for readability, while preserving word-level timing information of the transcript. Our system enriches the ASR transcript with punctuation, capitalization and properly written dates, times and other numeric entities, and our approach can be applied to other formatting tasks. The method we describe combines hand-crafted grammars with a class-based language model trained on written text and relies on Weighted Finite State Transducers (WFSTs) for the preservation of start and end time of each word.
Decision Problems for Additive Regular Functions
, 2013
Cited by 3 (0 self)
Additive Cost Register Automata (ACRA) map strings to integers using a finite set of registers that are updated using assignments of the form “x := y + c” at every step. The corresponding class of additive regular functions has multiple equivalent characterizations, appealing closure properties, and a decidable equivalence problem. In this paper, we solve two decision problems for this model. First, we define the register complexity of an additive regular function to be the minimum number of registers that an ACRA needs to compute it. We characterize the register complexity by a necessary and sufficient condition regarding the largest subset of registers whose values can be made far apart from one another. We then use this condition to design a PSPACE algorithm to compute the register complexity of a given ACRA, and establish a matching lower bound. Our results also lead to a machine-independent characterization of the register complexity of additive regular functions. Second, we consider two-player games over ACRAs, where the objective of one of the players is to reach a target set while minimizing the cost. We show the corresponding decision problem to be EXPTIME-complete when the costs are nonnegative integers, but undecidable when the costs are integers.
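The register updates of the form x := y + c described in this abstract can be illustrated with a small simulator. This is a sketch under stated assumptions, not the paper's formal definition: the dictionary encoding, the run_acra function, the state names, and the per-state output of the form "register + constant" are all hypothetical choices made for illustration. The example machine uses two registers and outputs the number of a's if the input ends in "a", and the number of b's if it ends in "b".

```python
def run_acra(acra, word):
    """Simulate a deterministic ACRA (hypothetical encoding) on `word`.

    Each transition maps (state, symbol) to (next_state, updates), where
    updates[x] = (y, c) encodes the assignment x := y + c.
    """
    state = acra["initial"]
    regs = {r: 0 for r in acra["registers"]}  # all registers start at 0
    for symbol in word:
        next_state, updates = acra["delta"][(state, symbol)]
        # Assignments are simultaneous: every right-hand side reads the
        # register values from before this step.
        regs = {x: regs[y] + c for x, (y, c) in updates.items()}
        state = next_state
    out_reg, out_const = acra["output"][state]
    return regs[out_reg] + out_const


# Register x counts a's, register y counts b's.
ON_A = {"x": ("x", 1), "y": ("y", 0)}
ON_B = {"x": ("x", 0), "y": ("y", 1)}

example = {
    "initial": "q0",
    "registers": ["x", "y"],
    "delta": {
        (q, s): (target, upd)
        for q in ("q0", "qa", "qb")
        for s, target, upd in (("a", "qa", ON_A), ("b", "qb", ON_B))
    },
    # The state remembers the last symbol and selects which register to output.
    "output": {"q0": ("x", 0), "qa": ("x", 0), "qb": ("y", 0)},
}

ends_a = run_acra(example, "abba")  # ends in 'a': number of a's
ends_b = run_acra(example, "aab")   # ends in 'b': number of b's
empty = run_acra(example, "")
```

This function appears to need both registers, since either count may be requested at the end; that is the kind of question the register complexity in the abstract formalizes.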
Minimum error rate training semiring
 In Proceedings of the European Association for Machine Translation
, 2011
Cited by 3 (1 self)
Modern Statistical Machine Translation (SMT) systems make their decisions based on multiple information sources, which assess various aspects of the match between a source sentence and its possible translation(s). Tuning an SMT system consists in finding the right balance between these sources so as to produce the best possible output, and is usually achieved through Minimum Error Rate Training (MERT) (Och, 2003). In this paper, we recast the operations involved in MERT in terms of operations over a specific semiring, which, in particular, enables us to derive a simple and generic implementation of MERT over word lattices.
Failure transitions for Joint n-gram Models and G2P Conversion
Cited by 3 (0 self)
This work investigates two related issues in the area of WFST-based G2P conversion. The first is the impact that the approach utilized to convert a target word to an equivalent finite-state machine has on downstream decoding efficiency. The second issue considered is the impact that the approach utilized to represent the joint n-gram model via the WFST framework has on the speed and accuracy of the system. In the latter case two novel algorithms are proposed, which extend the work from [1] to enable the use of failure transitions with joint n-gram models. All solutions presented in this work are available as part of the open-source, BSD-licensed Phonetisaurus G2P toolkit [2].
Index Terms: G2P, WFST, model conversion