Results 1 - 9 of 9
P. Plamondon. Towards Automatic Dictation System for Translators: the TransTalk Project
Proc ICSLP, 1994
Cited by 14 (2 self)
Professional translators often dictate their translations orally and have them typed afterwards. The TransTalk project aims at automating the second part of this process. Its originality as a dictation system lies in the fact that both the acoustic signal produced by the translator and the source text under translation are made available to the system. Probable translations of the source text can be predicted and these predictions used to help the speech recognition system in its lexical choices. We present the results of the first prototype, which show a marked improvement in the performance of the speech recognition task when translation predictions are taken into account.
Incremental construction of compact acyclic NFAs
in Proceedings of ACL 2001, 2001
Cited by 5 (3 self)
This paper presents and analyzes an incremental algorithm for the construction of Acyclic Nondeterministic Finite-state Automata (NFAs). Automata of this type are quite useful in computational linguistics, especially for storing lexicons. The proposed algorithm produces compact NFAs, i.e. NFAs that do not contain equivalent states. Unlike for Deterministic Finite-state Automata (DFAs), this property is not sufficient to ensure minimality, but the resulting NFAs are still considerably smaller than the minimal DFAs for the same languages.
How to Squeeze a Lexicon
Software Practice and Experience, 2000
Cited by 3 (1 self)
Minimal acyclic deterministic finite automata (ADFAs) can be used as a compact representation of string sets with fast access time. Creating them with traditional algorithms of DFA minimization is a resource hog when a large number of strings is involved. This paper aims to popularize an efficient but little known algorithm for creating minimal ADFAs recognizing a finite language, developed independently by several authors. The algorithm is presented for three variants of ADFAs, its minor improvements are discussed, and minimal ADFAs are compared to competitive data structures.
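As an illustration of the kind of construction this abstract refers to, here is a minimal Python sketch of one well-known incremental method that builds a minimal ADFA from a lexicographically sorted word list. It is not taken from the paper and may differ from its presentation; the names `State`, `build_minimal_adfa`, `accepts`, and `count_states` are hypothetical, and sorted input is an assumption of this sketch.

```python
from itertools import count

_ids = count()  # hypothetical source of unique state numbers


class State:
    def __init__(self):
        self.uid = next(_ids)
        self.edges = {}      # character -> State
        self.final = False

    def signature(self):
        # States are equivalent when they agree on finality and on their
        # (label, target) pairs; targets are already canonical when this is called.
        return (self.final,
                tuple(sorted((ch, s.uid) for ch, s in self.edges.items())))


def _replace_or_register(path, word, down_to, register):
    # Walk the tail of the previous word bottom-up, merging each state
    # with an equivalent registered state if one exists.
    for i in range(len(word), down_to, -1):
        child = path[i]
        canonical = register.setdefault(child.signature(), child)
        if canonical is not child:
            path[i - 1].edges[word[i - 1]] = canonical


def build_minimal_adfa(sorted_words):
    root = State()
    register = {}            # signature -> canonical State
    prev, path = "", [root]  # path[i] is the state reached by prev[:i]
    for word in sorted_words:
        if word < prev:
            raise ValueError("input must be sorted")
        k = 0                # length of the common prefix with prev
        while k < len(word) and k < len(prev) and word[k] == prev[k]:
            k += 1
        _replace_or_register(path, prev, k, register)
        del path[k + 1:]
        node = path[k]
        for ch in word[k:]:  # append the new suffix as a fresh chain
            node.edges[ch] = State()
            node = node.edges[ch]
            path.append(node)
        node.final = True
        prev = word
    _replace_or_register(path, prev, 0, register)
    return root


def accepts(root, word):
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


def count_states(root):
    seen, stack = set(), [root]
    while stack:
        s = stack.pop()
        if s.uid not in seen:
            seen.add(s.uid)
            stack.extend(s.edges.values())
    return len(seen)
```

For the sorted list tap, taps, top, tops, this sketch yields five states, against eight nodes in the corresponding trie, because the subtrees below "ta" and "to" are shared.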
Experimental Study Of Finite Automata Storing Static Lexicons
1999
Cited by 2 (0 self)
Minimal acyclic deterministic finite automata (DFAs) can be used as a compact representation of string sets with fast access time. Creating them with traditional algorithms of DFA minimization is a resource hog when a large number of words is involved. We present an efficient algorithm for creating various forms of DFAs recognizing a finite language and compare the DFAs to other techniques.
Reduction of NonDeterministic Automata for Hidden Markov Model Based Pattern Recognition Applications
In "Advances in Artificial Intelligence", 2003
Cited by 2 (2 self)
Most online cursive handwriting recognition systems use a lexical constraint to help improve the recognition performance. Traditionally, the vocabulary lexicon is stored in a trie (automaton whose underlying graph is a tree). In a previous paper, we showed that nondeterministic automata were computationally more efficient than tries. In this paper, we propose a new method for incrementally constructing small nondeterministic automata from lexicons. We present experimental results demonstrating a significant reduction in the number of labels in the automata. This reduction yields a proportional speedup in HMM-based, lexically constrained pattern recognition systems.
Shrinking Language Models by Robust Approximation
in Proc. IEEE Int'l. Conf. on Acoustics, Speech, and Signal Processing '98, 1998
Cited by 2 (2 self)
We study the problem of reducing the size of a language model while preserving recognition performance (accuracy and speed). A successful approach has been to represent language models by weighted finite-state automata (WFAs). Analogues of classical automata determinization and minimization algorithms then provide a general method to produce smaller but equivalent WFAs. We extend this approach by introducing the notion of approximate determinization. We provide an algorithm that, when applied to language models for the North American Business task, achieves 25-35% size reduction compared to previous techniques, with negligible effects on recognition time and accuracy.
1. INTRODUCTION
An important goal of language model engineering is to produce small language models that guarantee fast and accurate automatic speech recognition (ASR). In practice we see tradeoffs: e.g., in size vs. accuracy and in accuracy vs. speed. There has been recent progress, however, on automatic methods for r...
Rigorous Approximated Determinization of Weighted Automata
Cited by 2 (1 self)
A nondeterministic weighted finite automaton (WFA) maps an input word to a numerical value. Applications of weighted automata include formal verification of quantitative properties, as well as text, speech, and image processing. Many of these applications require the WFAs to be deterministic, or work substantially better when the WFAs are deterministic. Unlike NFAs, which can always be determinized, not all WFAs have an equivalent deterministic weighted automaton (DWFA). In [1], Mohri describes a determinization construction for a subclass of WFA. He also describes a property of WFAs (the twins property), such that all WFAs that satisfy the twins property are determinizable and the algorithm terminates on them. Unfortunately, many natural WFAs cannot be determinized. In this paper we study approximated determinization of WFAs. We describe an algorithm that, given a WFA A and an approximation factor t ≥ 1, constructs a DWFA A′ that t-determinizes A. Formally, for all words w ∈ Σ*, the value of w in A′ is at least its value in A and at most t times its value in A. Our construction involves two new ideas: attributing states in the subset construction by both upper and lower residues, and collapsing attributed subsets whose residues can be tightened. The larger the approximation factor is, the more attributed subsets we can collapse. Thus, t-determinization is helpful not only for WFAs that cannot be determinized, but also in cases where determinization is possible but results in automata that are too big to handle. In addition, t-determinization is useful for reasoning about the competitive ratio of online algorithms. We also describe a property (the t-twins property) and use it in order to characterize t-determinizable WFAs. Finally, we describe a polynomial algorithm for deciding whether a given WFA has the t-twins property. Index Terms: Weighted automata; Determinization.
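The approximation condition stated in this abstract (the value of w in A′ lies between its value in A and t times that value) can be checked on short words with a small Python sketch. This is only a bounded sanity check, not the paper's construction. It assumes min-plus semantics (a word's value is the cheapest accepting run), which the abstract does not spell out, and the dictionary layout and the names `wfa_value` and `t_approximates` are hypothetical.

```python
from itertools import product

INF = float("inf")


def wfa_value(wfa, word):
    """Value of `word` under assumed min-plus semantics:
    the minimum, over accepting runs, of the summed weights."""
    # frontier maps each reachable state to the cheapest cost of reaching it
    frontier = dict(wfa["initial"])
    for ch in word:
        nxt = {}
        for state, cost in frontier.items():
            for target, weight in wfa["delta"].get((state, ch), []):
                if cost + weight < nxt.get(target, INF):
                    nxt[target] = cost + weight
        frontier = nxt
    return min((cost + wfa["final"][q]
                for q, cost in frontier.items() if q in wfa["final"]),
               default=INF)


def t_approximates(approx, original, t, alphabet, max_len):
    """Check value_A(w) <= value_A'(w) <= t * value_A(w)
    for every word w of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            a = wfa_value(original, word)
            b = wfa_value(approx, word)
            if not (a <= b <= t * a):
                return False
    return True


# Hypothetical example: a two-state nondeterministic WFA over {a} whose
# value on a^n is min(n + 2, 2n), and a one-state deterministic candidate
# whose value on a^n is 2n.
original = {
    "initial": {"p": 0, "q": 0},
    "delta": {("p", "a"): [("p", 1)], ("q", "a"): [("q", 2)]},
    "final": {"p": 2, "q": 0},
}
candidate = {
    "initial": {"s": 0},
    "delta": {("s", "a"): [("s", 2)]},
    "final": {"s": 0},
}
```

On words up to length 8, the candidate passes the check for t = 2 but fails for t = 1.5, since on a^7 its value 14 exceeds 1.5 times the original value 9.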
A Fast Lexically Constrained Viterbi Algorithm For Online Handwriting Recognition
In Proc. 7th International Workshop on Frontiers in Handwriting Recognition, 2000
Most online cursive handwriting recognition systems use a lexical constraint to help improve the recognition performance. Traditionally, the vocabulary lexicon is stored in a trie (automaton whose underlying graph is a tree). In this paper, we propose a solution based on a more compact data structure, the directed acyclic word graph (DAWG). We show that our solution is equivalent to the traditional system. Moreover, we propose a number of heuristics to reduce the size of the DAWG and present experimental results demonstrating a significant improvement.
1 Introduction
Since the pioneering work of Vintsyuk [17] on Automatic Speech Recognition (ASR) systems, it is well known that Hidden Markov Models (HMM) [13] and Dynamic Programming (DP) [3], [12], provide a theoretical framework and practical algorithms for temporal pattern recognition with lexical constraints (even for large vocabularies). The techniques initially developed for ASR are also applicable to Handwriting Recogni...
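To make the trie-versus-DAWG size difference concrete, the following Python sketch builds a trie for a small lexicon and counts how many nodes remain once identical subtrees are shared, which is the basic idea behind a DAWG. It does not reproduce the paper's heuristics or its recognition system; the function names and the `END` marker are hypothetical, and the sketch assumes the marker character never occurs in a word.

```python
END = "$"  # hypothetical end-of-word marker; assumed absent from all words


def build_trie(words):
    # Nested dicts: each key is a character, END marks a complete word.
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def count_trie_nodes(node):
    # One node per distinct prefix, including the root.
    return 1 + sum(count_trie_nodes(child)
                   for ch, child in node.items() if ch != END)


def count_dawg_nodes(trie):
    # Give every distinct subtree shape one class number; subtrees that
    # accept the same set of suffixes collapse into a single DAWG node.
    classes = {}

    def canon(node):
        children = tuple(sorted((ch, canon(child))
                                for ch, child in node.items() if ch != END))
        signature = (END in node, children)
        return classes.setdefault(signature, len(classes))

    canon(trie)
    return len(classes)
```

For the lexicon walked, walking, talked, talking, the trie has 19 nodes while the shared structure has 9, because both words stems lead into the same "alk(ed|ing)" subgraph.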
AN OPTIMAL PATH CODING SYSTEM FOR DAWG LEXICON-HMM
Lexical constraints on the input of speech and online handwriting systems improve the performance of such systems. A significant gain in speed can be achieved by integrating in a digraph structure the different Hidden Markov Models (HMMs) corresponding to the words of the relevant lexicon. This integration avoids redundant computations by sharing intermediate results between HMMs corresponding to different words of the lexicon. In this paper, we introduce a token passing method that computes the a posteriori probabilities of all the words of the lexicon simultaneously. The coding scheme that we introduce for the tokens is optimal in the information theory sense: the tokens use the minimum possible number of bits. Overall, we simultaneously optimize the execution speed and the memory requirement of the recognition systems.