Incremental Construction of Minimal Acyclic Finite State Automata and Transducers
, 1998
Cited by 55 (6 self)
In this paper, we describe a new method for constructing minimal, deterministic, acyclic finite state automata and transducers. Traditional methods consist of two steps. The first one is to construct a trie, the second one to perform minimization. Our approach is to construct an automaton in a single step by adding new strings one by one and minimizing the resulting automaton on-the-fly. We present a general algorithm as well as a specialization that relies upon the lexicographical sorting of the input strings.
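The sorted-input specialization mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the names `State`, `build_minimal_automaton`, `accepts` and `count_states` are hypothetical, and the sketch assumes the input is sorted and duplicate-free. The idea is that once a new word diverges from the previous one, the tail of the previous word can no longer change, so it can be merged with equivalent states kept in a register.

```python
class State:
    """One automaton state: outgoing edges plus a final-state flag."""

    def __init__(self):
        self.edges = {}      # symbol -> State; insertion order is sorted order
        self.final = False

    def signature(self):
        # States are equivalent when they agree on finality and on every
        # (symbol, target) pair; targets are already canonical states.
        return (self.final, tuple((c, id(t)) for c, t in self.edges.items()))


def build_minimal_automaton(sorted_words):
    """Build a minimal acyclic DFA from sorted, duplicate-free words."""
    root = State()
    register = {}            # signature -> canonical state
    previous = None

    def replace_or_register(state):
        # Minimize the chain hanging off the most recently added edge.
        symbol, child = next(reversed(state.edges.items()))
        if child.edges:
            replace_or_register(child)
        state.edges[symbol] = register.setdefault(child.signature(), child)

    for word in sorted_words:
        if previous is not None and word <= previous:
            raise ValueError("input must be sorted and free of duplicates")
        # Follow the prefix shared with the words inserted so far.
        state, i = root, 0
        while i < len(word) and word[i] in state.edges:
            state = state.edges[word[i]]
            i += 1
        # The tail of the previous word is now frozen: minimize it.
        if state.edges:
            replace_or_register(state)
        # Append the remaining suffix as a fresh chain of states.
        for symbol in word[i:]:
            nxt = State()
            state.edges[symbol] = nxt
            state = nxt
        state.final = True
        previous = word

    if root.edges:
        replace_or_register(root)
    return root


def accepts(root, word):
    """Return True if the automaton rooted at `root` accepts `word`."""
    state = root
    for symbol in word:
        state = state.edges.get(symbol)
        if state is None:
            return False
    return state.final


def count_states(root):
    """Count the distinct states reachable from `root`."""
    seen, stack = set(), [root]
    while stack:
        state = stack.pop()
        if id(state) not in seen:
            seen.add(id(state))
            stack.extend(state.edges.values())
    return len(seen)


root = build_minimal_automaton(["tap", "taps", "top", "tops"])
print(count_states(root), accepts(root, "tops"), accepts(root, "to"))
```

For these four words a trie would need eight states; here the branches after "ta" and "to" share one tail, so five states remain.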
Fast String Correction with Levenshtein-Automata
 INTERNATIONAL JOURNAL OF DOCUMENT ANALYSIS AND RECOGNITION
, 2002
Cited by 36 (5 self)
The Levenshtein-distance between two words is the minimal number of insertions, deletions or substitutions that are needed to transform one word into the other. Levenshtein-automata of degree n for a word W are defined as finite state automata that recognize the set of all words V where the Levenshtein-distance between V and W does not exceed n. We show how to compute, for any fixed bound n and any input word W, a deterministic Levenshtein-automaton of degree n for W in time linear in the length of W. Given an electronic dictionary that is implemented in the form of a trie or a finite state automaton, the Levenshtein-automaton for W can be used to control search in the lexicon in such a way that exactly the lexical words V are generated where the Levenshtein-distance between V and W does not exceed the given bound. This leads to a very fast method for correcting corrupted input words of unrestricted text using large electronic dictionaries. We then introduce a second method that avoids the explicit computation of Levenshtein-automata and leads to even improved efficiency. We also describe how to extend both methods to variants of the Levenshtein-distance where further primitive edit operations (transpositions, merges and splits) may be used.
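The idea of letting the distance bound control the search in a lexicon trie can be illustrated with a short Python sketch. It does not construct a Levenshtein-automaton as the paper does. As a plainly simpler substitute, it carries one row of the classic dynamic-programming distance table down the trie and prunes any branch whose row already exceeds the bound. The names `build_trie`, `fuzzy_lookup` and `END` are hypothetical.

```python
END = ""  # end-of-word marker; cannot clash with a one-character edge label


def build_trie(words):
    """Build a nested-dict trie; the key END marks a complete lexicon word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def fuzzy_lookup(trie, query, bound):
    """Return the lexicon words within Levenshtein distance `bound` of `query`."""
    matches = []

    def visit(node, prefix, row):
        # row[j] is the distance between `prefix` and query[:j].
        if END in node and row[-1] <= bound:
            matches.append(prefix)
        for ch, child in node.items():
            if ch == END:
                continue
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(min(
                    new_row[j - 1] + 1,  # skip a query character
                    row[j] + 1,          # skip a lexicon character
                    row[j - 1] + cost,   # match or substitute
                ))
            # Row minima never decrease on the way down, so a row whose
            # minimum exceeds the bound cannot lead to any match.
            if min(new_row) <= bound:
                visit(child, prefix + ch, new_row)

    visit(trie, "", list(range(len(query) + 1)))
    return sorted(matches)


lexicon = build_trie(["automata", "automate", "automaton", "tomato", "trie"])
print(fuzzy_lookup(lexicon, "automatn", 1))
```

Each trie edge costs time proportional to the query length in this sketch, whereas the precomputed deterministic automaton described in the abstract is meant to avoid that per-edge work.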
A Taxonomy of Algorithms for Constructing Minimal Acyclic Deterministic Finite Automata
 Proc. Workshop on Implementing Automata
, 1999
Cited by 7 (1 self)
In this paper, we present a taxonomy of algorithms for constructing minimal acyclic deterministic finite automata (MADFAs). MADFAs represent finite languages and are therefore useful in applications such as storing words for spell-checking, computer and biological virus searching, text indexing and XML tag lookup. In such applications, the automata can grow extremely large (with more than 10
Automata for nogood recording in constraint satisfaction problems
 In CP06 Workshop on the Integration of SAT and CP techniques
, 2006
Cited by 5 (1 self)
Nogood recording is a well-known technique for reducing the thrashing encountered by tree search algorithms. One of the most significant disadvantages of nogood recording has been its prohibitive space complexity. In this paper we attempt to mitigate this by using an automaton to compactly represent a set of nogoods. We demonstrate how nogoods can be propagated using a known algorithm for achieving generalised arc consistency. Our experimental results on a number of benchmark problems demonstrate the utility of our approach.
How to Squeeze a Lexicon
 Software Practice and Experience
, 2000
Cited by 5 (1 self)
Minimal acyclic deterministic finite automata (ADFAs) can be used as a compact representation of string sets with fast access time. Creating them with traditional algorithms of DFA minimization is a resource hog when a large number of strings is involved. This paper aims to popularize an efficient but little known algorithm for creating minimal ADFAs recognizing a finite language, developed independently by several authors. The algorithm is presented for three variants of ADFAs, its minor improvements are discussed, and minimal ADFAs are compared to competitive data structures.
Direct Construction of Minimal Acyclic Subsequential Transducers
, 2001
Cited by 5 (2 self)
This paper presents an algorithm for the direct building of a minimal acyclic subsequential transducer, which represents a finite relation given as a sorted list of words with their outputs. The algorithm constructs the minimal transducer directly without constructing intermediate tree-like or pseudo-minimal transducers. In NLP applications our algorithm provides significantly better efficiency than the other algorithms for building minimal transducers for large-scale natural language dictionaries. Some experimental comparisons are presented at the end of the paper.
Experimental Study Of Finite Automata Storing Static Lexicons
, 1999
Cited by 2 (0 self)
Minimal acyclic deterministic finite automata (DFAs) can be used as a compact representation of string sets with fast access time. Creating them with traditional algorithms of DFA minimization is a resource hog when a large number of words is involved. We present an efficient algorithm for creating various forms of DFAs recognizing a finite language and compare the DFAs to other techniques.
A new algorithm for the construction of minimal acyclic DFAs
 www.elsevier.com/locate/scico
We present a semi-incremental algorithm for constructing minimal acyclic deterministic finite automata. Such automata are useful for storing sets of words for spell-checking, among other applications. The algorithm is semi-incremental because it maintains the automaton in nearly minimal condition and requires a final minimization step after the last word has been added (during construction). The algorithm derivation proceeds formally (with correctness arguments) from two separate algorithms, one for minimization and one for adding words to acyclic automata. The algorithms are derived in such a way as to be combinable, yielding a semi-incremental one. In practice, the algorithm is both easy to implement and displays good running time performance.
Software—Practice and Experience 2001; 31(11):1077–1090
, 2002
Minimal acyclic deterministic finite automata (ADFAs) can be used as a compact representation of finite string sets with fast access time. Creating them with traditional algorithms of DFA minimization is a resource hog when a large collection of strings is involved. This paper aims to popularize an efficient but little known algorithm for creating minimal ADFAs recognizing a finite language, invented independently by several authors. The algorithm is presented for three variants of ADFAs, its minor improvements are discussed, and minimal ADFAs are compared to competitive data structures. KEY WORDS: static lexicon; static dictionary; trie compaction; directed acyclic graph; acyclic finite automaton
PROOFING TOOLS TECHNOLOGY AT NEUROSOFT S.A.
The aim of this paper is to present the R&D activities carried out at Neurosoft S.A. regarding the development of proofing tools for Modern Greek. Firstly, we focus on infrastructure issues that we faced during our initial steps. Subsequently, we describe the most important insights of three proofing tools developed by Neurosoft, i.e. the spelling checker, the hyphenator and the thesaurus, outlining their efficiencies and inefficiencies. Finally, we discuss some improvement ideas and give our future directions.