Results 1 - 6 of 6
Applications of Finite Automata Representing Large Vocabularies
, 1992
Cited by 19 (2 self)
The construction of minimal acyclic deterministic partial finite automata to represent large natural language vocabularies is described. Applications of such automata include spelling checkers and advisers, multilanguage dictionaries, thesauri, minimal perfect hashing and text compression. Part of this research was supported by a grant awarded by the Brazilian National Council for Scientific and Technological Development (CNPq) to the second author. Authors' Address: Cláudio L. Lucchesi and Tomasz Kowaltowski, Department of Computer Science, University of Campinas, Caixa Postal 6065, 13081 Campinas, SP, Brazil. E-mail: lucchesi@dcc.unicamp.br and tomasz@dcc.unicamp.br. 1 Introduction The use of finite automata (see for instance [5]) to represent sets of words is a well-established technique. Perhaps the most traditional application is found in compiler construction where such automata can be used to model and implement efficient lexical analyzers (see [1]). Applications of finit...
Non-word identification or spell checking without a dictionary
- Journal of the American Society for Information Science and Technology
, 2004
Cited by 2 (0 self)
MEDLINE® is a collection of more than 12 million references and abstracts covering recent life science literature. With its continued growth and cutting-edge terminology, spell-checking with a traditional lexicon-based approach requires significant additional manual follow-up. In this work, an internal corpus-based context quality rating α, frequency, and simple misspelling transformations are used to rank words from most likely to be misspellings to least likely. Eleven-point average precisions of 0.891 have been achieved within a class of 42,340 all-alphabetic words having an α score less than 10. Our models predict that 16,274 or 38% of these words are misspellings. Based on test data, this result has a recall of 79% and a precision of 86%. In other words, spell checking can be done by statistics instead of with a dictionary. As an application we examine the time history of low-α words in MEDLINE® titles and abstracts.
Software Synthesis via Domain-Specific Software Architectures
, 1992
Cited by 2 (0 self)
Current software engineering practice concentrates on improving the process by which a programmer develops a solution from the description of a problem; we describe a new paradigm for software synthesis based on Domain-Specific Software Architectures (DSSAs) that eliminates this process entirely. A DSSA provides an overall software design that solves a whole class of problems in a broad area. It focuses the designer's attention on the unique requirements of the current problem, suppressing those that are common to all problems of the type addressed by that DSSA. To use the DSSA approach, a software engineer provides a description of the unique requirements of a particular problem. A solution to that problem is then generated according to the DSSA's overall design by a system that implements the DSSA. Problem descriptions are checked for consistency by the system, and the generated software is guaranteed to solve the problem described. We briefly describe how we have used the DSSA approa...
Finite Automata and Efficient Lexicon Implementation
, 1988
Cited by 1 (0 self)
We describe a general technique for the encoding of lexical functions --- such as lexical classification, gender and number marking, inflections and conjugations --- using minimized acyclic finite-state automata. This technique has been used to store a Portuguese lexicon with over 2 million entries in about 1 megabyte. Unlike general file compression schemes, this representation allows random access to the stored data. Moreover it allows the lexical functions and their inverses to be computed at negligible cost. The technique can be easily adapted to practically any language or lexical classification scheme, and this task does not require any knowledge of the programs or data structures. 1 Introduction A minimized acyclic finite automaton provides an efficient technique for storing and retrieving a finite set of strings over a finite alphabet. The most obvious usage, within the domain of natural language processing, is the representation of the vocabulary of a language, without any ad...
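As a rough illustration of the idea in the abstract above, the following sketch stores a small word list in a trie and then merges equivalent states to obtain a minimal acyclic automaton. It is not the authors' construction: the trie-then-merge approach, the `State` class, and the helper names `build_trie`, `minimize`, `accepts` and `count_states` are all assumptions made for this example, and the English word list is invented.

```python
from typing import Dict, Iterable, Tuple


class State:
    """One automaton state: outgoing transitions plus a 'final' flag."""

    def __init__(self) -> None:
        self.edges: Dict[str, "State"] = {}
        self.final = False


def build_trie(words: Iterable[str]) -> State:
    """Build a plain (unminimized) trie accepting exactly the given words."""
    root = State()
    for word in words:
        node = root
        for ch in word:
            node = node.edges.setdefault(ch, State())
        node.final = True
    return root


def minimize(node: State, registry: Dict[Tuple, State]) -> State:
    """Merge equivalent states bottom-up and return the canonical state.

    Two states are equivalent when they agree on finality and their
    transitions lead, letter by letter, to the same canonical states.
    """
    for ch, child in list(node.edges.items()):
        node.edges[ch] = minimize(child, registry)
    signature = (
        node.final,
        tuple(sorted((ch, id(target)) for ch, target in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def accepts(root: State, word: str) -> bool:
    """Follow the word's letters from the root; accept on a final state."""
    node = root
    for ch in word:
        if ch not in node.edges:
            return False
        node = node.edges[ch]
    return node.final


def count_states(root: State) -> int:
    """Count distinct states reachable from the root."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.edges.values())
    return len(seen)


# Hypothetical word list with shared prefixes and suffixes.
words = ["walk", "walks", "walked", "talk", "talks", "talked"]
trie_states = count_states(build_trie(words))
dawg = minimize(build_trie(words), {})
dawg_states = count_states(dawg)
print(trie_states, dawg_states)  # prints "15 7"
```

Because lookup only follows one transition per letter, membership tests stay direct after minimization, which is the random-access property that distinguishes this representation from general file compression.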
Assignment as the Sole Means of Updating Objects
, 1994
this paper, no differentiation is made between functions and procedures. All programs are called routines, some of these may return values.
Software---Practice And Experience, Vol. 24(9), 835--870 (september 1994)

