Many of the processing steps in natural language engineering can be performed using finite-state transducers. An optimal way to create such transducers is to compile them from regular expressions. This paper is an introduction to the regular expression calculus, extended with certain operators that have proved very useful in natural language applications ranging from tokenization to light parsing. The examples in the paper illustrate in concrete detail some of these applications.