Abstract:
In recent years there have been significant advances in the field of Unsupervised Grammar Inference (UGI) for natural languages such as English or Dutch. This paper presents a broad range of UGI implementations, showing how the theory has been put into practice. Several mature systems are emerging, built using complex models and capable of deriving natural language grammatical phenomena. The systems are classified into four groups: models based on Categorial Grammar (GraSp, CLL, EMILE); Memory Based Learning models (FAMBL, RISE); evolutionary computing models (ILM, LAgts); and string-pattern searches (ABL, GB). An objectively measurable statistical comparison of the performance of the systems reviewed is not yet feasible. However, their merits and shortfalls are discussed, along with the likely future directions of UGI.