Automatic stochastic tagging of natural language texts (1995)
| Venue: | Computational Linguistics |
| Citations: | 48 - 4 self |
BibTeX
@ARTICLE{Dermatas95automaticstochastic,
author = {Evangelos Dermatas and George Kokkinakis},
title = {Automatic stochastic tagging of natural language texts},
journal = {Computational Linguistics},
year = {1995},
volume = {21},
pages = {137--163}
}
Abstract
Five language- and tagset-independent stochastic taggers, handling morphological and contextual information, are presented and tested on corpora of seven European languages (Dutch, English, French, German, Greek, Italian and Spanish), using two sets of grammatical tags: a small set containing the eleven main grammatical classes and a large set of grammatical categories common to all languages. Unknown words are tagged using an experimentally proven stochastic hypothesis that links the stochastic behavior of the unknown words with that of the less probable known words. A fully automatic training and tagging program has been implemented on an IBM PC-compatible 80386-based computer. Measurements of error rate, time response, and memory requirements have shown that the taggers' performance is satisfactory, even when only a small training text is available. The error rate improves when new texts are used to update the stochastic model parameters.
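The unknown-word hypothesis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `unknown_word_tag_distribution`, the `max_count` threshold, and the toy training text are all assumptions made for illustration. The sketch estimates a tag distribution for unknown words from the tags of known words that occur rarely in the training text.

```python
from collections import Counter


def unknown_word_tag_distribution(tagged_tokens, max_count=1):
    """Estimate P(tag | unknown word) from rare known words.

    Assumption: "less probable known words" is approximated here by
    words seen at most `max_count` times in the training text.
    """
    # Count how often each word form occurs in the training text.
    word_counts = Counter(word for word, _ in tagged_tokens)
    # Collect the tags of tokens whose word form is rare.
    rare_tag_counts = Counter(
        tag for word, tag in tagged_tokens if word_counts[word] <= max_count
    )
    total = sum(rare_tag_counts.values())
    if total == 0:
        return {}
    # Normalize the counts into relative frequencies.
    return {tag: count / total for tag, count in rare_tag_counts.items()}


# Toy training text, invented for illustration only.
training = [
    ("the", "ART"), ("dog", "NOUN"), ("barks", "VERB"),
    ("the", "ART"), ("cat", "NOUN"), ("sleeps", "VERB"),
    ("the", "ART"), ("old", "ADJ"), ("dog", "NOUN"),
]
dist = unknown_word_tag_distribution(training)
```

In this toy text, frequent words such as "the" contribute nothing to the estimate, so a closed-class tag like ART receives no probability mass for unknown words. This matches the intuition that unseen words tend to belong to open classes.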







