Results 1 - 10
of
112
A Maximum-Entropy-Inspired Parser
, 1999
"... We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "stan- dard" sections of ..."
Abstract
-
Cited by 671 (16 self)
- Add to MetaCart
We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "stan- dard" sections of the Wall Street Journal tree- bank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innova- tion is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that let us successfully to test and combine many different conditioning events. We also present some partial results showing the effects of different conditioning information, including a surprising 2% improvement due to guessing the lexical head's pre-terminal before guessing the lexical head.
Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging
- Computational Linguistics
, 1995
"... this paper, we will describe a simple rule-based approach to automated learning of linguistic knowledge. This approach has been shown for a number of tasks to capture information in a clearer and more direct fashion without a compromise in performance. We present a detailed case study of this learni ..."
Abstract
-
Cited by 662 (7 self)
- Add to MetaCart
this paper, we will describe a simple rule-based approach to automated learning of linguistic knowledge. This approach has been shown for a number of tasks to capture information in a clearer and more direct fashion without a compromise in performance. We present a detailed case study of this learning method applied to part of speech tagging
A Simple Rule-Based Part of Speech Tagger
, 1992
"... Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule- based methods. In this paper, we present a sim- ple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy coinparable ..."
Abstract
-
Cited by 433 (10 self)
- Add to MetaCart
Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule- based methods. In this paper, we present a sim- ple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy coinparable to stochastic taggers. The rule-based tagger has many advantages over these taggers, including: a vast reduction in stored information required, the perspicuity of a sinall set of meaningful rules, ease of finding and implementing improvements to the tagger, and better portability from one tag set, cor- pus genre or language to another. Perhaps the biggest contribution of this work is in demonstrating that the stochastic method is not the only viable method for part of speech tagging. The fact that a simple rule-based tagger that automatically learns its rules can perform so well should offer encouragement for researchers to further explore rule-based tagging, searching for a better and more expressive set of rule templates and other variations on the simple but effective theme described below.
Probabilistic Part-of-Speech Tagging Using Decision Trees
, 1994
"... In this paper, a new probabilistic tagging method is presented which avoids problems that Markov Model based taggers face, when they have to estimate transition probabilities from sparse data. In this tagging method, transition probabilities are estimated using a decision tree. Based on this method, ..."
Abstract
-
Cited by 413 (4 self)
- Add to MetaCart
In this paper, a new probabilistic tagging method is presented which avoids problems that Markov Model based taggers face, when they have to estimate transition probabilities from sparse data. In this tagging method, transition probabilities are estimated using a decision tree. Based on this method, a part-of-speech tagger (called TreeTagger) has been implemented which achieves 96.36 % accuracy on Penn-Treebank data which is better than that of a trigram tagger (96.06 %) on the same data. Keywords: Corpus-based NLP, Statistical NLP, Part-of-Speech Tagging. 1 Introduction Word forms are often ambiguous in their part-of-speech (POS). The English word form store for example can be either a noun, a finite verb or an infinitive. In an utterance, this ambiguity is normally resolved by the context of a word: e.g. in the sentence "The 1977 PCs could store two pages of data.", store can only be an infinitive. The predictability of the part-of-speech from the context is used by automatic part-...
A practical part-of-speech tagger
- IN PROCEEDINGS OF THE THIRD CONFERENCE ON APPLIED NATURAL LANGUAGE PROCESSING
, 1992
"... We present an implementation of a part-of-speech tagger based on a hidden Markov model. The methodology enables robust and accurate tagging with few resource requirements. Only a lexicon and some unlabeled training text are required. Accuracy exceeds 96%. We describe implementation strategies and op ..."
Abstract
-
Cited by 325 (5 self)
- Add to MetaCart
We present an implementation of a part-of-speech tagger based on a hidden Markov model. The methodology enables robust and accurate tagging with few resource requirements. Only a lexicon and some unlabeled training text are required. Accuracy exceeds 96%. We describe implementation strategies and optimizations which result in high-speed operation. Three applications for tagging are described: phrase recognition; word sense disambiguation; and grammatical function assignment.
Some advances in transformation-based part-of-speech tagging
- In Proceedings of the Twelfth National Conference on Artificial Intelligence
, 1994
"... Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of thousands of lexical and contextual probabilities. In (Brill 1992), a trainable rule-based tagger wa ..."
Abstract
-
Cited by 227 (1 self)
- Add to MetaCart
Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of thousands of lexical and contextual probabilities. In (Brill 1992), a trainable rule-based tagger was described that obtained performance comparable to that of stochastic taggers, but captured relevant linguistic information in a small number of simple non-stochastic rules. In this paper, we describe a number of extensions to this rule-based tagger. First, we describe a method for expressing lexical relations in tagging that stochastic taggers are currently unable to express. Next, we show a rule-based approach to tagging unknown words. Finally, we show how the tagger can be extended into a k-best tagger, where multiple tags can be assigned to words in some cases of uncertainty.
Tagging English Text with a Probabilistic Model
, 1994
"... In this paper we present some experiments on the use of a probabilistic model to tag English text, i.e. to assign to each word the correct tag (part of speech) in the context of the sentence. The main novelty of these experiments is the use of untagged text in the training of the model. We have used ..."
Abstract
-
Cited by 212 (0 self)
- Add to MetaCart
In this paper we present some experiments on the use of a probabilistic model to tag English text, i.e. to assign to each word the correct tag (part of speech) in the context of the sentence. The main novelty of these experiments is the use of untagged text in the training of the model. We have used a simple triclass Markov model and are looking for the best way to estimate the parameters of this model, depending on the kind and amount of training data provided. Two approaches in particular are compared and combined: using text that has been tagged by hand and computing relative frequency counts, using text without tags and training the model as a hidden Markov process, according to a Maximum Likelihood principle
Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars
- COMPUTATIONAL LINGUISTICS
, 1993
"... ..."
MBT: A Memory-Based Part of Speech Tagger-Generator
- PROC. OF FOURTH WORKSHOP ON VERY LARGE CORPORA
, 1996
"... We introduce a memory-based approach to part of speech tagging. Memory-based learning is a form of supervised learning based on similarity-based reasoning. The part of speech tag of a word in a particular context is extrapolated from the most similar cases held in memory. Supervised learning approac ..."
Abstract
-
Cited by 168 (47 self)
- Add to MetaCart
We introduce a memory-based approach to part of speech tagging. Memory-based learning is a form of supervised learning based on similarity-based reasoning. The part of speech tag of a word in a particular context is extrapolated from the most similar cases held in memory. Supervised learning approaches are useful when a tagged corpus is available as an example of the desired output of the tagger. Based on such a corpus, the tagger-generator automatically builds a tagger which is able to tag new text the same way, diminishing development time for the construction of a tagger considerably. Memory-based tagging shares this advantage with other statistical or machine learning approaches. Additional advantages specific to a memory-based approach include (i) the relatively small tagged corpus size sufficient for training, (ii) incremental learning, (iii) explanation capabilities, (iv) flexible integration of information in case representations, (v) its non-parametric nature, (vi) reasonably good results on unknown words without morphological analysis, and (vii) fast learning and tagging. In this paper we show that a large-scale application of the memory-based approach is feasible: we obtain a tagging accuracy that is on a par with that of known statistical approaches, ad with attractive space and time complexity properties when using IGTree, a tree-based formalism for indexing and searching huge case bases. The use of IGTree has as additional advantage that optimal context size for disambiguation is dynamically computed.

