A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation
, 2000
Cited by 31 (1 self)
This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves accuracy rivaling the best previously published results.
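The idea of an ensemble whose members differ only in context-window size can be sketched as follows. This is a minimal illustration, not the paper's implementation: the symmetric window, the add-one smoothing, the majority vote, and the toy training sentences are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

class WindowNaiveBayes:
    """Naive Bayes over the words within `window` positions of the target word."""

    def __init__(self, window):
        self.window = window
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def _features(self, tokens, idx):
        # Co-occurring words inside the window, excluding the target itself.
        lo = max(0, idx - self.window)
        hi = idx + self.window + 1
        return [w for i, w in enumerate(tokens[lo:hi], start=lo) if i != idx]

    def fit(self, examples):
        for tokens, idx, sense in examples:
            self.sense_counts[sense] += 1
            for w in self._features(tokens, idx):
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, tokens, idx):
        total = sum(self.sense_counts.values())
        v = len(self.vocab) + 1  # extra slot reserves mass for unseen words
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + v
            score = math.log(n / total)
            for w in self._features(tokens, idx):
                # Add-one smoothing (an assumption of this sketch).
                score += math.log((self.word_counts[sense][w] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense

def ensemble_predict(members, tokens, idx):
    """Majority vote over the member classifiers (assumed combination rule)."""
    votes = Counter(m.predict(tokens, idx) for m in members)
    return votes.most_common(1)[0][0]

# Toy training data for the noun "line": (tokens, target index, sense label).
train = [
    ("the phone line was dead".split(), 2, "telephone"),
    ("she waited on the phone line".split(), 5, "telephone"),
    ("people stood in a long line".split(), 5, "queue"),
    ("the line of people was long".split(), 1, "queue"),
]
members = [WindowNaiveBayes(w).fit(train) for w in (1, 2, 5)]
prediction = ensemble_predict(members, "a long line of people".split(), 2)
```

Each member sees the same training examples but a different amount of context, so the members can disagree on a given instance; the vote resolves those disagreements.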
Applying System Combination to Base Noun Phrase Identification
- In Proceedings of COLING 2000
, 2000
Cited by 19 (3 self)
We use seven machine learning algorithms for one task: identifying base noun phrases. The results have been processed by different system combination methods and all of these outperformed the best individual result. We have applied the seven learners with the best combinator, a majority vote of the top five systems, to a standard data set and managed to improve the best published result for this data set.
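A majority vote over the outputs of several systems can be sketched as below. The label names, the five toy output sequences, and the tie-breaking rule (prefer the system listed first) are assumptions for illustration; the abstract does not specify them.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-token labels from several systems by majority vote.

    predictions: list of label sequences, one per system, all the same length.
    Ties go to the label proposed by the earliest-listed system (assumption).
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in system order, that reaches the highest count.
        combined.append(next(l for l in labels if counts[l] == top))
    return combined

# Hypothetical chunk labels from five systems for a four-token sentence.
system_outputs = [
    ["I", "I", "O", "B"],
    ["I", "O", "O", "B"],
    ["I", "I", "O", "I"],
    ["O", "I", "O", "B"],
    ["I", "I", "I", "B"],
]
combined = majority_vote(system_outputs)
```

With an odd number of systems and two labels a tie cannot occur, but with more labels it can, which is why the sketch fixes a tie-breaking order.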
Tiered Tagging and Combined Language Models Classifiers
, 1999
Cited by 18 (4 self)
We address the problem of morpho-syntactic disambiguation of arbitrary texts in a highly inflectional natural language. We use a large tagset (615 tags), EAGLES and MULTEXT compliant [5]. The large tagset is internally mapped onto a reduced one (82 tags), serving statistical disambiguation, and a text disambiguated in terms of this tagset is subsequently subject to a recovery process of all the information left out from the large tagset. This two-step process is called tiered tagging. To further improve the tagging accuracy we use a combined language models classifier, a procedure that interpolates the results of tagging the same text with several register-specific language models.
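The two tiers described above (tag with a reduced tagset, then recover the large-tagset reading) can be illustrated with a toy mapping. Everything here is hypothetical: the tag names, the mapping, the lexicon, and the lexicon-based recovery step are assumptions made for the sketch, not the paper's actual tagsets or procedure.

```python
# Toy tag names, mapping, and lexicon: all hypothetical, for illustration only.
LARGE_TO_REDUCED = {
    "N-fem-sg": "N",
    "N-masc-sg": "N",
    "V-3sg": "V",
    "V-3pl": "V",
}

# Large-tagset readings that a lexicon might list for each word form.
LEXICON = {
    "w1": ["N-fem-sg", "V-3sg"],
    "w2": ["N-fem-sg", "N-masc-sg"],
}

def reduced_candidates(word):
    """Reduced tags a statistical tagger would choose among for `word`."""
    return sorted({LARGE_TO_REDUCED[t] for t in LEXICON[word]})

def recover(word, reduced_tag):
    """Large-tagset readings of `word` that map onto the chosen reduced tag."""
    return [t for t in LEXICON[word] if LARGE_TO_REDUCED[t] == reduced_tag]

# Tier 1 picks a reduced tag (here simply given); tier 2 recovers the full tag.
unique = recover("w1", "N")
ambiguous = recover("w2", "N")
```

For `w1` the reduced tag determines the full reading; for `w2` two readings remain, so a real system would need an additional step to choose between them.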
Unsupervised Italian Word Sense Disambiguation using Wordnets And Unlabeled Corpora
- In Proceedings of SigLEX’02
, 2002
Cited by 3 (1 self)
This paper presents a novel method for unsupervised word sense disambiguation, which combines multiple information sources, including semantic relations, large unlabeled corpora, and cross-lingual distributional statistics.
Evaluating parts-of-speech taggers for use in a text-to-scene conversion system
- SAICSIT 2005 South African Institute of Computer Scientists and Information Technologists
, 2005
Cited by 2 (2 self)
This paper presents parts-of-speech tagging as a first step towards an autonomous text-to-scene conversion system. It categorizes some freely available taggers, according to the techniques used by each in order to automatically identify word-classes. In addition, the performance of each identified tagger is verified experimentally. The SUSANNE corpus is used for testing and reveals the complexity of working with different tagsets, resulting in substantially lower accuracies in our tests than in those reported by the developers of each tagger. The taggers are then grouped to form a voting system to attempt to raise accuracies, but in no cases do the combined results improve upon the individual accuracies. Additionally, a new metric, agreement, is tentatively proposed as an indication of confidence in the output of a group of taggers where such output cannot be validated.
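The abstract does not define the agreement metric. One plausible reading, assumed here purely for illustration, is the fraction of tokens on which every tagger in the group assigns the same tag; the tag names and outputs below are invented.

```python
def agreement(outputs):
    """Fraction of tokens on which all taggers assign the same tag.

    outputs: list of tag sequences, one per tagger, all the same length.
    This definition is an assumption; the paper may define the metric differently.
    """
    if not outputs or not outputs[0]:
        return 0.0
    unanimous = sum(1 for tags in zip(*outputs) if len(set(tags)) == 1)
    return unanimous / len(outputs[0])

# Hypothetical output of three taggers on a four-token sentence.
tagger_outputs = [
    ["DET", "NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "ADJ"],
    ["DET", "NOUN", "VERB", "ADV"],
]
score = agreement(tagger_outputs)
```

Under this reading the score needs no gold-standard annotation, which matches the stated use case of output that cannot be validated.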
Using Existing Systems to Supplement Small Amounts of Annotated Grammatical Relations Training Data
, 2000
Cited by 1 (0 self)
Grammatical relationships (GRs) form an important level of natural language processing, but different sets of GRs are useful for different purposes. Therefore, one may often only have time to obtain a small training corpus with the desired GR annotations. To boost the performance from using such a small training corpus on a transformation rule learner, we use existing systems that find related types of annotations.

