Exemplar-Based Word Sense Disambiguation: Some Recent Improvements
, 1997
Cited by 43 (3 self)
In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of k, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best k, we have obtained improved disambiguation accuracy on a large sense-tagged corpus first used in (Ng and Lee, 1996). The accuracy achieved by our improved exemplar-based classifier is comparable to the accuracy on the same data set obtained by the Naive-Bayes algorithm, which was reported in (Mooney, 1996) to have the highest disambiguation accuracy among seven state-of-the-art machine learning algorithms.
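The idea of picking k by cross-validation can be sketched as follows. This is a minimal illustration, not the paper's implementation: the overlap distance over symbolic features, the function names, and the fold assignment by index are all assumptions made for the example.

```python
from collections import Counter

def overlap_distance(a, b):
    """Number of feature positions where two examples disagree (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k):
    """Majority vote among the k training examples closest to the query."""
    nearest = sorted(train, key=lambda ex: overlap_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def choose_k(data, candidate_ks, folds=10):
    """Return (k, accuracy) for the candidate k with the best cross-validated accuracy."""
    best_k, best_acc = None, -1.0
    for k in candidate_ks:
        correct = 0
        for f in range(folds):
            # Every folds-th example, starting at f, is held out for testing.
            test = data[f::folds]
            train = [ex for i, ex in enumerate(data) if i % folds != f]
            correct += sum(knn_predict(train, x, k) == y for x, y in test)
        acc = correct / len(data)
        if acc > best_acc:  # strict '>' keeps the smallest k on ties
            best_k, best_acc = k, acc
    return best_k, best_acc
```

Each example here is a `(feature_tuple, sense_label)` pair; a call such as `choose_k(data, [1, 3, 5, 11, 21])` would return the selected k together with its held-out accuracy.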
Exploring automatic word sense disambiguation with decision lists and the Web
- Proceedings of the Semantic Annotation And Intelligent
, 2000
Cited by 35 (4 self)
The most effective paradigm for word sense disambiguation, supervised learning, seems to be stuck because of the knowledge acquisition bottleneck. In this paper we take an in-depth study of the performance of decision lists on two publicly available corpora and an additional corpus automatically acquired from the Web, using the fine-grained highly polysemous senses in WordNet. Decision lists are shown to be a versatile state-of-the-art technique. The experiments reveal, among other facts, that SemCor can be an acceptable (0.7 precision for polysemous words) starting point for an all-words system. The results on the DSO corpus show that for some highly polysemous words 0.7 precision seems to be the current state-of-the-art limit. On the other hand, independently constructed hand-tagged corpora are not mutually useful, and a corpus automatically acquired from the Web is shown to fail. Introduction Recent trends in word sense disambiguation (Ide & Veronis, 1998) show that ...
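For readers unfamiliar with decision lists, the general technique can be sketched as below. This is an illustrative version under stated assumptions, not the system evaluated in the paper: the log-odds weighting, the smoothing constant `alpha`, and the function names are choices made for the example.

```python
import math
from collections import Counter, defaultdict

def train_decision_list(examples, alpha=0.1):
    """Build (weight, feature, sense) rules, strongest first.

    examples: iterable of (set_of_features, sense) pairs.
    alpha: additive smoothing constant (an assumed value).
    """
    counts = defaultdict(Counter)  # feature -> sense -> count
    for features, sense in examples:
        for f in features:
            counts[f][sense] += 1
    rules = []
    for f, by_sense in counts.items():
        total = sum(by_sense.values())
        for sense, n in by_sense.items():
            # Smoothed log-odds of this sense versus all other senses.
            weight = math.log((n + alpha) / (total - n + alpha))
            rules.append((weight, f, sense))
    rules.sort(key=lambda r: r[0], reverse=True)
    return rules

def classify(rules, features, default=None):
    """Apply the first (highest-weight) rule whose feature is present."""
    for _weight, f, sense in rules:
        if f in features:
            return sense
    return default
```

Only the single strongest matching rule decides the sense, which is what distinguishes a decision list from a classifier that sums evidence over all features.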
A Classification Approach to Word Prediction
, 2000
Cited by 33 (8 self)
The eventual goal of a language model is to accurately predict the value of a missing word given its context. We present an approach to word prediction that is based on learning a representation for each word as a function of words and linguistic predicates in its context. This approach raises a few new questions that we address. First, in order to learn good word representations it is necessary to use an expressive representation of the context. We present a way that uses external knowledge to generate expressive context representations, along with a learning method capable of handling the large number of features generated this way that can, potentially, contribute to each prediction. Second, since the number of words "competing" for each prediction is large, there is a need to "focus the attention" on a smaller subset of these. We exhibit the contribution of a "focus of attention" mechanism to the performance of the word predictor. Finally, we describe a large scale experimental study in which the approach presented is shown to yield significant improvements in word prediction tasks.
A Sequential Model for Multi-Class Classification
- EMNLP ’01
, 2001
Cited by 32 (11 self)
Many classification problems require decisions among a large number of competing classes. These tasks, however, are not handled well by general purpose learning methods and are usually addressed in an ad-hoc fashion. We suggest a general approach – a sequential learning model that utilizes classifiers to sequentially restrict the number of competing classes while maintaining, with high probability, the presence of the true outcome in the candidate set. Some theoretical and computational properties of the model are discussed and we argue that these are important in NLP-like domains. The advantages of the model are illustrated in an experiment in part-of-speech tagging.
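The control flow of sequentially restricting a candidate class set can be sketched as follows. This is a toy illustration only: in the paper the stages are learned classifiers, whereas the two hand-written stages here (`by_suffix`, `by_previous_tag`), the tag set, and the input format are hypothetical stand-ins.

```python
TAGS = {"NN", "VB", "JJ", "RB", "DT"}  # toy tag set, assumed for the example

def by_suffix(x, cands):
    """Hypothetical stage: restrict tags using the word's suffix."""
    word = x["word"]
    if word.endswith("ly"):
        return cands & {"RB", "JJ"}
    if word.endswith("ing"):
        return cands & {"VB", "NN", "JJ"}
    return cands

def by_previous_tag(x, cands):
    """Hypothetical stage: restrict tags using the previous word's tag."""
    if x["prev_tag"] == "DT":
        return cands & {"NN", "JJ"}
    if x["prev_tag"] == "VB":
        return cands & {"RB", "NN", "DT"}
    return cands

def sequential_classify(x, candidates, stages):
    """Pass x through the stages, shrinking the candidate set each time."""
    current = set(candidates)
    for stage in stages:
        # A stage may only remove candidates; ignore it if it removes all.
        kept = set(stage(x, current)) & current
        if kept:
            current = kept
        if len(current) == 1:
            break
    return current
```

Ignoring a stage that would empty the set is one simple way to keep some candidate alive; it is a design choice of this sketch, not a property claimed by the paper.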
Word Translation Disambiguation Using Bilingual Bootstrapping
- COMPUTATIONAL LINGUISTICS
, 2002
Cited by 29 (2 self)
This paper proposes a new method for word translation disambiguation using a machine learning technique called 'Bilingual Bootstrapping'. Bilingual Bootstrapping makes use of, in learning, a small number of classified data and a large number of unclassified data in the source and the target languages in translation. It constructs classifiers in the two languages in parallel and repeatedly boosts the performance of the classifiers by further classifying data in each of the two languages and by exchanging between the two languages information regarding the classified data. Experimental results indicate that word translation disambiguation based on Bilingual Bootstrapping consistently and significantly outperforms the existing methods based on 'Monolingual Bootstrapping'.
Word sense disambiguation: a survey
- ACM COMPUTING SURVEYS
, 2009
Cited by 28 (9 self)
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.
Hierarchical Decision Lists for Word Sense Disambiguation
- Computers and the Humanities
, 1999
Cited by 27 (0 self)
This paper describes a supervised algorithm for word sense disambiguation based on hierarchies of decision lists. This algorithm supports a useful degree of conditional branching while minimizing the training data fragmentation typical of decision trees. Classifications are based on a rich set of collocational, morphological and syntactic contextual features, extracted automatically from training data and weighted sensitive to the nature of the feature and feature class. The algorithm is evaluated comprehensively in the senseval framework, achieving the top performance of all participating supervised systems on the 36 test words where training data is available. Keywords: word sense disambiguation, decision lists, supervised machine learning, lexical ambiguity resolution, senseval 1. Introduction Decision lists have been shown to be effective at a wide variety of lexical ambiguity resolution tasks including word sense disambiguation (Yarowsky, 1994, 1995; Mooney, 1996; Wilks and S...
Knowledge Sources for Word-Level Translation Models
- In Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing
, 2001
Cited by 26 (2 self)
We present various methods to train word-level translation models for statistical machine translation systems that use widely different knowledge sources ranging from parallel corpora and a bilingual lexicon to only monolingual corpora in two languages. Some novel methods are presented and previously published methods are reviewed. Also, a common evaluation metric enables the first quantitative comparison of these approaches.
Corpus-based Approaches to Semantic Interpretation in Natural . . .
, 1997
Cited by 26 (0 self)
This article is an introduction to some of the emerging research in the application of corpus-based learning techniques to problems in semantic interpretation. In particular, we focus on two important problems in semantic interpretation, namely, word-sense disambiguation and semantic parsing.
Naive Bayes and Exemplar-Based approaches to Word Sense Disambiguation Revisited
- In Proceedings of the 14th European Conference on Artificial Intelligence
, 2000
Cited by 23 (3 self)
This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to clarify some confusing information about the comparison between the two methods appearing in the related literature. In doing so, several directions have been explored, including testing several modifications of the basic learning algorithms and varying the feature space. Secondly, an improvement of both algorithms is proposed, in order to deal with large attribute sets. This modification, which basically consists of using only the positive information appearing in the examples, greatly improves the efficiency of the methods, with no loss in accuracy. The experiments have been performed on the largest sense-tagged corpus available, containing the most frequent and ambiguous English words. Results show that the Exemplar-based approach to WSD is generally superior to the Bayesian approach, especially when a specific metric for dealing with symbolic attributes is used.

