Co-occurrence retrieval: A flexible framework for lexical distributional similarity
Computational Linguistics, 2005
Cited by 28 (0 self)
Techniques that exploit knowledge of distributional similarity between words have been proposed in many areas of Natural Language Processing. For example, in language modeling, the sparse data problem can be alleviated by estimating the probabilities of unseen co-occurrences of events from the probabilities of seen co-occurrences of similar events. In other applications, distributional similarity is taken to be an approximation to semantic similarity. However, due to the wide range of potential applications and the lack of a strict definition of the concept of distributional similarity, many methods of calculating distributional similarity have been proposed or adopted. In this work, a flexible, parameterized framework for calculating distributional similarity is proposed. Within this framework, the problem of finding distributionally similar words is cast as one of co-occurrence retrieval (CR) for which precision and recall can be measured by analogy with the way they are measured in document retrieval. As will be shown, a number of popular existing measures of distributional similarity are simulated with parameter settings within the CR framework. In this article, the CR framework is then used to systematically investigate three fundamental questions concerning distributional similarity. First, is the relationship of lexical similarity necessarily symmetric, or are there advantages to be gained from considering it as an asymmetric relationship? Second, are some co-occurrences inherently more salient than others in the calculation of distributional similarity? Third, is it necessary to consider the difference in the extent to which each word occurs in each co-occurrence type? Two application-based tasks are used for evaluation: automatic thesaurus generation and pseudo-disambiguation. 
It is possible to achieve significantly better results on both these tasks by varying the parameters within the CR framework rather than using other existing distributional similarity measures; it will also be shown that any single unparameterized measure is unlikely to be able to do better on both tasks. This is due to an inherent asymmetry in lexical substitutability and therefore also in lexical distributional similarity.
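The idea of measuring precision and recall over co-occurrences can be illustrated with a minimal sketch. This is a simplified, unweighted, set-based reading of the retrieval analogy, not the parameterized framework from the article; the function name `cr_precision_recall` and the toy feature sets are hypothetical.

```python
def cr_precision_recall(target_features, candidate_features):
    """Score a candidate neighbour by how well its co-occurrences
    'retrieve' the co-occurrences of the target word.

    Assumption: each word is represented by a plain set of
    co-occurrence features, with no weighting of any kind.
    """
    target = set(target_features)
    candidate = set(candidate_features)
    shared = target & candidate
    # Precision: share of the candidate's features also seen with the target.
    precision = len(shared) / len(candidate) if candidate else 0.0
    # Recall: share of the target's features also seen with the candidate.
    recall = len(shared) / len(target) if target else 0.0
    return precision, recall


# Toy, made-up feature sets (verbs each noun occurs with).
dog = {"bark", "eat", "run", "sleep"}
animal = {"eat", "run", "sleep", "fly", "swim", "grow"}

print(cr_precision_recall(dog, animal))   # (0.5, 0.75)
print(cr_precision_recall(animal, dog))   # (0.75, 0.5)
```

Swapping the two words swaps precision and recall, which is one simple way to see how a retrieval view of similarity can be asymmetric.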
A Topic Model for Word Sense Disambiguation
2007
Cited by 16 (0 self)
We develop latent Dirichlet allocation with WordNet (LDAWN), an unsupervised probabilistic topic model that includes word sense as a hidden variable. We develop a probabilistic posterior inference algorithm for simultaneously disambiguating a corpus and learning the domains in which to consider each word. Using the WordNet hierarchy, we embed the construction of Abney and Light (1999) in the topic model and show that automatically learned domains improve WSD accuracy compared to alternative contexts.
Word sense disambiguation using label propagation based semi-supervised learning
Proceedings of the ACL, 2005
Cited by 12 (2 self)
Shortage of manually sense-tagged data is an obstacle to supervised word sense disambiguation (WSD) methods. In this paper we investigate a label propagation based semi-supervised learning algorithm for WSD, which combines unlabeled data with labeled data in the learning process by representing labeled and unlabeled examples as vertices in a weighted graph and iteratively propagating the label information from any vertex to nearby vertices until this process converges. This label propagation process realizes a global consistency assumption: similar examples should have similar labels. Our experimental results on benchmark corpora indicate that it consistently outperforms SVM when only very few labeled examples are available, and its performance is also better than monolingual bootstrapping, and comparable to bilingual bootstrapping.
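The graph-based propagation described in this abstract can be sketched roughly as follows. This is a generic label propagation loop, not the authors' implementation: the Gaussian edge weights, the `sigma` parameter, the row normalization, and the function name `label_propagation` are all assumptions made for illustration.

```python
import math


def label_propagation(points, labels, sigma=1.0, max_iter=1000, tol=1e-6):
    """points: list of numeric feature vectors (at least two).
    labels: class label for labeled examples, None for unlabeled ones.
    Returns one predicted label per point."""
    n = len(points)
    classes = sorted({y for y in labels if y is not None})
    k = len(classes)

    def weight(a, b):
        # Assumed similarity: Gaussian kernel on squared Euclidean distance.
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / sigma ** 2)

    # Weighted graph over all examples, without self-loops.
    W = [[weight(points[i], points[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Row-normalize so each vertex averages over its neighbours.
    T = [[w / sum(row) for w in row] for row in W]

    def one_hot(y):
        return [1.0 if y == c else 0.0 for c in classes]

    # Labeled vertices start at their label, unlabeled ones at uniform.
    Y = [one_hot(y) if y is not None else [1.0 / k] * k for y in labels]

    for _ in range(max_iter):
        # Each vertex takes the weighted average of its neighbours' labels.
        new_Y = [[sum(T[i][j] * Y[j][c] for j in range(n)) for c in range(k)]
                 for i in range(n)]
        # Clamp labeled vertices back to their known labels.
        for i, y in enumerate(labels):
            if y is not None:
                new_Y[i] = one_hot(y)
        delta = max(abs(new_Y[i][c] - Y[i][c])
                    for i in range(n) for c in range(k))
        Y = new_Y
        if delta < tol:  # stop once the label distributions settle
            break

    return [classes[max(range(k), key=lambda c: Y[i][c])] for i in range(n)]


# Toy data: two well-separated clusters with one labeled example each.
points = [[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]]
labels = [0, None, None, 1, None, None]
print(label_propagation(points, labels))  # [0, 0, 0, 1, 1, 1]
```

With only one labeled example per cluster, the unlabeled points inherit the label of the nearby labeled vertex, which is the "similar examples should have similar labels" assumption in miniature.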
SenseRelate::TargetWord – A generalized framework for word sense disambiguation
In Proceedings of the 20th National Conference on Artificial Intelligence, 2005
Cited by 8 (0 self)
Many words in natural language have different meanings when used in different contexts. SenseRelate::TargetWord is a Perl package that disambiguates a target word in context by finding the sense that is most related to its neighbors according to a WordNet::Similarity measure of relatedness.
Source-Language Entailment Modeling for Translating Unknown Terms
Cited by 6 (1 self)
This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translation. We further present a conceptual extension to prior work by allowing translations of entailed texts rather than paraphrases only. A method for performing this process efficiently is presented and applied to some 2500 sentences with unknown terms. Our experiments show that the proposed approach substantially increases the number of properly translated texts.
Classifying particle semantics in English verb-particle constructions
In Proceedings of the ACL-2006 Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, 2006
Cited by 4 (0 self)
Previous computational work on learning the semantic properties of verb-particle constructions (VPCs) has focused on their compositionality, and has left unaddressed the issue of which meaning of the component words is being used in a given VPC. We develop a feature space for use in classification of the sense contributed by the particle in a VPC, and test this on VPCs using the particle up. The features that capture linguistic properties of VPCs that are relevant to the semantics of the particle outperform linguistically uninformed word co-occurrence features in our experiments on unseen test VPCs.
Word sense disambiguation using sense examples automatically acquired from a second language
In Proceedings of HLT/EMNLP, 2005
Cited by 4 (0 self)
We present a novel almost-unsupervised approach to the task of Word Sense Disambiguation (WSD). We build sense examples automatically, using large quantities of Chinese text, and English-Chinese and Chinese-English bilingual dictionaries, taking advantage of the observation that mappings between words and meanings are often different in typologically distant languages. We train a classifier on the sense examples and test it on a gold standard English WSD dataset. The evaluation gives results that exceed previous state-of-the-art results for comparable systems. We also demonstrate that a little manual effort can improve the quality of sense examples, as measured by WSD accuracy. The performance of the classifier on WSD also improves as the number of training sense examples increases.
Extracting key phrases to disambiguate personal names on the web
In CICLing, 2006
Cited by 3 (0 self)
Assume that you are looking for information about a particular person. A search engine returns many pages for that person’s name. Some of these pages may be on other people with the same name. One method to reduce the ambiguity in the query and filter out the irrelevant pages is to add a phrase that uniquely identifies the person we are interested in from his/her namesakes. We propose an unsupervised algorithm that extracts such phrases from the Web. We represent each document by a term-entity model and cluster the documents using a contextual similarity metric. We evaluate the algorithm on a dataset of ambiguous names. Our method outperforms baselines, achieving over 80% accuracy, and significantly reduces the ambiguity in a web search task.
Estimating class priors in domain adaptation for word sense disambiguation
In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, 2006
Cited by 3 (0 self)
Instances of a word drawn from different domains may have different sense priors (the proportions of the different senses of a word). This in turn affects the accuracy of word sense disambiguation (WSD) systems trained and applied on different domains. This paper presents a method to estimate the sense priors of words drawn from a new domain, and highlights the importance of using well calibrated probabilities when performing these estimations. By using well calibrated probabilities, we are able to estimate the sense priors effectively to achieve significant improvements in WSD accuracy.

