WordNet::Similarity -- Measuring the Relatedness of Concepts
, 2004
Cited by 141 (3 self)
WordNet::Similarity is a freely available software package that makes it possible to measure the semantic similarity or relatedness between a pair of concepts (or word senses). It provides six measures of similarity, and three measures of relatedness, all of which are based on the lexical database WordNet. These measures are implemented as Perl modules which take as input two concepts, and return a numeric value that represents the degree to which they are similar or related.
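The "two concepts in, one number out" interface described above can be illustrated with a minimal Python sketch. It is not the package's Perl API: the toy is-a hierarchy, the names `TOY_HYPERNYMS`, `ancestors`, and `path_similarity` are all assumptions made for illustration, and the score used is a simple inverse path length (one over the number of nodes on the shortest is-a path), which is one common way to define a path-based similarity.

```python
# Toy is-a hierarchy (hypothetical; not taken from WordNet): child -> parent.
TOY_HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": None,
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root] in the toy hierarchy."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = TOY_HYPERNYMS[concept]
    return chain


def path_similarity(a, b):
    """Inverse of the number of nodes on the shortest is-a path from a to b."""
    chain_a = ancestors(a)
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for steps_from_a, concept in enumerate(chain_a):
        if concept in steps_from_b:
            # First shared ancestor on a's chain is the lowest common one.
            edges = steps_from_a + steps_from_b[concept]
            return 1.0 / (edges + 1)
    return 0.0  # no shared ancestor


if __name__ == "__main__":
    for pair in [("dog", "dog"), ("dog", "wolf"), ("dog", "cat")]:
        print(pair, path_similarity(*pair))
```

In this toy hierarchy, concepts that share a nearby ancestor ("dog" and "wolf") score higher than concepts that only meet further up ("dog" and "cat").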
An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet
- In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics
, 2002
Cited by 73 (2 self)
This is to certify that I have examined this copy of master’s thesis by
Maximizing Semantic Relatedness to Perform Word Sense Disambiguation
, 2003
Cited by 43 (0 self)
This article presents a method of word sense disambiguation that assigns a target word the sense that is most related to the senses of its neighboring words. We explore the use of measures of similarity and relatedness that are based on finding paths in a concept network, information content derived from a large corpus, and word sense glosses. We observe that measures of relatedness are useful sources of information for disambiguation, and in particular we find that two gloss based measures that we have developed are particularly flexible and effective measures for word sense disambiguation.
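The idea of picking the sense most related to the senses of neighboring words can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the sense inventory, glosses, stopword list, and the names `relatedness` and `disambiguate` are hypothetical, and relatedness is approximated here by counting shared gloss words.

```python
# Hypothetical sense inventory and glosses, invented for illustration only.
TOY_GLOSSES = {
    "bank#1": "financial institution that accepts deposits and lends money",
    "bank#2": "sloping land beside a river or other body of water",
    "deposit#1": "money placed in a financial account",
    "deposit#2": "layer of material left by water or a river",
    "loan#1": "money that a financial institution lends",
}
TOY_SENSES = {
    "bank": ["bank#1", "bank#2"],
    "deposit": ["deposit#1", "deposit#2"],
    "loan": ["loan#1"],
}
STOPWORDS = {"a", "that", "and", "or", "of", "by", "in", "the", "other"}


def relatedness(sense_a, sense_b):
    """Toy relatedness: number of content words shared by the two glosses."""
    words_a = set(TOY_GLOSSES[sense_a].split()) - STOPWORDS
    words_b = set(TOY_GLOSSES[sense_b].split()) - STOPWORDS
    return len(words_a & words_b)


def disambiguate(target, neighbors):
    """Pick the target sense with the highest total relatedness to the neighbors.

    For each neighbor, only its best-matching sense contributes to the score.
    """
    def score(sense):
        return sum(
            max(relatedness(sense, n_sense) for n_sense in TOY_SENSES[n])
            for n in neighbors
        )
    return max(TOY_SENSES[target], key=score)


if __name__ == "__main__":
    print(disambiguate("bank", ["deposit", "loan"]))
```

With the neighbors "deposit" and "loan", the financial sense of "bank" accumulates more gloss overlap than the river sense, so it is selected. Any other relatedness function with the same two-argument shape could be substituted for the overlap count.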
Corpus-based and knowledge-based measures of text semantic similarity
- In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1
, 2006
Cited by 38 (1 self)
This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g. text classification, information retrieval) or individual words (e.g. synonymy tests). Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g. abstracts of scientific documents, image captions, product descriptions), in this paper we focus on measuring the semantic similarity of short texts. Through experiments performed on a paraphrase data set, we show that the semantic similarity method outperforms methods based on simple lexical matching, resulting in up to 13% error rate reduction with respect to the traditional vector-based similarity metric.
Co-occurrence retrieval: A flexible framework for lexical distributional similarity
- Computational Linguistics
, 2005
Cited by 28 (0 self)
Techniques that exploit knowledge of distributional similarity between words have been proposed in many areas of Natural Language Processing. For example, in language modeling, the sparse data problem can be alleviated by estimating the probabilities of unseen co-occurrences of events from the probabilities of seen co-occurrences of similar events. In other applications, distributional similarity is taken to be an approximation to semantic similarity. However, due to the wide range of potential applications and the lack of a strict definition of the concept of distributional similarity, many methods of calculating distributional similarity have been proposed or adopted. In this work, a flexible, parameterized framework for calculating distributional similarity is proposed. Within this framework, the problem of finding distributionally similar words is cast as one of co-occurrence retrieval (CR) for which precision and recall can be measured by analogy with the way they are measured in document retrieval. As will be shown, a number of popular existing measures of distributional similarity are simulated with parameter settings within the CR framework. In this article, the CR framework is then used to systematically investigate three fundamental questions concerning distributional similarity. First, is the relationship of lexical similarity necessarily symmetric, or are there advantages to be gained from considering it as an asymmetric relationship? Second, are some co-occurrences inherently more salient than others in the calculation of distributional similarity? Third, is it necessary to consider the difference in the extent to which each word occurs in each co-occurrence type? Two application-based tasks are used for evaluation: automatic thesaurus generation and pseudo-disambiguation. 
It is possible to achieve significantly better results on both these tasks by varying the parameters within the CR framework rather than using other existing distributional similarity measures; it will also be shown that any single unparameterized measure is unlikely to be able to do better on both tasks. This is due to an inherent asymmetry in lexical substitutability and therefore also in lexical distributional similarity.
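The analogy with document retrieval, and the asymmetry it exposes, can be seen in a small sketch. This is only the simplest unweighted reading of the idea, assumed here for illustration: one word's co-occurrence types are treated as "retrieved" and the other's as "relevant". The feature sets and the function name `cr_precision_recall` are hypothetical, and the paper's parameterized weighting schemes are not reproduced.

```python
def cr_precision_recall(features_a, features_b):
    """Unweighted co-occurrence retrieval scores for word A against word B.

    precision: share of A's co-occurrence types that B also has.
    recall:    share of B's co-occurrence types that A also has.
    """
    shared = features_a & features_b
    precision = len(shared) / len(features_a) if features_a else 0.0
    recall = len(shared) / len(features_b) if features_b else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical co-occurrence types for a specific and a general word.
    dog = {"bark", "pet", "tail", "walk"}
    animal = {"pet", "tail", "walk", "eat", "wild", "zoo", "feed", "hunt"}

    print("dog vs animal:", cr_precision_recall(dog, animal))
    print("animal vs dog:", cr_precision_recall(animal, dog))
```

Swapping the two words swaps precision and recall, so any score that weights them unequally is asymmetric in the two words, which is the kind of asymmetry the abstract refers to.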
Automatic interpretation of noun compounds using WordNet similarity
- In Proceedings of the 2nd International Joint Conference on Natural Language Processing, Jeju Island, South Korea, 11–13
, 2005
Cited by 27 (7 self)
The paper introduces a method for interpreting novel noun compounds with semantic relations. The method is built around word similarity with pretagged noun compounds, based on WordNet::Similarity. Over 1,088 training instances and 1,081 test instances from the Wall Street Journal in the Penn Treebank, the proposed method was able to correctly classify 53.3% of the test noun compounds. We also investigated the relative contribution of the modifier and the head noun in noun compounds of different semantic types.
Distributional measures as proxies for semantic relatedness
- In submission
, 2005
Cited by 16 (2 self)
The automatic ranking of word pairs as per their semantic relatedness and ability to mimic human notions of semantic relatedness has widespread applications. Measures that rely on raw data (distributional measures) and those that use knowledge-rich ontologies both exist. Although extensive studies have been performed to compare ontological measures with human judgment, the distributional measures have primarily been evaluated by indirect means. This paper is a detailed study of some of the major distributional measures; it lists their respective merits and limitations. New measures that overcome these drawbacks, that are more in line with the human notions of semantic relatedness, are suggested. The paper concludes with an exhaustive comparison of the distributional and ontology-based measures. Along the way, significant research problems are identified. Work on these problems may lead to a better understanding of how semantic relatedness is to be measured.
Element Level Semantic Matching
- In Proceedings of the Meaning Coordination and Negotiation Workshop at ISWC
, 2004
Cited by 14 (10 self)
We think of Match as an operator which takes two graph-like structures and produces a mapping between semantically related nodes. The matching process is essentially divided into two steps: element level and structure level. Element level matchers consider only labels of nodes, while structure level matchers start from this information to consider the full graph. In this paper we present various element level semantic matchers, and discuss their implementation within the S-Match system. The main novelty of our approach is in that element level semantic matchers return semantic relations (=, ⊒, ⊑, ⊥) between concepts rather than similarity coefficients between labels in the [0, 1] range.
Semantic Similarity Methods in WordNet and their Application to Information Retrieval on the Web
- In: 7th ACM Intern. Workshop on Web Information and Data Management (WIDM 2005)
, 2005
Cited by 14 (5 self)
Semantic Similarity relates to computing the similarity between concepts which are not lexicographically similar. We investigate approaches to computing semantic similarity by mapping terms (concepts) to an ontology and by examining their relationships in that ontology. Some of the most popular semantic similarity methods are implemented and evaluated using WordNet as the underlying reference ontology. Building upon the idea of semantic similarity, a novel information retrieval method is also proposed. This method is capable of detecting similarities between documents containing semantically similar but not necessarily lexicographically similar terms. The proposed method has been evaluated in retrieval of images and documents on the Web. The experimental results demonstrated very promising performance improvements over state-of-the-art information retrieval methods.
Using WordNet-based context vectors to estimate the semantic relatedness of concepts
- In: Proceedings of the EACL
, 2006
Cited by 12 (0 self)
In this paper, we introduce a WordNet-based measure of semantic relatedness by combining the structure and content of WordNet with co-occurrence information derived from raw text. We use the co-occurrence information along with the WordNet definitions to build gloss vectors corresponding to each concept in WordNet. Numeric scores of relatedness are assigned to a pair of concepts by measuring the cosine of the angle between their respective gloss vectors. We show that this measure compares favorably to other measures with respect to human judgments of semantic relatedness, and that it performs well when used in a word sense disambiguation algorithm that relies on semantic relatedness. This measure is flexible in that it can make comparisons between any two concepts without regard to their part of speech. In addition, it can be adapted to different domains, since any plain text corpus can be used to derive the co-occurrence information.
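The gloss-vector construction and cosine scoring described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's implementation: the co-occurrence counts in `TOY_COOC` are invented, the glosses are toy strings rather than WordNet definitions, and a gloss vector is taken to be the plain sum of the co-occurrence vectors of the words in the gloss.

```python
import math
from collections import Counter

# Hypothetical co-occurrence counts (word -> context word -> count).
TOY_COOC = {
    "money": Counter({"bank": 3, "pay": 2}),
    "loan": Counter({"bank": 2, "pay": 3}),
    "river": Counter({"water": 4, "flow": 2}),
    "stream": Counter({"water": 3, "flow": 3}),
}


def gloss_vector(gloss):
    """Sum the co-occurrence vectors of the gloss words that have one."""
    vec = Counter()
    for word in gloss.split():
        vec.update(TOY_COOC.get(word, {}))
    return vec


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    finance = gloss_vector("money loan")
    print(cosine(finance, gloss_vector("money")))
    print(cosine(finance, gloss_vector("river stream")))
```

Because the vectors are built from text rather than from the hierarchy, nothing in the computation depends on part of speech, and replacing `TOY_COOC` with counts from another corpus adapts the scores to that domain.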

