• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Graph connectivity measures for unsupervised word sense disambiguation (0)

by R Navigli, M Lapata
Venue:Proceedings of IJCAI
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 14
Next 10 →

Wikify!: linking documents to encyclopedic knowledge

by Rada Mihalcea, Andras Csomai - In CIKM ’07: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , 2007
"... This paper introduces the use of Wikipedia as a resource for automatic keyword extraction and word sense disambiguation, and shows how this online encyclopedia can be used to achieve state-of-the-art results on both these tasks. The paper also shows how the two methods can be combined into a system ..."
Abstract - Cited by 57 (3 self) - Add to MetaCart
This paper introduces the use of Wikipedia as a resource for automatic keyword extraction and word sense disambiguation, and shows how this online encyclopedia can be used to achieve state-of-the-art results on both these tasks. The paper also shows how the two methods can be combined into a system able to automatically enrich a text with links to encyclopedic knowledge. Given an input document, the system identifies the important concepts in the text and automatically links these concepts to the corresponding Wikipedia pages. Evaluations of the system show that the automatic annotations are reliable and hardly distinguishable from manual annotations. providing the users a quick way of accessing additional information. Wikipedia contributors perform these annotations by hand following a Wikipedia“manual of style,”which gives guidelines concerning the selection of important concepts in a text, as well as the assignment of links to appropriate related articles. For instance, Figure 1 shows an example of a Wikipedia page, including the definition for one of the meanings of the word “plant.”

Word sense disambiguation: a survey

by Roberto Navigli - ACM COMPUTING SURVEYS , 2009
"... Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the ..."
Abstract - Cited by 28 (9 self) - Add to MetaCart
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.

Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs

by Zornitsa Kozareva, Ellen Riloff, Eduard Hovy - PROCEEDINGS OF ACL-08 , 2008
"... We present a novel approach to weakly supervised semantic class learning from the web, using a single powerful hyponym pattern combined with graph structures, which capture two properties associated with pattern-based extractions: popularity and productivity. Intuitively, a candidate is popular if i ..."
Abstract - Cited by 25 (7 self) - Add to MetaCart
We present a novel approach to weakly supervised semantic class learning from the web, using a single powerful hyponym pattern combined with graph structures, which capture two properties associated with pattern-based extractions: popularity and productivity. Intuitively, a candidate is popular if it was discovered many times by other instances in the hyponym pattern. A candidate is productive if it frequently leads to the discovery of other instances. Together, these two measures capture not only frequency of occurrence, but also cross-checking that the candidate occurs both near the class name and near other class members. We developed two algorithms that begin with just a class name and one seed instance and then automatically generate a ranked list of new class instances. We conducted experiments on four semantic classes and consistently achieved high accuracies.

An Experimental Study of Graph Connectivity for Unsupervised Word Sense Disambiguation

by Roberto Navigli, Mirella Lapata - IEEE TPAMI
"... Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has been a long-standing research objective for natural language processing. In this paper we are concerned with graph-based algorithms for large-scale WSD. Under this framework, finding the ..."
Abstract - Cited by 14 (9 self) - Add to MetaCart
Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has been a long-standing research objective for natural language processing. In this paper we are concerned with graph-based algorithms for large-scale WSD. Under this framework, finding the right sense for a given word amounts to identifying the most “important ” node among the set of graph nodes representing its senses. We introduce a graph-based WSD algorithm which has few parameters and does not require sense annotated data for training. Using this algorithm, we investigate several measures of graph connectivity with the aim of identifying those best suited for WSD. We also examine how the chosen lexicon and its connectivity influences WSD performance. We report results on standard data sets, and show that our graph-based approach performs comparably to the state of the art.

Knowledge-Based WSD on Specific Domains: Performing Better than Generic Supervised WSD

by Eneko Agirre, Oier Lopez De Lacalle, Aitor Soroa
"... This paper explores the application of knowledgebased Word Sense Disambiguation systems to specific domains, based on our state-of-the-art graphbased WSD system that uses the information in WordNet. Evaluation was performed over a publicly available domain-specific dataset of 41 words related to Spo ..."
Abstract - Cited by 5 (1 self) - Add to MetaCart
This paper explores the application of knowledgebased Word Sense Disambiguation systems to specific domains, based on our state-of-the-art graphbased WSD system that uses the information in WordNet. Evaluation was performed over a publicly available domain-specific dataset of 41 words related to Sports and Finance, comprising examples drawn from three corpora: one balanced corpus (BNC), and two domain-specific corpora (news related to Sports and Finance). The results show that in all three corpora our knowledge-based WSD algorithm improves over previous results, and also over two state-of-the-art supervised WSD systems trained on SemCor, the largest publicly available annotated corpus. We also show that using related words as context, instead of the actual occurrence contexts, yields better results on the domain datasets, but not on the general one. Interestingly, the results are higher for domain-specific corpus than for the general corpus, raising prospects for improving current WSD systems when applied to specific domains. 1

Graph connectivity measures for unsupervised parameter tuning of graph-based sense induction systems

by Ioannis Korkontzelos, Ioannis Klapaftis, Suresh Man , 2009
"... Word Sense Induction (WSI) is the task of identifying the different senses (uses) of a target word in a given text. This paper focuses on the unsupervised estimation of the free parameters of a graph-based WSI method, and explores the use of eight Graph Connectivity Measures (GCM) that assess the de ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
Word Sense Induction (WSI) is the task of identifying the different senses (uses) of a target word in a given text. This paper focuses on the unsupervised estimation of the free parameters of a graph-based WSI method, and explores the use of eight Graph Connectivity Measures (GCM) that assess the degree of connectivity in a graph. Given a target word and a set of parameters, GCM evaluate the connectivity of the produced clusters, which correspond to subgraphs of the initial (unclustered) graph. Each parameter setting is assigned a score according to one of the GCM and the highest scoring setting is then selected. Our evaluation on the nouns of SemEval-2007 WSI task (SWSI) shows that: (1) all GCM estimate a set of parameters which significantly outperform the worst performing parameter setting in both SWSI evaluation schemes, (2) all GCM estimate a set of parameters which outperform the Most Frequent Sense (MFS) baseline by a statistically significant amount in the supervised evaluation scheme, and (3) two of the measures estimate a set of parameters that performs closely to a set of parameters estimated in supervised manner. 1

Unsupervised All-words Word Sense Disambiguation with Grammatical Dependencies

by unknown authors
"... We present experiments that analyze the necessity of using a highly interconnected word/sense graph for unsupervised allwords word sense disambiguation. We show that allowing only grammatically related words to influence each other’s senses leads to disambiguation results on a par with the best grap ..."
Abstract - Add to MetaCart
We present experiments that analyze the necessity of using a highly interconnected word/sense graph for unsupervised allwords word sense disambiguation. We show that allowing only grammatically related words to influence each other’s senses leads to disambiguation results on a par with the best graph-based systems, while greatly reducing the computation load. We also compare two methods for computing selectional preferences between the senses of every two grammatically related words: one using a Lesk-based measure on WordNet, the other using dependency relations from the British National Corpus. The best configuration uses the syntactically-constrained graph, selectional preferences computed from the corpus and a PageRank tie-breaking algorithm. We especially note good performance when disambiguating verbs with grammatically constrained links. 1

A Knowledge-Based Information Extraction Prototype for Data-Rich Documents in the . . .

by Sergio Gonzalo Jimenez Vargas , 2008
"... ..."
Abstract - Add to MetaCart
Abstract not found

An Extended Analysis of a Method of . . .

by Varada Kolhatkar , 2009
"... ..."
Abstract - Add to MetaCart
Abstract not found

An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation

by George Tsatsaronis, Iraklis Varlamis
"... Abstract. Recent research works on unsupervised word sense disambiguation report an increase in performance, which reduces their handicap from the respective supervised approaches for the same task. Among the latest state of the art methods, those that use semantic graphs reported the best results. ..."
Abstract - Add to MetaCart
Abstract. Recent research works on unsupervised word sense disambiguation report an increase in performance, which reduces their handicap from the respective supervised approaches for the same task. Among the latest state of the art methods, those that use semantic graphs reported the best results. Such methods create a graph comprising the words to be disambiguated and their corresponding candidate senses. The graph is expanded by adding semantic edges and nodes from a thesaurus. The selection of the most appropriate sense per word occurrence is then made through the use of graph processing algorithms that offer a degree of importance among the graph vertices. In this paper we experimentally investigate the performance of such methods. We additionally evaluate a new method, which is based on a recently introduced algorithm for computing similarity between graph vertices, P-Rank. We evaluate the performance of all alternatives in two benchmark data sets, Senseval 2 and 3, using WordNet. The current study shows the differences in the performance of each method, when applied on the same semantic graph representation, and analyzes the pros and cons of each method for each part of speech separately. Furthermore, it analyzes the levels of inter-agreement in the sense selection level, giving further insight on how these methods could be employed in an unsupervised ensemble for word sense disambiguation. 1
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University