MindNet: acquiring and structuring semantic information from text
1998
Cited by 78 (2 self)
As a lexical knowledge base constructed automatically from the definitions and example sentences in two machine-readable dictionaries (MRDs), MindNet embodies several features that distinguish it from prior work with MRDs. It is, however, more than this static resource alone. MindNet represents a general methodology for acquiring, structuring, accessing, and exploiting semantic information from natural language text. This paper provides an overview of the distinguishing characteristics of MindNet, the steps involved in its creation, and its extension beyond dictionary text.
WordNet 2 - A Morphologically and Semantically Enhanced Resource
University of Maryland, 1999
Cited by 55 (3 self)
This paper presents an on-going project intended to enhance WordNet morphologically and semantically. The motivation for this work stems from the current limitations of WordNet when used as a linguistic knowledge base. We envision a software tool that automatically parses the conceptual defining glosses, attributing part-of-speech tags and phrasal brackets. The nouns, verbs, adjectives and adverbs from every definition are then disambiguated and linked to the corresponding synsets. This increases the connectivity between synsets, allowing the retrieval of topically related concepts. Furthermore, the tool transforms the glosses, first into logical forms, and then into semantic forms. Using derivational morphology, new links are added between the synsets.
1 Motivation
WordNet has already been recognized as a valuable resource in the human language technology and knowledge processing communities. Its applicability has been cited in more than 200 papers and systems have been...
Word sense disambiguation: a survey
ACM Computing Surveys, 2009
Cited by 28 (9 self)
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.
Making Sense About Sense
Word Sense Disambiguation: Algorithms and Applications, 2006
Cited by 22 (3 self)
We first reconsider the role of lexicographers in word-sense disambiguation as a computational task, as providers of both legacy material (dictionaries) and special test material for competitions like SENSEVAL. We suggest that the standard fine-grained division of senses and (larger) homographs by a lexicographer for use by a human reader may not be an appropriate goal for the computational WSD task. We argue that the level of sense-discrimination that NLP needs corresponds roughly to homographs, though we discuss psycholinguistic evidence that there are broad sense divisions with some etymological derivation (i.e. non-homographic) that are as distinct for humans as homographic ones and they may be part of the broad class of sense divisions we seek to identify here. Fifteen years or more of WSD research has shown that it is this kind of discrimination that existing WSD programs are able to capture at the ~95% success level, whereas the full lexicographically derived division of senses seems to remain too hard for both programs and human discriminators. We link this discussion to the observation that major NLP tasks like MT and IR seem not to need independent WSD modules of the sort produced in the research field, even though they are undoubtedly doing WSD by other means. Our conclusion is that WSD should continue to focus on these broad discriminations, at which it can do very well, thereby possibly offering the close-to-100% success that IR needs (especially search-engine, rather than classic long-query, IR), and assume that this is what most NLP requires, with the possible exception of very fine questions of target word choice in MT. This proposal can be seen as reorienting WSD to what it can actually perform at the standard success levels, but we argue that this, rather...
Foreground and background lexicons and word sense disambiguation for information extraction
In Proc. Workshop on Lexicon Driven Information Extraction, 1997
Cited by 9 (4 self)
In recent years, lexicon acquisition from machine-readable dictionaries and corpora has been a dynamic field of research. However, it has not always been evident how lexical information so acquired can be used, or how it relates to more structured meaning representations. In this paper I look at this issue in relation to one particular NLP task, Information Extraction
MRDs, Standards and How To Do Lexical Engineering
1995
Cited by 7 (0 self)
How can you obtain a satisfactory lexicon for a modern NLP application? Ten years ago the answer you might have received was to wait for the large-scale lexicons soon to be derived from machine-readable dictionaries (MRDs). Five years ago the advice would have proclaimed the virtue of standards; these were currently being agreed, and once that was done, easy-to-use lexical resources complying with them would become available. Today, earlier optimism on these fronts is muted. In this position paper, we explore the history and background and present an alternative methodology for effective lexical engineering in the medium term.
Also published in Proceedings, Second Language Engineering Convention, pp. 125-132. London, October 1995. This research was supported by the EPSRC Grant K18931, SEAL: Structural Enhancement of Automatically-acquired Lexicons.
Information Technology Research Institute Technical Report Series ITRI, Univ. of Brighton, Lewes Road, Brighton BN2 4AT, UK T...
Extracting lexical reference rules from Wikipedia
2009
Cited by 7 (5 self)
This paper describes the extraction from Wikipedia of lexical reference rules, identifying references to term meanings triggered by other terms. We present extraction methods geared to cover the broad range of the lexical reference relation and analyze them extensively. Most extraction methods yield high precision levels, and our rule-base is shown to perform better than other automatically constructed baselines in a couple of lexical expansion and matching tasks. Our rule-base yields comparable performance to WordNet while providing largely complementary information.
Viewpoint-Based Measurement of Semantic Similarity between Words
Journal of Information Processing Society of Japan, Vol. 35, No. 3, 1995
Cited by 4 (0 self)
A method of measuring semantic similarity between words using a knowledge base constructed automatically from machine-readable dictionaries is proposed. The method takes into consideration the fact that similarity changes depending on situation or context, which we call `viewpoint'. Evaluation shows the proposed method, although based on a simply structured knowledge base, is superior to other currently available methods.
Introduction
Measuring semantic similarity between words is important for natural language processing when searching text, performing analogical and case-based reasoning, designing flexible human interfaces to databases, and other tasks. We are exploring methods for measuring the similarity between large numbers of daily-use words with an aim toward general applications. In measuring similarity, however, we consider that similarity changes depending on situation or context, which we call `viewpoint'. For example, `horse' is more similar to `pig' than `car' from...
UNITEX-PB, a set of flexible language resources for Brazilian Portuguese
Cited by 2 (1 self)
This work documents the project and development of various computational linguistic resources that support the Brazilian Portuguese language according to the formal methodology used by the corpus processing system called UNITEX. The delivered resources include computational lexicons, libraries to access compressed lexicons, and additional tools to validate those resources.
The CQC Algorithm: Cycling in Graphs to Semantically Enrich and Enhance a Bilingual Dictionary
Cited by 2 (1 self)
Bilingual machine-readable dictionaries are knowledge resources useful in many automatic tasks. However, compared to monolingual computational lexicons like WordNet, bilingual dictionaries typically provide a lower amount of structured information such as lexical and semantic relations, and often do not cover the entire range of possible translations for a word of interest. In this paper we present Cycles and Quasi-Cycles (CQC), a novel algorithm for the automated disambiguation of ambiguous translations in the lexical entries of a bilingual machine-readable dictionary. The dictionary is represented as a graph, and cyclic patterns are sought in this graph to assign an appropriate sense tag to each translation in a lexical entry. Further, we use the algorithm’s output to improve the quality of the dictionary itself, by suggesting accurate solutions to structural problems such as misalignments, partial alignments and missing entries. Finally, we successfully apply CQC to the task of synonym extraction.
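The graph-and-cycle idea in the CQC abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: the quasi-cycle patterns and the paper's actual scoring are not reproduced, and the function names (`find_cycles`, `rank_candidates`), the `max_len` bound, and the toy English/Italian entries are all assumptions made up for illustration. The sketch only shows how a candidate sense that lies on a short cycle back to the source sense could be preferred over one that does not.

```python
def find_cycles(graph, start, max_len):
    """Return simple cycles through `start` that use at most `max_len` edges.

    `graph` maps a sense label to the sense labels it points to
    (hypothetical representation, not the paper's data model).
    """
    cycles = []

    def dfs(node, path):
        for nxt in graph.get(node, ()):
            if nxt == start and len(path) >= 2:
                # Closed a cycle of len(path) edges back to the start sense.
                cycles.append(path + [start])
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return cycles


def rank_candidates(graph, source, candidates, max_len=4):
    """Order candidate senses by how many cycles leave `source` through them."""
    scores = {c: 0 for c in candidates}
    for cycle in find_cycles(graph, source, max_len):
        first_hop = cycle[1]
        if first_hop in scores:
            scores[first_hop] += 1
    # sorted() is stable, so candidates with equal scores keep their input order.
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


# Invented toy dictionary: the entry for en:bank#1 lists the ambiguous
# translation "banco", so it points to both senses of that word.
graph = {
    "en:bank#1": ["it:banco#1", "it:banco#2"],
    "it:banco#1": ["en:bank#1"],    # financial sense translates back
    "it:banco#2": ["en:bench#1"],   # furniture sense does not
    "en:bench#1": ["it:banco#2", "it:panca#1"],
    "it:panca#1": ["en:bench#1"],
}

print(rank_candidates(graph, "en:bank#1", ["it:banco#2", "it:banco#1"]))
# prints ['it:banco#1', 'it:banco#2']
```

The depth bound keeps the search cheap on this toy graph; a real dictionary graph would need a more careful traversal and a scoring scheme such as the one described in the paper itself.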

