Selection and Information: A Class-Based Approach to Lexical Relationships. (1993)

by P S Resnik

Results 1 - 10 of 272

Using information content to evaluate semantic similarity in a taxonomy

by Philip Resnik - In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), 1995
"... philip.resnikfleast.sun.com This paper presents a new measure of semantic similarity in an IS-A taxonomy, based on the notion of information content. Experimental evaluation suggests that the measure performs encouragingly well (a correlation of r = 0.79 with a benchmark set of human similarity judg ..."
Abstract - Cited by 1097 (8 self) - Add to MetaCart
This paper presents a new measure of semantic similarity in an IS-A taxonomy, based on the notion of information content. Experimental evaluation suggests that the measure performs encouragingly well (a correlation of r = 0.79 with a benchmark set of human similarity judgments, with an upper bound of r = 0.90 for human subjects performing the same task), and significantly better than the traditional edge counting approach (r = 0.66).
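Since several entries below cite this measure, a minimal sketch may help. The measure is commonly formulated as sim(c1, c2) = max over shared subsumers c of -log p(c), where p(c) is the probability of encountering an instance of concept c. The toy taxonomy, concept names, and probabilities below are illustrative assumptions, not data from the paper.

```python
import math

# Toy IS-A taxonomy: child -> parent (None marks the root).
# The structure and probabilities below are illustrative assumptions.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "substance": "entity",
    "gasoline": "substance",
}

# p(c): probability of encountering an instance of concept c,
# estimated from corpus frequencies in the original approach.
P = {
    "entity": 1.0,
    "vehicle": 0.2,
    "car": 0.05,
    "bicycle": 0.02,
    "substance": 0.3,
    "gasoline": 0.01,
}

def ancestors(c):
    """Return the set containing c and all of its IS-A ancestors."""
    out = set()
    while c is not None:
        out.add(c)
        c = PARENT[c]
    return out

def resnik_sim(c1, c2):
    """Information-content similarity: max -log p(c) over shared subsumers."""
    shared = ancestors(c1) & ancestors(c2)
    return max(-math.log(P[c]) for c in shared)

print(resnik_sim("car", "bicycle"))   # subsumed by "vehicle": higher score
print(resnik_sim("car", "gasoline"))  # subsumed only by "entity": -log(1.0) = 0
```

Note how this captures the abstract's claim over plain edge counting: similarity depends on how informative (specific) the best shared subsumer is, not on how many links separate the concepts.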

Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language

by Philip Resnik, 1999
"... This article presents a measure of semantic similarityinanis-a taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The a ..."
Abstract - Cited by 609 (9 self) - Add to MetaCart
This article presents a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their effectiveness.

1. Introduction. Evaluating semantic relatedness using network representations is a problem with a long history in artificial intelligence and psychology, dating back to the spreading activation approach of Quillian (1968) and Collins and Loftus (1975). Semantic similarity represents a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar. Rada et al. (Rada, Mili, Bicknell, & Blett...

Citation Context

...t the case, the "credit" for each noun occurrence would be distributed over all concepts for the noun, and the counts normalized across the entire taxonomy to sum to 1. (That is the approach taken in Resnik, 1993a; also see Resnik, 1998b for discussion.) In assigning taxonomic probabilities for purposes of measuring semantic similarity, the present model associates a separate, binomially distributed random va...
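The counting scheme this context describes, distributing each noun occurrence's credit over all concepts containing it and then normalizing so taxonomy probabilities sum to 1, can be sketched as follows. The word-to-concept mapping and counts are illustrative assumptions, and the full scheme would also propagate counts up IS-A links to subsuming concepts, a step omitted here for brevity.

```python
from collections import defaultdict

# Illustrative word -> concepts mapping (an ambiguous word belongs to
# several concepts) and raw corpus counts; both are assumptions.
CONCEPTS_OF = {
    "bank": ["financial_institution", "river_edge"],
    "loan": ["financial_transaction"],
}
WORD_COUNT = {"bank": 100, "loan": 40}

# Distribute each occurrence's credit uniformly over the word's concepts.
credit = defaultdict(float)
for word, count in WORD_COUNT.items():
    concepts = CONCEPTS_OF[word]
    for c in concepts:
        credit[c] += count / len(concepts)

# Normalize across the entire taxonomy so the probabilities sum to 1.
total = sum(credit.values())
p = {c: v / total for c, v in credit.items()}

print(p)  # "bank" splits its 100 occurrences evenly between its two senses
```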

Introduction to the special issue on word sense disambiguation

by Nancy Ide - Computational Linguistics J., 1998
"... ..."
Abstract - Cited by 265 (4 self) - Add to MetaCart
Abstract not found
(Show Context)

Citation Context

...A hierarchy, but other relational links as well. Resnik (1995a) draws on his body of earlier work on WordNet, in which he explores a measure of semantic similarity for words in the WordNet hierarchy (Resnik 1993a, 1993b, 1995a). He computes the shared information content of words, which is a measure of the specificity of the concept that subsumes the words in the WordNet IS-A hierarchy--the more specific the...

Automatic extraction of subcategorization from corpora

by Ted Briscoe, John Carroll - In Proceedings of the 5th ACL Conference on Applied Natural Language Processing, 1997
"... We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verb ..."
Abstract - Cited by 237 (7 self) - Add to MetaCart
We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.
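A dictionary entry of the kind described, relative frequencies of subcategorization classes per verb, could be built from extracted frames roughly as follows. The frame labels and observations are illustrative assumptions, not the paper's actual class inventory.

```python
from collections import Counter, defaultdict

# Illustrative (verb, subcategorization frame) observations, as might be
# extracted from parsed corpus sentences; the frame labels are assumptions.
observations = [
    ("give", "NP_NP"), ("give", "NP_PP"), ("give", "NP_PP"),
    ("believe", "SCOMP"), ("believe", "NP"),
]

# Count frames per verb, then convert the counts to relative frequencies.
frames = defaultdict(Counter)
for verb, frame in observations:
    frames[verb][frame] += 1

subcat_dict = {
    verb: {frame: n / sum(counts.values()) for frame, n in counts.items()}
    for verb, counts in frames.items()
}

print(subcat_dict["give"])  # {'NP_NP': 0.333..., 'NP_PP': 0.666...}
```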

A Probabilistic Model of Lexical and Syntactic Access and Disambiguation

by Daniel Jurafsky - COGNITIVE SCIENCE, 1995
"... The problems of access -- retrieving linguistic structure from some mental grammar -- and disambiguation -- choosing among these structures to correctly parse ambiguous linguistic input -- are fundamental to language understanding. The literature abounds with psychological results on lexical access, ..."
Abstract - Cited by 207 (12 self) - Add to MetaCart
The problems of access -- retrieving linguistic structure from some mental grammar -- and disambiguation -- choosing among these structures to correctly parse ambiguous linguistic input -- are fundamental to language understanding. The literature abounds with psychological results on lexical access, the access of idioms, syntactic rule access, parsing preferences, syntactic disambiguation, and the processing of garden-path sentences. Unfortunately, it has been difficult to combine models which account for these results to build a general, uniform model of access and disambiguation at the lexical, idiomatic, and syntactic levels. For example, psycholinguistic theories of lexical access and idiom access and parsing theories of syntactic rule access have almost no commonality in methodology or coverage of psycholinguistic data. This paper presents a single probabilistic algorithm which models both the access and disambiguation of linguistic knowledge. The algorithm is based on a parallel parser which ranks constructions for access, and interpretations for disambiguation, by their conditional probability. Low-ranked constructions and interpretations are pruned through beam-search; this pruning accounts, among other things, for the garden-path effect. I show that this motivated probabilistic treatment accounts for a wide variety of psycholinguistic results, arguing for a more uniform representation of linguistic knowledge and for the use of probabilistically enriched grammars and interpreters as models of human knowledge of and processing of language.
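The pruning mechanism described, ranking interpretations by conditional probability and discarding those outside a beam, can be sketched as below. The fixed-ratio beam criterion shown is one common variant and an assumption here, not necessarily the paper's exact formulation, and the probabilities are illustrative.

```python
# Candidate interpretations with conditional probabilities (illustrative).
candidates = {
    "main-verb reading": 0.50,
    "reduced-relative reading": 0.02,  # classic garden-path interpretation
    "noun-compound reading": 0.30,
}

BEAM_RATIO = 0.2  # keep candidates within a fixed ratio of the best (assumed)

def beam_prune(cands, ratio):
    """Discard interpretations whose probability falls outside the beam."""
    best = max(cands.values())
    return {c: p for c, p in cands.items() if p >= ratio * best}

kept = beam_prune(candidates, BEAM_RATIO)
print(kept)
# The reduced-relative reading is pruned; if later input turns out to
# require it, the parser must backtrack -- modeling the garden-path effect.
```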

Building a large-scale knowledge base for machine translation

by Kevin Knight, Steve K. Luk - In Proceedings of AAAI, 1994
"... Knowledge-based machine translation (KBMT) systems have achieved excellent results in constrained domains, but have not yet scaled up to newspaper text. The reason is that knowledge resources (lexicons, grammar rules, world models) must be painstakingly handcrafted from scratch. One of the hypothese ..."
Abstract - Cited by 200 (6 self) - Add to MetaCart
Knowledge-based machine translation (KBMT) systems have achieved excellent results in constrained domains, but have not yet scaled up to newspaper text. The reason is that knowledge resources (lexicons, grammar rules, world models) must be painstakingly handcrafted from scratch. One of the hypotheses being tested in the PANGLOSS machine translation project is whether or not these resources can be semi-automatically acquired on a very large scale. This paper focuses on the construction of a large ontology (or knowledge base, or world model) for supporting KBMT. It contains representations for some 70,000 commonly encountered objects, processes, qualities, and relations. The ontology was constructed by merging various online dictionaries, semantic networks, and bilingual resources, through semi-automatic methods. Some of these methods (e.g., conceptual matching of semantic taxonomies) are broadly applicable to problems of importing/exporting knowledge from one KB to another. Other methods (e.g., bilingual matching) allow a knowledge engineer to build up an index to a KB in a second language, such as Spanish or Japanese.

Citation Context

...different taxonomic organizations that can be merged into one lattice structure. Another benefit of merging resources is that it makes subsequent knowledge acquisition easier. (A more elaborate scheme would weight links, as in Resnik 1993.) For example, in designing the Bilingual Match algorithm, we were free to make use of information in both WordNet and LDOCE. Related Work and Future Work Automatic dictionar...

Word sense disambiguation: a survey

by Roberto Navigli - ACM COMPUTING SURVEYS, 2009
"... Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the ..."
Abstract - Cited by 191 (16 self) - Add to MetaCart
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.

Decision Lists For Lexical Ambiguity Resolution: Application to Accent Restoration in Spanish and French

by David Yarowsky, 1994
"... This paper presents a statistical decision procedure for lexical ambiguity resolution. The algorithm exploits both local syntactic patterns and more distant collocational evidence, generating an efficient, effective, and highly perspicuous recipe for resolving a given ambiguity. By identifying and u ..."
Abstract - Cited by 191 (3 self) - Add to MetaCart
This paper presents a statistical decision procedure for lexical ambiguity resolution. The algorithm exploits both local syntactic patterns and more distant collocational evidence, generating an efficient, effective, and highly perspicuous recipe for resolving a given ambiguity. By identifying and utilizing only the single best disambiguating evidence in a target context, the algorithm avoids the problematic complex modeling of statistical dependencies. Although directly applicable to a wide class of ambiguities, the algorithm is described and evaluated in a realistic case study, the problem of restoring missing accents in Spanish and French text. Current accuracy exceeds 99% on the full task, and typically is over 90% for even the most difficult ambiguities.
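The core procedure, ranking all disambiguating evidence by reliability and letting only the single strongest matching piece decide, can be sketched as follows. The log-likelihood ranking matches the general decision-list recipe, but the feature names, counts, and smoothing constant are illustrative assumptions.

```python
import math

# Illustrative evidence counts for a two-way ambiguity (e.g. two possible
# accent patterns): feature -> (count with sense A, count with sense B).
evidence = {
    "collocate:interest": (98, 2),
    "word_to_left:the": (30, 25),
    "suffix:-o": (5, 60),
}

SMOOTH = 0.1  # small constant avoiding division by zero (an assumption)

def llr(a, b):
    """Log-likelihood ratio of sense A vs. sense B for one feature."""
    return math.log((a + SMOOTH) / (b + SMOOTH))

# The decision list: all evidence sorted by absolute log-likelihood ratio.
decision_list = sorted(evidence, key=lambda f: abs(llr(*evidence[f])),
                       reverse=True)

def classify(features):
    """Apply only the single best-ranked piece of matching evidence."""
    for f in decision_list:
        if f in features:
            return "A" if llr(*evidence[f]) > 0 else "B"
    return None  # no evidence matched

print(classify({"word_to_left:the", "collocate:interest"}))  # 'A'
```

Using only the top matching piece of evidence is exactly how the abstract's algorithm "avoids the problematic complex modeling of statistical dependencies": no evidence combination step is ever needed.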

Discovering Conceptual Relations from Text

by Alexander Maedche, Steffen Staab, 2000
"... Non-taxonomic relations between concepts appear as a major building block in common ontology definitions. In fact, their definition consumes much of the time needed for engineering an ontology. We here describe ..."
Abstract - Cited by 189 (19 self) - Add to MetaCart
Non-taxonomic relations between concepts appear as a major building block in common ontology definitions. In fact, their definition consumes much of the time needed for engineering an ontology. We here describe ...

Using the web to obtain frequencies for unseen bigrams

by Frank Keller, Mirella Lapata - COMPUT. LINGUIST., 2003
"... This paper shows that the web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the web by querying a search engine. We evaluate this method by demonstrating: (a) ..."
Abstract - Cited by 171 (2 self) - Add to MetaCart
This paper shows that the web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the web by querying a search engine. We evaluate this method by demonstrating: (a) a high correlation between web frequencies and corpus frequencies; (b) a reliable correlation between web frequencies and plausibility judgments; (c) a reliable correlation between web frequencies and frequencies recreated using class-based smoothing; (d) a good performance of web frequencies in a pseudo-disambiguation task.
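The retrieval step the abstract describes, querying a search engine for a quoted bigram and using the reported hit count as a frequency estimate, can be sketched as below. Here hit_count is a hypothetical stub standing in for a real search-engine query (the paper used commercial engines of its time), and the bigrams and counts are illustrative.

```python
# hit_count is a hypothetical stub: a real implementation would submit the
# quoted phrase to a search engine and parse the reported number of hits.
FAKE_INDEX = {'"hungry cat"': 12000, '"hungry history"': 15}

def hit_count(phrase: str) -> int:
    """Stand-in for a search-engine query returning the page-hit count."""
    return FAKE_INDEX.get(phrase, 0)

def web_frequency(w1: str, w2: str) -> int:
    """Estimate an adjective-noun bigram's frequency via a quoted query."""
    return hit_count(f'"{w1} {w2}"')

# Both bigrams may be unseen in a local corpus, yet web counts still
# separate the plausible combination from the implausible one.
print(web_frequency("hungry", "cat"))      # 12000
print(web_frequency("hungry", "history"))  # 15
```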