Disambiguating nouns, verbs, and adjectives using automatically acquired selectional preferences (2003)

by D McCarthy, J Carroll
Venue: Language and Cognitive Processes

Results 1 - 10 of 22

A structured vector space model for word meaning in context

by Katrin Erk, Sebastian Padó, 2008
Cited by 30 (5 self)
We address the task of computing vector space representations for the meaning of word occurrences, which can vary widely according to context. This task is a crucial step towards a robust, vector-based compositional account of sentence meaning. We argue that existing models for this task do not take syntactic structure sufficiently into account. We present a novel structured vector space model that addresses these issues by incorporating the selectional preferences for words’ argument positions. This makes it possible to integrate syntax into the computation of word meaning in context. In addition, the model performs at and above the state of the art for modeling the contextual adequacy of paraphrases.

Improving automatic query classification via semi-supervised learning

by Steven M. Beitzel, Eric C. Jensen, Ophir Frieder - In The Fifth IEEE International Conference on Data Mining, 2005
Cited by 30 (4 self)
Accurate topical classification of user queries allows for increased effectiveness and efficiency in general-purpose web search systems. Such classification becomes critical if the system is to return results not just from a general web collection but from topic-specific back-end databases as well. Maintaining sufficient classification recall is very difficult as web queries are typically short, yielding few features per query. This feature sparseness coupled with the high query volumes typical for a large-scale search service makes manual and supervised learning approaches alone insufficient. We use an application of computational linguistics to develop an approach for mining the vast amount of unlabeled data in web query logs to improve automatic topical web query classification. We show that our approach in combination with manual matching and supervised learning allows us to classify a substantially larger proportion of queries than any single technique. We examine the performance of each approach on a real web query stream and show that our combined method accurately classifies 46% of queries, outperforming the recall of the best single approach by nearly 20%, with a 7% improvement in overall effectiveness.

Word sense disambiguation: a survey

by Roberto Navigli - ACM Computing Surveys, 2009
Cited by 28 (9 self)
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.

A Simple, Similarity-based Model for Selectional Preferences

by Katrin Erk, 2007
Cited by 23 (1 self)
We propose a new, simple model for the automatic induction of selectional preferences, using corpus-based semantic similarity metrics. Focusing on the task of semantic role labeling, we compute selectional preferences for semantic roles. In evaluations the similarity-based model shows lower error rates than both Resnik’s WordNet-based model and the EM-based clustering model, but has coverage problems.

A Large Subcategorization Lexicon for Natural Language Processing Applications

by Anna Korhonen, Yuval Krymolowski, Ted Briscoe - In Proceedings of LREC, 2006
Cited by 21 (9 self)
We introduce a large computational subcategorization lexicon which includes subcategorization frame (SCF) and frequency information for 6,397 English verbs. This extensive lexicon was acquired automatically from five corpora and the Web using the current version of the comprehensive subcategorization acquisition system of Briscoe and Carroll (1997). The lexicon is provided freely for research use, along with a script which can be used to filter and build sub-lexicons suited for different natural language processing (NLP) purposes. Documentation is also provided which explains each sub-lexicon option and evaluates its accuracy.

Automatic Classification of Web Queries Using Very Large Unlabeled Query Logs

by Steven M. Beitzel, Eric C. Jensen, David D. Lewis (David D. Lewis Consulting)
Cited by 7 (0 self)
Accurate topical classification of user queries allows for increased effectiveness and efficiency in general-purpose Web search systems. Such classification becomes critical if the system must route queries to a subset of topic-specific and resource-constrained back-end databases. Successful query classification poses a challenging problem, as Web queries are short, thus providing few features. This feature sparseness, coupled with the constantly changing distribution and vocabulary of queries, hinders traditional text classification. We attack this problem by combining multiple classifiers, including exact lookup and partial matching in databases of manually classified frequent queries, linear models trained by supervised learning, and a novel approach based on mining selectional preferences from a large unlabeled query log. Our approach classifies queries without using external sources of information, such as online Web directories or the contents of retrieved pages, making it viable for use in demanding operational environments, such as large-scale Web search services. We evaluate our approach using a large sample of queries from an operational Web search engine and show that our combined method increases recall by nearly 40% over the best single method while maintaining adequate precision. Additionally, we compare our results to those from the 2005 KDD Cup and find that we perform competitively despite our operational restrictions. This suggests it is possible to topically classify a significant portion of the query stream without requiring external sources of information, allowing for deployment in operationally restricted environments.

Knowledge-rich Word Sense Disambiguation Rivaling Supervised Systems

by Simone Paolo Ponzetto, Roberto Navigli (Sapienza Università di Roma)
Cited by 6 (4 self)
One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguation algorithms compete with state-of-the-art supervised WSD systems in a coarse-grained all-words setting and outperform them on gold-standard domain-specific datasets.

Hunting Elusive Metaphors Using Lexical Resources

by Saisuresh Krishnakumaran
Cited by 4 (0 self)
In this paper we propose algorithms to automatically classify sentences into metaphoric or normal usages. Our algorithms need only WordNet and bigram counts, and do not require training. We present empirical results on a test set derived from the Master Metaphor List. We also discuss issues that make classification of metaphors a tough problem in general.

Verb Sense Disambiguation Using Selectional Preferences Extracted with a State-of-the-art Semantic Role Labeler

by Patrick Ye, Timothy Baldwin
Cited by 1 (0 self)
This paper investigates whether multi-semantic-role (MSR) based selectional preferences can be used to improve the performance of supervised verb sense disambiguation. Unlike conventional selectional preferences, which are extracted from parse trees based on hand-crafted rules and only include the direct subject or the direct object of the verbs, the MSR based selectional preferences presented in this paper are extracted from the output of a state-of-the-art semantic role labeler and incorporate a much richer set of semantic roles. The performance of the MSR based selectional preferences is evaluated on two distinct datasets: the verbs from the lexical sample task of SENSEVAL-2, and the verbs from a movie script corpus. We show that the MSR based features can indeed improve the performance of verb sense disambiguation.

Features and Categories Design for the English-Russian Transfer Model

by Elena Kozerenko
Cited by 1 (1 self)
The paper focuses on the role of features in the implementation of transfer-based machine translation systems. The semantic content of syntactic structures is established via the contrastive study of the English and Russian language systems and the analysis of parallel texts. The notion of cognitive transfer is employed, which means that a language unit or structure can be singled out for transfer when there exists at least one language unit or structure with a similar meaning in the target language. The approach taken is aimed at providing computational tractability and portability of linguistic presentation solutions for various language engineering purposes.