Query expansion using lexical-semantic relations
In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1994
Cited by 395 (1 self)
Applications such as office automation, news filtering, help facilities in complex systems, and the like require the ability to retrieve documents from full-text databases where vocabulary problems can be particularly severe. Experiments performed on small collections with single-domain thesauri suggest that expanding query vectors with words that are lexically related to the original query words can ameliorate some of the problems of mismatched vocabularies. This paper examines the utility of lexical query expansion in the large, diverse TREC collection. Concepts are represented by WordNet synonym sets and are expanded by following the typed links included in WordNet. Experimental results show this query expansion technique makes little difference in retrieval effectiveness if the original queries are relatively complete descriptions of the information being sought, even when the concepts to be expanded are selected by hand. Less well developed queries can be significantly improved by expansion of hand-chosen concepts. However, an automatic procedure that can approximate the set of hand-picked synonym sets has yet to be devised, and expanding by the synonym sets that are automatically generated can degrade retrieval performance.
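The expansion step described in this abstract (concepts as synonym sets, expanded by following typed links) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the `SYNSETS` table is a hypothetical toy lexicon standing in for WordNet, and the function name, synset ids, and link names are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical toy lexicon standing in for WordNet (not real WordNet data).
# Each synset id maps to its member words and to typed links to other synsets.
SYNSETS: Dict[str, Dict] = {
    "car.n.01": {
        "words": ["car", "automobile"],
        "links": {"hyponym": ["taxi.n.01"], "hypernym": ["vehicle.n.01"]},
    },
    "taxi.n.01": {"words": ["taxi", "cab"], "links": {}},
    "vehicle.n.01": {"words": ["vehicle"], "links": {}},
}


def expand_query(terms: List[str], chosen_synsets: List[str],
                 link_types: List[str]) -> List[str]:
    """Append words from each chosen synset and from synsets one link away."""
    expanded = list(terms)
    seen = set(terms)
    for synset_id in chosen_synsets:
        # Start with the synset itself, then follow only the requested link types.
        targets = [synset_id]
        for link in link_types:
            targets.extend(SYNSETS[synset_id]["links"].get(link, []))
        for target in targets:
            for word in SYNSETS[target]["words"]:
                if word not in seen:
                    seen.add(word)
                    expanded.append(word)
    return expanded


# The synset "car.n.01" plays the role of a hand-chosen concept.
print(expand_query(["car", "rental"], ["car.n.01"], ["hyponym"]))
# → ['car', 'rental', 'automobile', 'taxi', 'cab']
```

Passing the synsets to expand as an explicit argument mirrors the abstract's distinction between hand-chosen and automatically generated synonym sets: only the selection of `chosen_synsets` differs between the two settings.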
Indexing with WordNet synsets can improve text retrieval
1998
Cited by 174 (4 self)
The classical vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. This result is obtained for a manually disambiguated test collection (of queries and documents) derived from the SEMCOR semantic concordance. The sensitivity of retrieval performance to (automatic) disambiguation errors when indexing documents is also measured. Finally, it is observed that if queries are not disambiguated, indexing by synsets performs (at best) only as well as standard word indexing.
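The difference between indexing by word forms and indexing by synsets can be illustrated with a small vector space sketch. Everything here is an assumption made for illustration: the `SENSE_OF` map is a hypothetical stand-in for a manually disambiguated collection, and raw term counts with cosine similarity are used in place of whatever term weighting the paper applies.

```python
import math
from collections import Counter
from typing import List

# Hypothetical word-to-synset map; assumes every token is already disambiguated.
SENSE_OF = {"car": "car.n.01", "automobile": "car.n.01"}


def to_vector(tokens: List[str], use_synsets: bool) -> Counter:
    """Build a raw-count vector over word forms or over synset ids."""
    if use_synsets:
        # Tokens without a known synset fall back to their word form.
        return Counter(SENSE_OF.get(token, token) for token in tokens)
    return Counter(tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


query = ["automobile", "loan"]
document = ["car", "loan"]
by_word = cosine(to_vector(query, False), to_vector(document, False))
by_synset = cosine(to_vector(query, True), to_vector(document, True))
print(by_word < by_synset)  # synonyms only match in the synset space
```

In the word-form space "automobile" and "car" are unrelated dimensions, so only "loan" contributes to the match; in the synset space both map to the same dimension.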
Meteor-S Web Service annotation framework
In Proceedings of the 13th International Conference on the World Wide Web, 2004
Cited by 147 (16 self)
The World Wide Web is emerging not only as an infrastructure for data, but also for a broader variety of resources that are increasingly being made available as Web services. Relevant current standards like UDDI, WSDL, and SOAP are in their fledgling years and form the basis of making Web services a workable and broadly adopted technology. However, realizing the fuller scope of the promise of Web services and associated service oriented architecture will require further technological advances in the areas of service interoperation, service discovery, service composition, and process orchestration. Semantics, especially as supported by the use of ontologies, and related Semantic Web technologies, are likely to provide better qualitative and scalable solutions to these requirements. Just as semantic annotation of data in the Semantic Web is the first critical step to better search, integration and analytics over heterogeneous data, semantic annotation of Web services is an equally critical first step to achieving the above promise. Our approach is to work with existing Web services technologies and combine them with ideas from the Semantic Web to create a better framework for Web service discovery and composition. In this paper we present MWSAF (METEOR-S Web Service Annotation Framework), a framework for semi-automatically marking up Web service descriptions with ontologies. We have developed algorithms to match and annotate WSDL files with relevant ontologies. We use domain ontologies to categorize Web services into domains. An empirical study of our approach is presented to help evaluate its performance.
An effective approach to document retrieval via utilizing wordnet and recognizing phrases
In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, 2004
Cited by 89 (11 self)
Noun phrases in queries are identified and classified into four types: proper names, dictionary phrases, simple phrases and complex phrases. A document has a phrase if all content words in the phrase are within a window of a certain size. The window sizes for different types of phrases are different and are determined using a decision tree. Phrases are more important than individual terms. Consequently, documents in response to a query are ranked with matching phrases given a higher priority. We utilize WordNet to disambiguate word senses of query terms. Whenever the sense of a query term is determined, its synonyms, hyponyms, words from its definition and its compound words are considered for possible additions to the query. Experimental results show that our approach yields between 23% and 31% improvements over the best-known results on the TREC 9, 10 and 12 collections for short (title only) queries, without using Web data.
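The phrase-matching rule in this abstract (a document has a phrase if all of its content words fall within a window of a certain size) can be sketched as below. This is an illustrative reading of that one sentence, not the authors' code: the function name is hypothetical, and the window size is passed in directly rather than chosen per phrase type by a decision tree.

```python
from typing import List


def has_phrase(doc_tokens: List[str], phrase_words: List[str], window: int) -> bool:
    """Return True if all phrase words occur within `window` consecutive tokens."""
    needed = set(phrase_words)
    for start in range(len(doc_tokens)):
        # Slice a span of at most `window` tokens beginning at `start`.
        span = doc_tokens[start:start + window]
        if needed.issubset(span):
            return True
    return False


doc = "the stock market in new york fell".split()
print(has_phrase(doc, ["stock", "market"], 2))  # → True
print(has_phrase(doc, ["stock", "york"], 3))    # → False
```

The scan is quadratic in the worst case, which is acceptable for a sketch; a production system would more likely work from positional postings.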
On Expanding Query Vectors with Lexically Related Words
1994
Cited by 36 (2 self)
Experiments performed on small collections suggest that expanding query vectors with words that are lexically related to the original query words can improve retrieval effectiveness. Prior experiments using WordNet to automatically expand vectors in the large TREC-1 collection were inconclusive regarding effectiveness gains from lexically related words since any such effects were dominated by the choice of words to expand. This paper specifically investigates the effect of expansion by selecting query concepts to be expanded by hand. Concepts are represented by WordNet synonym sets and are expanded by following the typed links included in WordNet. Experimental results suggest that this query expansion technique makes little difference in retrieval effectiveness within the TREC environment, presumably because the TREC topic statements provide such a rich description of the information being sought.
Disambiguating Highly Ambiguous Words
Computational Linguistics, 1998
Cited by 35 (0 self)
A word sense disambiguator that is able to distinguish among the many senses of common words that are found in general-purpose, broad-coverage lexicons would be useful. For example, experiments have shown that, given accurate sense disambiguation, the lexical relations encoded in lexicons such as WordNet can be exploited to improve the effectiveness of information retrieval systems. This paper describes a classifier whose accuracy may be sufficient for such a purpose. The classifier combines the output of a neural network that learns topical context with the output of a network that learns local context to distinguish among the senses of highly ambiguous words. The accuracy of the classifier is tested on three words, the noun "line", the verb "serve", and the adjective "hard"; the classifier has an average accuracy of 87%, 90%, and 81%, respectively, when forced to choose a sense for all test cases. When the classifier is not forced to choose a sense and is trained on a subset of the available senses, it rejects test cases containing unknown senses as well as test cases it would misclassify if forced to select a sense. Finally, for cases where few labeled training examples are available, we describe an extension of our training method that uses information extracted from unlabeled examples to improve classification accuracy.
Towards Building Contextual Representations of Word Senses Using Statistical Models
Corpus Processing for Lexical Acquisition, 1993
Cited by 27 (3 self)
Automatic corpus-based sense resolution, or sense disambiguation, techniques tend to focus either on very local context or on topical context. Both components are needed for word sense resolution. A contextual representation of a word sense consists of topical context and local context. Our goal is to construct contextual representations by automatically extracting topical and local information from textual corpora. We review an experiment evaluating three statistical classifiers that automatically extract topical context. An experiment designed to examine human subject performance with similar input is described. Finally, we investigate a method for automatically extracting local context from a corpus. Preliminary results show improved performance.
Word sense disambiguation in queries
In ACM Conference on Information and Knowledge Management (CIKM2005), 2005
Cited by 24 (0 self)
This paper presents a new approach to determine the senses of words in queries by using WordNet. In our approach, noun phrases in a query are determined first. For each word in the query, information associated with it, including its synonyms, hyponyms, hypernyms, definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information associated with the words which form a phrase, it may be possible to assign senses to these words. If the above disambiguation fails, then other query words, if any exist, are used, by going through exactly the same process. If the sense of a query word cannot be determined in this manner, then a guess of the sense of the word is made, if the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, then we apply a Web search to assist in the word sense disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the most recent robust track of TREC collection of 250 queries. We combine this disambiguation algorithm with our retrieval system to examine the effect of word sense disambiguation in text retrieval. Experimental results show that the disambiguation algorithm together with other components of our retrieval system yield a result which is 13.7% above that produced by the same system but without the disambiguation, and 9.2% above that produced by using Lesk's algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
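The fallback order described in this abstract (phrase words first, then other query words, then a guess if some sense has at least a 50% chance, then Web search) can be sketched as a simple cascade. The sketch only captures the control flow: each step is a hypothetical callable supplied by the caller, the `sense_freq` table of sense probabilities is an assumption, and none of the names come from the paper.

```python
from typing import Callable, Dict, Optional

# A step takes a word and returns a sense id, or None if it cannot decide.
Step = Callable[[str], Optional[str]]


def disambiguate(word: str, phrase_step: Step, context_step: Step,
                 sense_freq: Dict[str, Dict[str, float]],
                 web_step: Callable[[str], str]) -> str:
    """Try each strategy in order and return the first sense that is assigned."""
    # 1-2. Compare against words in the same phrase, then other query words.
    for step in (phrase_step, context_step):
        sense = step(word)
        if sense is not None:
            return sense
    # 3. Guess the most likely sense only if its chance is at least 50%.
    freqs = sense_freq.get(word, {})
    if freqs:
        best_sense, probability = max(freqs.items(), key=lambda item: item[1])
        if probability >= 0.5:
            return best_sense
    # 4. Last resort: a Web-search-based step (hypothetical stub here).
    return web_step(word)


undecided: Step = lambda word: None
freq = {"bank": {"bank.n.01": 0.7, "bank.n.02": 0.3}}
print(disambiguate("bank", undecided, undecided, freq, lambda w: "from-web"))
# → bank.n.01
```

Because the final step always returns a sense, every query word receives one, which is consistent with the 100% applicability reported in the abstract.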
Solving The Word Mismatch Problem Through Automatic Text Analysis
1997
Cited by 22 (1 self)
Information Retrieval (IR) is concerned with locating documents that are relevant for a user's information need or query from a large collection of documents. A fundamental problem for information retrieval is word mismatch. A query is usually a short and incomplete description of the underlying information need. The users of IR systems and the authors of the documents often use different words to refer to the same concepts. This thesis addresses the word mismatch problem through automatic text analysis. We investigate two text analysis techniques, corpus analysis and local context analysis, and apply them in two domains of word mismatch, stemming and general query expansion. Experimental results show that these techniques ca...