Improving Browsing in Digital Libraries with Keyphrase Indexes
1998
Cited by 49 (9 self)
Browsing accounts for much of people's interaction with digital libraries, but it is poorly supported by standard search engines. Conventional systems often operate at the wrong level, indexing words when people think in terms of topics, and returning documents when people want a broader view. As a result, users cannot easily determine what is in a collection, how well a particular topic is covered, or what kinds of queries will provide useful results. We have built
Mining Anchor Text for Query Refinement
WWW2004, 2004
Cited by 39 (1 self)
When searching large hypertext document collections, it is often possible that there are too many results available for ambiguous queries. Query refinement is an interactive process of query modification that can be used to narrow down the scope of search results. We propose a new method for automatically generating refinements or related terms to queries by mining anchor text for a large hypertext document collection. We show that the usage of anchor text as a basis for query refinement produces high quality refinement suggestions that are significantly better in terms of perceived usefulness compared to refinements that are derived using the document content. Furthermore, our study suggests that anchor text refinements can also be used to augment traditional query refinement algorithms based on query logs, since they typically differ in coverage and produce different refinements. Our results are based on experiments on an anchor text collection of a large corporate intranet.
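As a rough illustration of the idea in this abstract, and not the authors' algorithm, the sketch below suggests refinements for a query by collecting anchor texts that contain every query term plus at least one more word, then ranking them by frequency. The function name `suggest_refinements`, the whitespace tokenization, and the sample anchor list are all assumptions made for this example.

```python
from collections import Counter


def suggest_refinements(query, anchor_texts, top_n=5):
    """Rank anchor texts that extend the query, most frequent first."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for anchor in anchor_texts:
        # Normalize case and whitespace so variants of one anchor are merged.
        normalized = " ".join(anchor.lower().split())
        terms = set(normalized.split())
        # Keep anchors containing every query term plus at least one extra word.
        if query_terms < terms:
            counts[normalized] += 1
    return [text for text, _ in counts.most_common(top_n)]


# Hypothetical anchor texts, standing in for a crawled link collection.
anchors = [
    "java tutorial", "Java Tutorial", "java download",
    "python tutorial", "java", "java tutorial",
]
print(suggest_refinements("java", anchors))
```

A real system would need phrase-aware matching and some filtering of navigational anchors such as "click here"; the frequency ranking here is only the simplest possible choice.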
User Profile Modeling and Applications to Digital Libraries
In: Proceedings of the Third European Conference on Research and Advanced Technology for Digital Libraries, 1999
Cited by 21 (0 self)
The ultimate goal of an information provider is to satisfy the user information needs. That is, to provide the user with the right information, at the right time, through the right means. A prerequisite for developing personalised services is to rely on user profiles representing users' information needs. In this paper we will first address the issue of presenting a general user profile model. Then, the general user profile model will be customised for digital libraries users. 1 Introduction It is widely recognised that the internet is growing rapidly in terms of the number of users accessing it, the amount of information created and accessible through it and the number of times users use it in order to satisfy their information needs. This has made it increasingly difficult for individuals to control and effectively seek for information among the potentially infinite number of information sources available on the internet. Ironically, just as more and more users are getting onl...
HelpfulMed: Intelligent Searching for Medical Information over the Internet
Journal of the American Society for Information Science and Technology, 2003
Cited by 17 (13 self)
Medical professionals and researchers need information from reputable sources to accomplish their work. Unfortunately, the Web has a large number of documents that are irrelevant to their work, even those documents that purport to be "medically-related." This paper describes an architecture designed to integrate advanced searching and indexing algorithms, an automatic thesaurus, or "concept space", and Kohonen-based Self-Organizing Map (SOM) technologies to provide searchers with fine-grained results. Initial results indicate that these systems provide complementary retrieval functionalities. HelpfulMed allows users to search not only Web pages and other online databases, but also allows them to build searches through the use of an automatic thesaurus and browse a graphical display of medical-related topics. Evaluation results
A unified and discriminative model for query refinement
In SIGIR '08, 2008
Cited by 15 (1 self)
This paper addresses the issue of query refinement, which involves reformulating ill-formed search queries in order to enhance relevance of search results. Query refinement typically includes a number of tasks such as spelling error correction, word splitting, word merging, phrase segmentation, word stemming, and acronym expansion. In previous research, such tasks were addressed separately or through employing generative models. This paper proposes employing a unified and discriminative model for query refinement. Specifically, it proposes a Conditional Random Field (CRF) model suitable for the problem, referred to as Conditional Random Field for Query Refinement (CRF-QR). Given a sequence of query words, CRF-QR predicts a sequence of refined query words as well as corresponding refinement operations. In that sense, CRF-QR differs greatly from conventional CRF models. Two types of CRF-QR models, namely a basic model and an extended model are introduced. One merit of employing CRF-QR is that different refinement tasks can be performed simultaneously and thus the accuracy of refinement can be enhanced. Furthermore, the advantages of discriminative models over generative models can be fully leveraged. Experimental results demonstrate that CRF-QR can significantly outperform baseline methods. Furthermore, when CRF-QR is used in web search, a significant improvement of relevance can be obtained.
A Fuzzy Ontology-based Abstract Search Engine and Its User Studies
In the 10th IEEE International Conference on Fuzzy Systems, 2001
Cited by 12 (0 self)
Query refinement can help users find information on the Internet more effectively. This feature has been implemented in a Personalized Abstract Search Services (PASS) system, a Web-based, domain-specific search engine for searching abstracts of research papers. The system uses a fuzzy ontology of term associations to support the feature. The ontology is automatically built in two stages using information obtained from the system's collection. A preliminary user study reveals that query refinement is one of the most important features of the system.
Context Sensitive Stemming for Web Search
2007
Cited by 11 (1 self)
Traditionally, stemming has been applied to Information Retrieval tasks by transforming words in documents to their root form before indexing, and applying a similar transformation to query terms. Although it increases recall, this naive strategy does not work well for Web Search since it lowers precision and requires a significant amount of additional computation. In this paper, we propose a context sensitive stemming method that addresses these two issues. Two unique properties make our approach feasible for Web Search. First, based on statistical language modeling, we perform context sensitive analysis on the query side. We accurately predict which of its morphological variants is useful to expand a query term with before submitting the query to the search engine. This dramatically reduces the number of bad expansions, which in turn reduces the cost of additional computation and improves the precision at the same time. Second, our approach performs a context sensitive document matching for those expanded variants. This conservative strategy serves as a safeguard against spurious stemming, and it turns out to be very important for improving precision. Using word pluralization handling as an example of our stemming approach, our experiments on a major Web search engine show that by stemming only 29% of the query traffic, we can improve relevance as measured by average Discounted Cumulative Gain (DCG5) by 6.1% on these queries and 1.8% over all query traffic.
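The abstract reports relevance as Discounted Cumulative Gain at rank 5 (DCG5) but does not state which gain and discount formulation was used. The sketch below assumes one common form, where each result's graded relevance is divided by log2(rank + 1); the function name `dcg_at_k` and the sample relevance grades are hypothetical.

```python
import math


def dcg_at_k(gains, k=5):
    """Discounted Cumulative Gain over the top-k results.

    `gains` holds graded relevance judgments in ranked order.
    Assumed formulation: sum of gain / log2(rank + 1), with ranks starting at 1.
    """
    return sum(
        gain / math.log2(rank + 1)
        for rank, gain in enumerate(gains[:k], start=1)
    )


# Hypothetical relevance grades for the top five results of one query.
ranked_gains = [3, 2, 0, 1, 0]
print(round(dcg_at_k(ranked_gains), 4))
```

Averaging this score over a query sample, before and after a ranking change, gives the kind of relative improvement the abstract quotes. Other formulations (for example an exponential gain of 2^rel - 1) would produce different absolute values.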
Patterns and sequences of multiple query reformulations in Web searching: A preliminary study
Proceedings of ASIS&T Annual Meeting, Washington DC, Nov 2001
Cited by 10 (1 self)
While some studies have investigated query reformulation in traditional online systems, there has been little understanding of how users reformulate their queries multiple times within search sessions on the Web. This paper reports on patterns and sequences of query reformulation based on query logs from a Web search engine. The data set contained only search sessions in which multiple query modifications were made. The analysis of data resulted in three facets of reformulation: content, format, and resource. Each facet was further categorized by 10 sub-facets. The results show that while most query reformulation involves content changes, about 15% of reformulation is related to format modifications. Six patterns of query reformulation emerged as a result of sequence analysis: specified reformulation, parallel reformulation, generalized reformulation, dynamic reformulation, format reformulation, and alternative reformulation. Each pattern is discussed with definitions and examples. The results indicate that both planned and situated aspects affect query reformulation in Web searching. The implications for new Web search engine tools and features are also discussed.
A WordNet-Based Interface to Internet Search Engines
In Proceedings of the FLAIRS-98, 1998
Cited by 7 (2 self)
A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Several search engines with different characteristics, such as AltaVista, Lycos, Infoseek, and others are available. However, the web information retrieval technology is still in its infancy, and there is need for considerable improvement. Some inherent difficulties are: (1) the web information is diverse and highly unstructured, (2) the size of information is large and it grows at an exponential rate, and (3) the current search engine technology is still rudimentary. While the first two issues are more profound and require long term solutions, it may be possible to develop software around the search engines to improve the quality of the information retrieved. In this paper we present a natural language interface system to a search engine and discuss some of the results obtained. Introduction A main problem with the current search engines is the large volume ...
Beyond Information Searching and Browsing: Acquiring Knowledge from Digital Libraries
Information Processing & Management, 2001
Cited by 5 (0 self)
As one of the most complex and advanced forms of Internet information systems, digital libraries serve as an increasingly important channel to a vast array of information sources and services. However, from the standpoint of satisfying human's information needs, the current digital library systems suffer from the following two shortcomings: (i) inadequate strategic level cognition support; (ii) inadequate knowledge sharing facilities. In this paper, we introduce a two-layered digital library architecture to support different levels of human cognitive acts. The model moves beyond simple information searching and browsing across multiple repositories, to inquiry of knowledge. To address users' high-order cognitive requests, we propose an information space, consisting of a knowledge subspace and a document subspace. A formal description of the knowledge subspace for knowledge sharing and dissemination, as well as mechanisms for constructing the two subspaces, are particularly dis...

