Overview of the Fifth Text REtrieval Conference (TREC-5)
- PROCEEDINGS OF THE FIFTH TEXT RETRIEVAL CONFERENCE (TREC-5)
, 1997
CubeSVD: A Novel Approach to Personalized Web Search
- In Proc. of the 14th International World Wide Web Conference (WWW)
, 2005
Cited by 47 (3 self)
As competition in the Web search market increases, there is a high demand for personalized Web search that incorporates Web users' information needs into retrieval. This paper focuses on utilizing clickthrough data to improve Web search. Since millions of searches are conducted every day, a search engine accumulates a large volume of clickthrough data, which records who submits queries and which pages he/she clicks on. The clickthrough data is highly sparse and contains different types of objects (user, query and Web page), and the relationships among these objects are also very complicated. By analyzing these data, we attempt to discover Web users' interests and the patterns by which users locate information. In this paper, a novel approach, CubeSVD, is proposed to improve Web search. The clickthrough data is represented by a 3-order tensor, on which we perform 3-mode analysis using the higher-order singular value decomposition technique to automatically capture the latent factors that govern the relations among these multi-type objects: users, queries and Web pages. A tensor reconstructed based on the CubeSVD analysis reflects both the observed interactions among these objects and the implicit associations among them. Therefore, Web search activities can be carried out based on CubeSVD analysis. Experimental evaluations using a real-world data set collected from an MSN search engine show that CubeSVD achieves encouraging search results in comparison with some standard methods.
Mining Anchor Text for Query Refinement
- WWW2004
, 2004
Cited by 39 (1 self)
When searching large hypertext document collections, it is often possible that there are too many results available for ambiguous queries. Query refinement is an interactive process of query modification that can be used to narrow down the scope of search results. We propose a new method for automatically generating refinements or related terms to queries by mining anchor text for a large hypertext document collection. We show that the usage of anchor text as a basis for query refinement produces high quality refinement suggestions that are significantly better in terms of perceived usefulness compared to refinements that are derived using the document content. Furthermore, our study suggests that anchor text refinements can also be used to augment traditional query refinement algorithms based on query logs, since they typically differ in coverage and produce different refinements. Our results are based on experiments on an anchor text collection of a large corporate intranet.
Community Search Assistant
- In Artificial Intelligence for Web Search
, 2000
Cited by 33 (0 self)
This paper describes a new software agent, the community search assistant, which recommends related searches to users of search engines. The community search assistant enables communities of users to search in a collaborative fashion. All queries submitted by the community are stored in the form of a graph. Links are made between queries that are found to be related. Users can peruse the network of related queries in an ordered way: following a path from a first cousin to a second cousin to a third cousin, and so on, to a set of search results. The first key idea behind the use of query graphs is that the determination of relatedness depends on the documents returned by the queries, not on the actual terms in the queries themselves. The second key idea is that the construction of the query graph transforms single-user usage of information networks (e.g. search) into collaborative usage: all users can tap into the knowledge base of queries submitted by others.
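The first key idea in this abstract, relating queries through the documents they return rather than through their terms, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function name `build_query_graph`, the `min_shared` overlap threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_query_graph(results, min_shared=1):
    """Link queries whose result sets share at least `min_shared` documents.

    `results` maps each query string to an iterable of returned document IDs.
    The overlap threshold is a hypothetical relatedness test, not the paper's.
    """
    graph = defaultdict(set)
    for q1, q2 in combinations(sorted(results), 2):
        shared = set(results[q1]) & set(results[q2])
        if len(shared) >= min_shared:
            # Relatedness is symmetric, so record the edge in both directions.
            graph[q1].add(q2)
            graph[q2].add(q1)
    return dict(graph)


# Hypothetical search results: query -> returned document IDs.
sample_results = {
    "jaguar car": ["d1", "d2", "d3"],
    "jaguar xk8": ["d2", "d3", "d4"],
    "big cats": ["d5", "d6"],
    "jaguar animal": ["d6", "d7"],
}

graph = build_query_graph(sample_results)
```

In this toy data, "jaguar car" and "jaguar animal" share a term but no documents, so they are not linked, while "big cats" and "jaguar animal" share a document and are linked despite having no terms in common.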
Query Expansion using Associated Queries
- IN PROC. INT. CONF. ON INFORMATION AND KNOWLEDGE MANAGEMENT
, 2003
Cited by 25 (6 self)
Hundreds of millions of users each day use web search engines to meet their information needs. Advances in web search effectiveness are therefore perhaps the most significant public outcomes of IR research. Query expansion is one such method for improving the effectiveness of ranked retrieval by adding terms to a query. In previous approaches to query expansion, the additional terms are selected from highly ranked documents returned from an initial retrieval run. We propose a new method of obtaining expansion terms, based on selecting terms from past user queries that are associated with documents in the collection. Our ...
Relevant term suggestion in interactive Web search based on contextual information in query session logs
- Journal of the American Society for Information Science and Technology
, 2003
Cited by 24 (0 self)
This paper proposes an effective term suggestion approach to interactive Web search. Conventional approaches to making term suggestions involve extracting co-occurring key terms from highly ranked retrieved documents. Such approaches must deal with term extraction difficulties and interference from irrelevant documents, and, more importantly, have difficulty extracting terms that are conceptually related but do not frequently co-occur in documents. In this paper, we present a new, effective log-based approach to relevant term extraction and term suggestion. Using this approach, the relevant terms suggested for a user query are those that co-occur in similar query sessions from search engine logs, rather than in the retrieved documents. In addition, the suggested terms in each interactive search step can be organized according to their relevance to the entire query session, rather than to the most recent single query as in conventional approaches. The proposed approach was tested using a proxy server log containing about two million query transactions submitted to search engines in Taiwan. The experimental results show that the proposed approach can provide organized and highly relevant terms, and can exploit the contextual information in a user’s query session to make more effective suggestions.
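The log-based idea described here, suggesting terms that co-occur with the query in logged sessions rather than in retrieved documents, might look roughly like the sketch below. It is a deliberate simplification: it counts plain session-level co-occurrence and does not model the paper's notion of similar sessions or whole-session relevance. The function `suggest_terms`, its `top_k` parameter, and the sample sessions are assumptions made for illustration.

```python
from collections import Counter


def suggest_terms(query_term, sessions, top_k=3):
    """Rank terms by how many logged sessions contain them alongside `query_term`."""
    counts = Counter()
    for session in sessions:
        # Collect the distinct terms used anywhere in this session.
        terms = {t for query in session for t in query.lower().split()}
        if query_term in terms:
            counts.update(terms - {query_term})
    # Sort by descending count, then alphabetically for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical query sessions; each session is the list of queries one user issued.
sample_sessions = [
    ["python tutorial", "python pandas"],
    ["pandas dataframe", "python pandas merge"],
    ["zoo pandas", "bamboo"],
]

suggestions = suggest_terms("python", sample_sessions)
```

Terms from the third session never appear among the suggestions for "python" because that session does not contain the query term at all.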
User Profile Modeling and Applications to Digital Libraries
- In: Proceedings of the Third European Conference on Research and Advanced Technology for Digital Libraries
, 1999
Cited by 21 (0 self)
The ultimate goal of an information provider is to satisfy the user's information needs. That is, to provide the user with the right information, at the right time, through the right means. A prerequisite for developing personalised services is to rely on user profiles representing users' information needs. In this paper we will first address the issue of presenting a general user profile model. Then, the general user profile model will be customised for digital library users. 1 Introduction. It is widely recognised that the internet is growing rapidly in terms of the number of users accessing it, the amount of information created and accessible through it and the number of times users use it in order to satisfy their information needs. This has made it increasingly difficult for individuals to control and effectively seek information among the potentially infinite number of information sources available on the internet. Ironically, just as more and more users are getting onl...
Resolving Translation Ambiguity and Target Polysemy in Cross-Language Information Retrieval
- IN PROCEEDINGS OF 37TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
, 1999
Cited by 17 (8 self)
This paper deals with translation ambiguity and target polysemy problems together. Two monolingual balanced corpora are employed to learn word co-occurrence for translation ambiguity resolution, and augmented translation restrictions for target polysemy resolution. Experiments show that the model achieves 62.92% of monolingual information retrieval performance, a 40.80% gain over the select-all model. Adding target polysemy resolution increases retrieval performance by about 10.11% over the model that resolves translation ambiguity only.
Web Page Classification: Features and Algorithms
, 2007
Cited by 16 (0 self)
Classification of web page content is essential to many tasks in web information retrieval such as maintaining web directories and focused crawling. The uncontrolled nature of web content presents additional challenges to web page classification as compared to traditional text classification, but the interconnected nature of hypertext also provides features that can assist the process. As we review work in web page classification, we note the importance of these web-specific features and algorithms, describe state-of-the-art practices, and track the underlying assumptions behind the use of information from neighboring pages.
Improving retrieval feedback with multiple term-ranking function combination
- ACM TRANSACTIONS ON INFORMATION SYSTEMS
, 2002
Cited by 15 (4 self)
In this paper we consider methods for automatic query expansion from top retrieved documents (i.e., retrieval feedback) which make use of various functions for scoring expansion terms within Rocchio’s classical reweighting scheme. An analytical comparison shows that the retrieval performance of methods based on distinct term-scoring functions is comparable on the whole query set but considerably differs on single queries, consistent with the fact that the ordered sets of expansion terms suggested for each query by the different functions are largely uncorrelated. Motivated by these findings, we argue that the results of multiple functions can be merged, by analogy with ensembling classifiers, and present a simple combination technique based on the rank values of the suggested terms. The combined retrieval feedback method is effective not only with respect to unexpanded queries but also to any individual method, with notable improvements on the system’s precision. Furthermore, the combined method is robust with respect to variation of experimental parameters and it is beneficial even when the same information needs are expressed with shorter queries.
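A rank-based merge of several term-scoring functions, as this abstract describes in outline, could be sketched along the following lines. The abstract does not give the exact combination formula, so this sketch assumes a simple sum of rank positions (a Borda-style count), with a term missing from a list assigned one rank worse than that list's last entry. The name `combine_rankings` and the sample term lists are hypothetical.

```python
def combine_rankings(rankings, top_k=3):
    """Merge ranked term lists by summing each term's rank position (lower is better)."""
    all_terms = {term for ranking in rankings for term in ranking}
    totals = {}
    for term in all_terms:
        total = 0
        for ranking in rankings:
            if term in ranking:
                total += ranking.index(term) + 1  # 1-based rank in this list
            else:
                # Assumed penalty: one rank worse than the list's last term.
                total += len(ranking) + 1
        totals[term] = total
    # Sort by ascending rank sum, then alphabetically for a deterministic order.
    ranked = sorted(totals.items(), key=lambda kv: (kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical expansion-term rankings from three different scoring functions.
by_function_a = ["retrieval", "index", "query", "ranking"]
by_function_b = ["query", "retrieval", "feedback", "index"]
by_function_c = ["retrieval", "query", "ranking", "corpus"]

combined = combine_rankings([by_function_a, by_function_b, by_function_c])
```

Summing ranks rather than raw scores avoids having to normalize scores that different functions produce on different scales, which is one plausible reason to prefer a rank-based merge.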

