From “dango” to “japanese cakes”: Query reformulation models and patterns. Submitted for publication, 2008
Cited by 10 (4 self)
Understanding query reformulation patterns is a key step towards next-generation web search engines: it can help improve users' web-search experience by predicting their intent, and thus help them locate information more effectively. As a step in this direction, we build an accurate model for classifying user query reformulations into broad classes (generalization, specialization, error correction or parallel move), achieving 92% accuracy. We apply the model to automatically label two large query logs, creating annotated query-flow graphs. We study the resulting reformulation patterns, finding results consistent with previous studies done on smaller manually annotated datasets, and discovering new interesting patterns, including connections between reformulation types and topical categories. Finally, applying our findings to a third query log that is publicly available for research purposes, we demonstrate that our reformulation classifier leads to improved recommendations in a query recommendation system.
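To make the four reformulation classes concrete, the following is a minimal lexical sketch, not the model from the paper (whose features and learning method are not described in the abstract). The function names, the token-set rules and the `max_typo_distance` threshold are all assumptions chosen for illustration. A purely lexical rule like this cannot recognise semantic relations between queries that share no terms, such as the pair in the title, which it labels a parallel move.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def classify_reformulation(q1: str, q2: str, max_typo_distance: int = 2) -> str:
    """Label the move from q1 to q2 with one of the four broad classes.

    Hypothetical heuristic (an assumption, not the paper's classifier);
    it assumes q1 and q2 are different queries.
    """
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    if t1 < t2:
        # Terms were only added: the query became narrower.
        return "specialization"
    if t2 < t1:
        # Terms were only removed: the query became broader.
        return "generalization"
    if edit_distance(q1.lower(), q2.lower()) <= max_typo_distance:
        # Nearly identical strings: most likely a spelling fix.
        return "error correction"
    return "parallel move"


if __name__ == "__main__":
    pairs = [("dango", "dango recipe"),
             ("cheap hotels rome", "hotels rome"),
             ("britny spears", "britney spears"),
             ("dango", "japanese cakes")]
    for q1, q2 in pairs:
        print(f"{q1!r} -> {q2!r}: {classify_reformulation(q1, q2)}")
```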
An Optimization Framework for Query Recommendation
Cited by 8 (2 self)
Query recommendation is an integral part of modern search engines. The goal of query recommendation is to facilitate users while searching for information. Query recommendation also allows users to explore concepts related to their information needs. In this paper, we present a formal treatment of the problem of query recommendation. In our framework we model the querying behavior of users by a probabilistic reformulation graph, or query-flow graph [Boldi et al. CIKM 2008]. A sequence of queries submitted by a user can be seen as a path on this graph. Assigning score values to queries allows us to define suitable utility functions and to consider the expected utility achieved by a reformulation path on the query-flow graph. Providing recommendations can be seen as adding shortcuts in the query-flow graph that “nudge” the reformulation paths of users, in such a way that users are more likely to follow paths with larger expected utility. We discuss in detail the most important questions that arise in the proposed framework. In particular, we provide examples of meaningful utility functions to optimize, we discuss how to estimate the effect of recommendations on the reformulation probabilities, we address the complexity of the optimization problems that we consider, we suggest efficient algorithmic solutions, and we validate our models and algorithms with extensive experimentation. Our techniques can be applied to other scenarios where user behavior can be modeled as a Markov process.
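The idea of expected utility on a query-flow graph, and of a recommendation acting as a shortcut, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration rather than the paper's formulation: the utility is taken to be the sum of the scores of the queries visited, the walk is truncated after a fixed number of steps, missing outgoing probability mass is read as the user stopping, and a shortcut is assumed to take a fraction `alpha` of the source query's transition mass while rescaling the other edges. The names `expected_utility` and `add_shortcut` are hypothetical.

```python
from typing import Dict

# graph[q][r] is the probability of reformulating query q into query r.
# Any probability mass missing from a row is read as "the user stops".
Graph = Dict[str, Dict[str, float]]


def expected_utility(graph: Graph, score: Dict[str, float],
                     start: str, steps: int) -> float:
    """Expected sum of scores of the queries visited by a random walk
    that starts at `start` and makes at most `steps` reformulations."""
    dist = {start: 1.0}             # probability of being at each query
    total = score.get(start, 0.0)   # the start query is always visited
    for _ in range(steps):
        nxt: Dict[str, float] = {}
        for q, p in dist.items():
            for r, t in graph.get(q, {}).items():
                nxt[r] = nxt.get(r, 0.0) + p * t
        total += sum(p * score.get(q, 0.0) for q, p in nxt.items())
        dist = nxt
    return total


def add_shortcut(graph: Graph, source: str, target: str, alpha: float) -> Graph:
    """Return a copy of the graph in which a recommendation source -> target
    is followed with probability alpha (assumed effect model: the existing
    outgoing edges of `source` are scaled by 1 - alpha)."""
    new_graph = {q: dict(out) for q, out in graph.items()}
    out = {r: t * (1.0 - alpha) for r, t in new_graph.get(source, {}).items()}
    out[target] = out.get(target, 0.0) + alpha
    new_graph[source] = out
    return new_graph


if __name__ == "__main__":
    graph = {"a": {"b": 0.5, "c": 0.3}, "b": {"d": 0.4}, "c": {}}
    score = {"a": 0.0, "b": 1.0, "c": 0.0, "d": 10.0}
    print(expected_utility(graph, score, "a", steps=2))
    nudged = add_shortcut(graph, "a", "d", alpha=0.5)
    print(expected_utility(nudged, score, "a", steps=2))
```

Under these assumptions, choosing which shortcuts to add so that the expected utility is maximised is the optimization problem the abstract refers to; the sketch only evaluates a given shortcut and does not attempt that search.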
Do You Want to Take Notes? Identifying Research Missions in Yahoo! Search Pad - WWW 2010, 2010
Cited by 4 (3 self)
Addressing users' information needs has been one of the main goals of Web search engines since their early days. In some cases, users cannot see their needs immediately answered by search results, simply because these needs are too complex and involve multiple aspects that are not covered by a single Web or search results page. This typically happens when users investigate a certain topic in domains such as education, travel or health, which often require collecting facts and information from many pages. We refer to this type of activity as “research missions”. These research missions account for 10% of users' sessions and more than 25% of all query volume, as verified by a manual analysis conducted by Yahoo! editors. We demonstrate in this paper that such missions can be automatically identified on-the-fly, as the user interacts with the search engine, through careful runtime analysis of query flows and query sessions. The on-the-fly automatic identification of research missions has been implemented in Search Pad, a novel Yahoo! application that was launched in 2009, and that we present in this paper. Search Pad helps users keep track of the results they have consulted. Its novelty, however, is that unlike previous note-taking products, it is automatically triggered only when the system decides, with a fair level of confidence, that the user is undertaking a research mission and thus is in the right context for gathering notes. Beyond the specific Search Pad application, we believe that changing the level of granularity of query modeling, from an isolated query to a list of queries pertaining to the same research mission, so as to better reflect a certain type of information need, can be beneficial in a number of other Web search applications. Session-awareness is growing and is likely to play, in the near future, a fundamental role in many on-line tasks: this paper presents a first step on this path.
Jigs and Lures: Associating Web Queries with Structured Entities
Cited by 4 (2 self)
We propose methods for estimating the probability that an entity from an entity database is associated with a web search query. Association is modeled using a query entity click graph, blending general query click logs with vertical query click logs. Smoothing techniques are proposed to address the inherent data sparsity in such graphs, including interpolation using a query synonymy model. A large-scale empirical analysis of the smoothing techniques, over a 2-year click graph collected from a commercial search engine, shows significant reductions in modeling error. The association models are then applied to the task of recommending products to web queries, by annotating queries with products from a large catalog and then mining query-product associations through web search session analysis. Experimental analysis shows that our smoothing techniques improve coverage while keeping precision stable, and overall, that our top-performing model affects 9% of general web queries with 94% precision.
Aging Effects on Query Flow Graphs for Query Suggestion
Cited by 3 (3 self)
World Wide Web content continuously grows in size and importance. Furthermore, users ask Web search engines to satisfy increasingly disparate information needs. New techniques and tools are constantly being developed to assist users in their interaction with the Web search engine. Query recommender systems, which suggest interesting queries to users, are an example of such tools. Most query recommendation techniques are based on knowledge of the behavior of past users of the search engine, as recorded in query logs. A recent query-log mining approach for query recommendation is based on Query Flow Graphs (QFG). In this paper we propose an evaluation of the effects of time on this query recommendation model. As users' interests change over time, the knowledge extracted from query logs may suffer an aging effect as new interesting topics appear. In order to validate this hypothesis experimentally, we build different query flow graphs from the queries belonging to a large query log of a real-world search engine. Each query flow graph is built on a distinct query log segment. Then, we generate recommendations on different sets of queries. Results are assessed both by means of human judgments and by using an automatic evaluator, showing that the models inexorably age.
Query Similarity by Projecting the Query-Flow Graph
Cited by 3 (1 self)
Defining a measure of similarity between queries is an interesting and difficult problem. A reliable query-similarity measure can be used in a variety of applications such as query recommendation, query expansion, and advertising. In this paper, we exploit the information present in query logs in order to develop a measure of semantic similarity between queries. Our approach relies on the concept of the query-flow graph, a graph-based representation of a query log. The query-flow graph aggregates query reformulations from many users: nodes in the graph represent queries, and two queries are connected if they are likely to appear as part of the same search goal. Our query-similarity measure is obtained by projecting the graph (or appropriate subgraphs extracted from it) on a low-dimensional Euclidean space. Our experiments show that the measure we obtain captures a notion of semantic similarity between queries and it is useful for diversifying query recommendations.
The Effects of Time on Query Flow Graph-based Models for Query Suggestion
Cited by 2 (1 self)
A recent query-log mining approach for query recommendation is based on Query Flow Graphs, a Markov-chain representation of the query reformulation process followed by users of Web search engines trying to satisfy their information needs. In this paper we aim at extending this model by providing methods for dealing with evolving data. In fact, users' interests change over time, and the knowledge extracted from query logs may suffer an aging effect as new interesting topics appear. Starting from this experimentally validated observation, we introduce a novel algorithm for updating an existing query flow graph. The proposed solution allows the recommendation model to be kept up to date without reconstructing it from scratch every time, by efficiently and incrementally merging the past and present data.
Context-Sensitive Query Auto-Completion, 2011
Cited by 2 (0 self)
Query auto-completion is known to provide poor predictions of the user's query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user's recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto-completion algorithm, NearestCompletion, which outputs the completions of the user's input that are most similar to the context queries. To measure similarity, we represent queries and contexts as high-dimensional term-weighted vectors and resort to cosine similarity. The mapping from queries to vectors is done through a new query expansion technique that we introduce, which expands a query by traversing the query recommendation tree rooted at the query. In order to evaluate our approach, we performed extensive experimentation over the public AOL query log. We demonstrate that when the user's recent queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion's MRR is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion's MRR is essentially zero. To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.
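The contrast between popularity-based and context-based completion can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: queries are represented by plain term-count vectors instead of the expanded vectors obtained from the query recommendation tree, the toy `query_counts` log is invented, and the function names are hypothetical. HybridCompletion is not sketched because the abstract does not say how the two rankings are combined.

```python
import math
from collections import Counter
from typing import Dict, List


def vectorize(text: str) -> Counter:
    """Term-count vector of a query (assumption: no query expansion)."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v[t] for t, w in u.items())  # Counter gives 0 for missing terms
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def most_popular_completion(prefix: str, query_counts: Dict[str, int],
                            k: int = 3) -> List[str]:
    """Baseline: completions of the prefix ranked by past frequency."""
    candidates = [q for q in query_counts if q.startswith(prefix)]
    return sorted(candidates, key=lambda q: (-query_counts[q], q))[:k]


def nearest_completion(prefix: str, context: List[str],
                       query_counts: Dict[str, int], k: int = 3) -> List[str]:
    """Completions of the prefix ranked by cosine similarity to the
    user's recent queries; ties are broken alphabetically."""
    ctx = vectorize(" ".join(context))
    candidates = [q for q in query_counts if q.startswith(prefix)]
    return sorted(candidates, key=lambda q: (-cosine(vectorize(q), ctx), q))[:k]


if __name__ == "__main__":
    # Invented toy log: query -> number of past occurrences.
    query_counts = {"java tutorial": 100, "japan travel": 80,
                    "jaguar car": 50, "jaguar animal": 30}
    context = ["lion habitat", "animal facts"]
    print(most_popular_completion("j", query_counts))
    print(nearest_completion("j", context, query_counts))
```

With a one-character prefix the baseline can only return the globally most frequent queries, whereas the context-based ranking promotes the completion that shares a term with the recent queries. When no candidate shares a term with the context, all similarities are zero and the ranking carries no information, which is consistent with the weakness the abstract attributes to irrelevant context.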
Hierarchical Taxonomy Extraction by Mining Topical Query Sessions
Cited by 1 (0 self)
Keywords: Web search, query log, hyponymy relations, query reformulation, automatic taxonomy extraction. Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important traces of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full-text documents, while other scholars have proposed methods to obtain similar queries from query logs. We propose a mixture of both lines of research, that is, mining query logs to find neither related queries nor query hierarchies, but actual term taxonomies. In this first approach we have researched the feasibility of finding hyponymy relations between terms or noun phrases by exploiting specialization search patterns in topical sessions, obtaining encouraging preliminary results.
The Semantics of Query Modification
We present a method that exploits ‘linked data’ to determine semantic relations between consecutive user queries. Our method maps queries onto concepts in linked data and searches the linked data graph for direct or indirect relations between the concepts. By comparing relations between large numbers of user queries, we identify semantic modification patterns. The application of this method to the logs of an image search engine revealed interesting usage patterns, such as that users often search for two entities sharing a property (e.g., two players from the same team). These patterns can be used to generate query suggestions. Results of preliminary experiments show that the patterns enable us to generate suggestions for more queries than a method purely based on search-log statistics.

