Expected Reciprocal Rank for Graded Relevance
- CIKM'09, November 2–6, 2009, Hong Kong, China.
, 2009
Cited by 32 (6 self)
While numerous metrics for information retrieval are available in the case of binary relevance, there is only one commonly used metric for graded relevance, namely the Discounted Cumulative Gain (DCG). A drawback of DCG is its additive nature and the underlying independence assumption: a document in a given position always has the same gain and discount, independently of the documents shown above it. Inspired by the “cascade” user model, we present a new editorial metric for graded relevance which overcomes this difficulty and implicitly discounts documents which are shown below very relevant documents. More precisely, this new metric is defined as the expected reciprocal length of time that the user will take to find a relevant document. This can be seen as an extension of the classical reciprocal rank to the graded relevance case and we call this metric Expected Reciprocal Rank (ERR). We conduct an extensive evaluation on the query logs of a commercial search engine and show that ERR correlates better with click metrics than other editorial metrics.
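As a rough illustration of the metric described in this abstract, the sketch below computes ERR for a single ranked list under a cascade user model. The function name, the `max_grade` parameter, and the mapping from an editorial grade g to a satisfaction probability, (2^g - 1) / 2^max_grade, are assumptions made for this example; the abstract itself does not specify them.

```python
from typing import Sequence


def expected_reciprocal_rank(grades: Sequence[int], max_grade: int = 4) -> float:
    """ERR of one ranked list of editorial grades (0 = irrelevant).

    Assumes a cascade user: scan from the top, stop at the first
    document that satisfies the information need.
    """
    err = 0.0
    p_reach = 1.0  # probability the user reaches this rank still unsatisfied
    for rank, grade in enumerate(grades, start=1):
        # Assumed grade-to-probability mapping (hypothetical choice here).
        p_satisfied = (2 ** grade - 1) / (2 ** max_grade)
        # The user stops at this rank with probability p_reach * p_satisfied,
        # which contributes a reciprocal rank of 1 / rank.
        err += p_reach * p_satisfied / rank
        p_reach *= 1.0 - p_satisfied
    return err
```

Under these assumptions, a grade-3 document at rank 2 adds far less to the score when it sits below a grade-4 document than when it sits below an irrelevant one, which is the implicit discounting the abstract refers to and which an additive metric such as DCG cannot express.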
Sponsored Search Auctions with Markovian Users
Cited by 24 (1 self)
Sponsored search involves running an auction among advertisers who bid in order to have their ad shown next to search results for specific keywords. The most popular auction for sponsored search is the “Generalized Second Price” (GSP) auction, where advertisers are assigned to slots in decreasing order of their score, which is defined as the product of their bid and click-through rate. One of the main advantages of this simple ranking is that the bidding strategy is intuitive: to move up to a more prominent slot on the results page, bid more. This makes it simple for advertisers to strategize. However, this ranking only maximizes efficiency under the assumption that the probability of a user clicking on an ad is independent of the other ads shown on the page. We study a Markovian user model that does not make this assumption. Under this model, the most efficient assignment is no longer a simple ranking function as in GSP. We show that the optimal assignment can be found efficiently (even in near-linear time). As a result of the more sophisticated structure of the optimal assignment, bidding dynamics become more complex: indeed, it is no longer clear that bidding more moves one higher on the page. Our main technical result is that despite the added complexity of the bidding dynamics, the optimal assignment has the property that ad position is still monotone in bid. Thus even in this richer user model, our mechanism retains the core bidding dynamics of the GSP auction that make it useful for advertisers.
A cascade model for externalities in sponsored search
- In ACM EC-08 Workshop on Ad Auctions
, 2008
Cited by 19 (0 self)
One of the most important yet insufficiently studied issues in online advertising is the externality effect among ads: the value of an ad impression on a page is affected not just by the location in which the ad is placed, but also by the set of other ads displayed on the page. For instance, a high-quality competing ad can draw users away from another ad, while a low-quality ad could cause the viewer to abandon the page altogether. In this paper, we propose and analyze a model for externalities in sponsored search ads. Our model is based on the assumption that users will visually scan the list of ads from the top to the bottom. After each ad, they make independent random decisions with ad-specific probabilities on whether to continue scanning. We then generalize the model in two ways: allowing for multiple separate blocks of ads, and allowing click probabilities to explicitly depend on ad positions as well. For the most basic model, we present a polynomial-time incentive-compatible auction mechanism for allocating and pricing ad slots. For the generalizations, we give approximation algorithms for the allocation of ads.
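The basic scanning model in this abstract can be sketched in a few lines. The code below is only an illustration under stated assumptions: each ad is represented by a hypothetical pair (probability of a click given that the ad is viewed, probability that the user continues scanning after it), and the two are treated as independent. It covers neither the multi-block or position-dependent generalizations nor the auction mechanism.

```python
from typing import List, Sequence, Tuple


def cascade_click_probabilities(ads: Sequence[Tuple[float, float]]) -> List[float]:
    """Click probability of each ad when users scan from top to bottom.

    ads: (click_prob_if_viewed, continue_prob) per ad, in display order.
    Both fields are illustrative parameters, not names from the paper.
    """
    p_view = 1.0  # probability the user is still scanning at this ad
    clicks = []
    for click_prob, continue_prob in ads:
        clicks.append(p_view * click_prob)
        # The user moves on to the next ad only with this ad's own
        # continuation probability.
        p_view *= continue_prob
    return clicks
```

For example, with a second ad whose click probability when viewed is 0.4, placing it below an ad with continuation probability 0.8 gives it a click probability of about 0.32, while placing it below an ad with continuation probability 0.2 gives about 0.08. This is the externality effect: the value of a slot depends on which ads appear above it.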
From “dango” to “japanese cakes”: Query reformulation models and patterns. Submitted for publication
, 2008
Cited by 10 (4 self)
Understanding query reformulation patterns is a key step towards next-generation web search engines: it can help improve users' web-search experience by predicting their intent, and thus help them locate information more effectively. As a step in this direction, we build an accurate model for classifying user query reformulations into broad classes (generalization, specialization, error correction or parallel move), achieving 92% accuracy. We apply the model to automatically label two large query logs, creating annotated query-flow graphs. We study the resulting reformulation patterns, finding results consistent with previous studies done on smaller manually annotated datasets, and discovering new interesting patterns, including connections between reformulation types and topical categories. Finally, applying our findings to a third query log that is publicly available for research purposes, we demonstrate that our reformulation classifier leads to improved recommendations in a query recommendation system.
Externalities in Keyword Auctions: an Empirical and Theoretical Assessment
Cited by 10 (2 self)
It is widely believed that the value of acquiring a slot in a sponsored search list (which appears alongside the organic links on a search engine’s result page) depends heavily on who else is shown in the other sponsored positions. To capture such externality effects, we consider a model of keyword advertising where bidders participate in a Generalized Second Price (GSP) auction and users perform ordered search (they browse from the top to the bottom of the sponsored list and make their clicking decisions slot by slot). Our contribution is twofold. First, we use impression and click data from Microsoft Live to estimate the ordered search model. With these estimates in hand, we are able to assess how the click-through rate of an ad is affected by the user’s click history and by the other competing links. Further, we compare the clicking predictions of our ordered search model to those of the most widely used model of user behavior: the separable click-through rate model. Second, we study complete information Nash equilibria of the GSP under different scoring rules. We characterize the efficient and revenue-maximizing complete information Nash equilibrium (under any scoring rule) and show that such an equilibrium can be implemented with any set of advertisers if and only if a particular weighting rule that combines click-through rates and continuation probabilities is used. Interestingly, this is the same ranking rule derived in [11] for solving the efficient allocation problem. On the negative side, we show that there is no scoring rule that implements an efficient equilibrium with VCG payments (VCG equilibrium) for all profiles of valuations and search parameters. This result extends [8], who argue that the rank-by-revenue GSP does not possess a VCG equilibrium.
Classification-enhanced ranking
, 2010
Cited by 10 (6 self)
Many have speculated that classifying web pages can improve a search engine’s ranking of results. Intuitively results should be more relevant when they match the class of a query. We present a simple framework for classification-enhanced ranking that uses clicks in combination with the classification of web pages to derive a class distribution for the query. We then go on to define a variety of features that capture the match between the class distributions of a web page and a query, the ambiguity of a query, and the coverage of a retrieved result relative to a query’s set of classes. Experimental results demonstrate that a ranker learned with these features significantly improves ranking over a competitive baseline. Furthermore, our methodology is agnostic with respect to the classification space and can be used to derive query classes for a variety of different taxonomies.
Efficient multiple-click models in web search
- In WSDM ’09: Proceedings of the Second International Conference on Web Search and Data Mining
, 2009
Cited by 8 (4 self)
Many tasks that leverage web search users' implicit feedback rely on a proper and unbiased interpretation of user clicks. Previous eye-tracking experiments and studies on explaining position bias of user clicks provide a spectrum of hypotheses and models on how an average user examines and possibly clicks web documents returned by a search engine with respect to the submitted query. In this paper, we attempt to close the gap between previous work, which studied how to model a single click, and the reality that multiple clicks on web documents in a single result page are not uncommon. Specifically, we present two multiple-click models: the independent click model (ICM), which is reformulated from previous work, and the dependent click model (DCM), which takes into consideration dependencies between multiple clicks. Both models can be efficiently learned with linear time and space complexities. More importantly, they can be incrementally updated as new click logs flow in. These properties are in high demand in practice. We systematically evaluate the two models on click logs obtained in July 2008 from a major commercial search engine. The data set, after preprocessing, contains over 110 thousand distinct queries and 8.8 million query sessions. Extensive experimental studies demonstrate the gain of modeling multiple clicks and their dependencies. Finally, we note that since our experimental setup does not rely on tweaking search result rankings, it can be easily adopted by future studies.
Entropy-biased Models for Query Representation On The Click Graph
, 2009
Cited by 7 (0 self)
Query log analysis has received substantial attention in recent years, and the click graph is an important technique for describing the relationship between queries and URLs. State-of-the-art approaches that model the click graph from raw click frequencies, however, do not eliminate noise, nor do they handle heterogeneous query-URL pairs well. In this paper, we investigate and develop a novel entropy-biased framework for modeling click graphs. The intuition behind this model is that various query-URL pairs should be treated differently, i.e., common clicks on less frequent but more specific URLs are of greater value than common clicks on frequent and general URLs. Based on this intuition, we utilize the entropy information of the URLs and introduce a new concept, namely the inverse query frequency (IQF), to weigh the importance (discriminative ability) of a click on a certain URL. The IQF weighting scheme has never been explicitly explored or statistically examined for any bipartite graph in the information retrieval literature. We not only formally define and quantify this scheme, but also combine it with the click frequency and user frequency information on the click graph for an effective query representation. To illustrate our methodology, we conduct experiments with the AOL query log data for query similarity analysis and query suggestion tasks. Experimental results demonstrate that considerable improvements in performance are obtained with our entropy-biased models.
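The abstract does not give the formula for IQF, so the sketch below assumes a definition by analogy with inverse document frequency: the IQF of a URL is the logarithm of the total number of distinct queries divided by the number of distinct queries with a click on that URL. The function name and the input format (an iterable of (query, url) click pairs) are likewise hypothetical, chosen only to make the intuition concrete.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def inverse_query_frequency(clicks: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """IQF per URL, assuming an IDF-style definition (not taken from the paper).

    clicks: (query, url) pairs, one per observed click.
    """
    queries: Set[str] = set()
    queries_per_url: Dict[str, Set[str]] = defaultdict(set)
    for query, url in clicks:
        queries.add(query)
        queries_per_url[url].add(query)
    total_queries = len(queries)
    # URLs clicked for few distinct queries are more specific and get a
    # larger weight; URLs clicked for every query get weight 0.
    return {
        url: math.log(total_queries / len(clicked_by))
        for url, clicked_by in queries_per_url.items()
    }
```

With this weighting, a click on a URL reached from only one of three queries counts for log 3, while a click on a URL reached from all three counts for nothing, matching the stated intuition that clicks on specific URLs are more discriminative than clicks on general ones.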
Internet Ad Auctions: Insights and Directions
Cited by 5 (1 self)
On the Internet, there are advertisements (ads) of different kinds: image, text, video and other specially marked objects that are distinct from the underlying content of the page. There is an industry behind the management of such ads, and it faces a number of algorithmic challenges. This note will present a small selection of such problems, some insights and open research directions.
BBM: Bayesian Browsing Model from Petabyte-scale Data
Cited by 5 (2 self)
Given a quarter of a petabyte of click log data, how can we estimate the relevance of each URL for a given query? In this paper, we propose the Bayesian Browsing Model (BBM), a new modeling technique with the following advantages: (a) it does exact inference; (b) it is single-pass and parallelizable; (c) it is effective. We present two sets of experiments to test model effectiveness and efficiency. On the first set of over 50 million search instances of 1.1 million distinct queries, BBM outperforms the state-of-the-art competitor by 29.2% in log-likelihood while being 57 times faster. On the second click log set, spanning a quarter of a petabyte of data, we showcase the scalability of BBM: we implemented it on a commercial MapReduce cluster, and it took only 3 hours to compute the relevance for 1.15 billion distinct query-URL pairs.

