Results 1 - 6 of 6
Shuffling a stacked deck: the case for partially randomized ranking of search engine results
- In Proc. 31st International Conference on Very Large Databases (VLDB), 2005
Cited by 22 (1 self)
In-degree, PageRank, number of visits and other measures of Web page popularity significantly influence the ranking of search results by modern search engines. The assumption is that popularity is closely correlated with quality, a more elusive concept that is difficult to measure directly. Unfortunately, the correlation between popularity and quality is very weak for newly-created pages that have yet to receive many visits and/or in-links. Worse, since discovery of new content is largely done by querying search engines, and because users usually focus their attention on the top few results, newly-created but high-quality pages are effectively “shut out,” and it can take a very long time before they become popular. We propose a simple and elegant solution to this problem: the introduction of a controlled amount of randomness into search result ranking methods. Doing so offers new pages a chance to prove their worth, although clearly using too much randomness will degrade result quality and annul any benefits achieved. Hence there is a tradeoff between exploration to estimate the quality of new pages and exploitation of pages already known to be of high quality. We study this tradeoff both analytically and via simulation, in the context of an economic objective function based on aggregate result quality amortized over time. We show that a modest amount of randomness leads to improved search results.
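The exploration/exploitation idea in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the merging rule, so everything below is an assumption made for illustration: the function name `partially_randomized_ranking`, the parameter `r` (the chance that a slot goes to an unexplored page), and the parameter `k` (the first position that may be randomized) are hypothetical and not taken from the paper.

```python
import random


def partially_randomized_ranking(ranked, pool, r, k=1, rng=None):
    """Merge unexplored pages from `pool` into the popularity-ordered `ranked` list.

    Assumed rule (illustrative only): positions before k (1-indexed) stay
    deterministic. From position k onward, each slot is filled from the
    shuffled pool with probability r, otherwise with the next popular result.
    """
    rng = rng or random.Random()
    pool = list(pool)
    rng.shuffle(pool)                  # unexplored pages get a random order
    merged = list(ranked[:k - 1])      # protected top of the list
    tail = list(ranked[k - 1:])        # remaining popularity-ordered results
    while tail or pool:
        take_new = bool(pool) and (not tail or rng.random() < r)
        if take_new:
            merged.append(pool.pop())  # exploration: promote a new page
        else:
            merged.append(tail.pop(0)) # exploitation: keep the known order
    return merged


popular = ["a", "b", "c", "d"]
fresh = ["x", "y"]
result = partially_randomized_ranking(popular, fresh, r=0.2, k=2,
                                      rng=random.Random(0))
```

With `r=0` the popular order is untouched and new pages trail at the end; raising `r` gives new pages more exposure at the cost of displacing known results, which is the tradeoff the abstract describes.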
T-rank: Time-aware authority ranking
- In WAW, 2004
Cited by 13 (3 self)
Abstract. The link structure of the web is analyzed to measure the authority of pages, which can be taken into account for ranking query results. Due to the enormous dynamics of the web, with millions of pages created, updated, deleted, and linked to every day, temporal aspects of web pages and links are crucial factors for their evaluation. Users are interested in important pages (i.e., pages with high authority score) but are equally interested in the recency of information. Time, and thus the freshness of web content and link structure, emerges as a factor that should be taken into account in link analysis when computing the importance of a page. So far only minor effort has been spent on the integration of temporal aspects into link-analysis techniques. In this paper we introduce T-Rank Light and T-Rank, two link-analysis approaches that take into account the temporal aspects freshness (i.e., timestamps of most recent updates) and activity (i.e., update rates) of pages and links. Experimental results show that T-Rank Light and T-Rank can produce better rankings of web pages.
Vetting the Links of the Web
Many web links mislead human surfers and automated crawlers because they point to changed content, out-of-date information, or invalid URLs. It is a particular problem for large, well-known directories such as the dmoz Open Directory Project, which maintains links to representative and authoritative external web pages within their various topics. Therefore, such sites involve many editors to manually revisit and revise links that have become out-of-date. To remedy this situation, we propose the novel web mining task of identifying outdated links on the web. We build a general classification model, primarily using local and global temporal features extracted from historical content, topic, link and time-focused changes over time. We evaluate our system via five-fold cross-validation on more than fifteen thousand ODP external links selected from thirteen top-level categories. Our system can predict the actions of ODP editors more than 75% of the time. Our models and predictions could be useful for various applications that depend on analysis of web links, including ranking and crawling.
Freshness Matters: In Flowers, Food, and Web Authority
The collective contributions of billions of users across the globe each day result in an ever-changing web. In verticals like news and real-time search, recency is an obvious significant factor for ranking. However, traditional link-based web ranking algorithms typically run on a single web snapshot without concern for user activities associated with the dynamics of web pages and links. Therefore, a stale page popular many years ago may still achieve a high authority score due to its accumulated in-links. To remedy this situation, we propose a temporal web link-based ranking scheme, which incorporates features from historical author activities. We quantify web page freshness over time from page and in-link activity, and design a web surfer model that incorporates web freshness, based on a temporal web graph composed of multiple web snapshots at different time points. It includes authority propagation among snapshots, enabling link structures at distinct time points to influence each other when estimating web page authority. Experiments on a real-world archival web corpus show our approach improves upon PageRank in both relevance and freshness of the search results.
Information Mediation in the Presence of Constraints and Uncertainties
- 2008
"... Submitted in partial fulfillment of the requirements ..."
A Social Platform for Bipartite Student-Lecturer Ranking
Conventional lecturer rating platforms base their rankings on students’ ratings alone, without taking the students’ performances into account. The information about students’ performances, however, is readily available, since lecturers rate students by grading them in examinations. Bringing together lecturer-given grades and student-given ratings leads to a bipartite rating graph that can be used with ranking techniques which rank students and lecturers alike, and feed back student and lecturer ranks to govern the respective influence of their ratings. We propose an algorithm called evalRank, based on PageRank and developed for bipartite ranking situations, and adapt this general evalRank algorithm to the specific intricacies of Student-Lecturer ranking. The proposed ranking is then an eigenvector centrality measure providing the mutual reinforcement typical of such measures. evalRank differs from PageRank in the inclusion of (normalized) edge weights and a modification of how random leaps are applied, as an adaptation to the bipartite input graph. We further present a fully functional Web Application, eval, which provides an intuitive Web Platform for the rating and ranking of lecturers by students, using student grade transcripts
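To make the role of normalized edge weights concrete, here is a rough sketch of a weighted PageRank iteration on a small bipartite rating graph. It is not evalRank itself: the abstract does not describe the modified random leaps, so this sketch assumes the standard uniform leap, and the function name `weighted_pagerank`, the damping value, and the toy ratings and grades are all hypothetical.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """Power iteration for PageRank with normalized edge weights.

    edges: dict mapping (source, target) -> non-negative weight.
    Assumption: uniform random leaps, not evalRank's bipartite variant.
    """
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_total = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_total[src] += weight
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing weight is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_total[v] == 0)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            if out_total[src] > 0:
                # Each edge passes on a share proportional to its weight.
                new_rank[dst] += damping * rank[src] * weight / out_total[src]
        rank = new_rank
    return rank


# Toy bipartite graph: students rate lecturers, lecturers grade students.
ratings = {
    ("s1", "l1"): 5, ("s2", "l1"): 5,   # both students rate l1 highly
    ("s1", "l2"): 1, ("s2", "l2"): 1,   # and l2 poorly
    ("l1", "s1"): 1, ("l1", "s2"): 1,   # lecturers give equal grades
    ("l2", "s1"): 1, ("l2", "s2"): 1,
}
ranks = weighted_pagerank(ratings)
```

In this toy graph `l1` ends up ranked above `l2`, because each student passes most of its own rank along the heavier edge. That is the mutual reinforcement the abstract refers to, here in its plain PageRank form.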

