## PageRank as a Function of the Damping Factor (2005)

Citations: 36 (9 self)

### BibTeX

@MISC{Boldi05pagerankas,

author = {Paolo Boldi and Massimo Santini and Sebastiano Vigna},

title = {PageRank as a Function of the Damping Factor},

year = {2005}

}

### Abstract

PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor α that spreads uniformly part of the rank. The choice of α is eminently empirical, and in most cases the original suggestion α = 0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in α was discovered to be useful in link-spam detection [21]. Moreover, an analytical justification of the value chosen for α is still missing. In this paper, we give the first mathematical analysis of PageRank when α changes. In particular, we show that, contrary to popular belief, for real-world graphs values of α close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O(t^k α^t) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
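The last claim of the abstract can be checked numerically on a toy example. The sketch below is not the authors' code. It assumes a row-stochastic matrix `P` (dangling nodes already patched), a uniform preference vector `v`, row-vector conventions, and a Power Method started from `v`. Under the standard series r(α) = (1 − α) v Σ_{t≥0} (αP)^t, the coefficient of α^t for t ≥ 1 is vP^t − vP^(t−1), and the constant term is v. The graph and all function names are hypothetical.

```python
def vec_mat(x, P):
    """Row vector times matrix: (xP)_j = sum_i x_i * P[i][j]."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

def power_method(P, v, alpha, k):
    """k steps of x <- alpha * xP + (1 - alpha) * v, starting from x = v."""
    x = list(v)
    for _ in range(k):
        xP = vec_mat(x, P)
        x = [alpha * a + (1.0 - alpha) * b for a, b in zip(xP, v)]
    return x

def maclaurin_polynomial(P, v, alpha, k):
    """Degree-k truncation: v + sum_{t=1..k} alpha^t * (v P^t - v P^(t-1))."""
    total = list(v)
    prev = list(v)  # holds v P^(t-1)
    for t in range(1, k + 1):
        cur = vec_mat(prev, P)  # v P^t
        total = [s + alpha ** t * (c - p) for s, c, p in zip(total, cur, prev)]
        prev = cur
    return total

# Hypothetical 4-node graph; the last row is a dangling node patched uniformly.
P = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
v = [0.25, 0.25, 0.25, 0.25]

# The k-th iterate and the degree-k polynomial agree up to rounding error.
for k in range(6):
    x = power_method(P, v, 0.85, k)
    m = maclaurin_polynomial(P, v, 0.85, k)
    assert max(abs(a - b) for a, b in zip(x, m)) < 1e-12
```

The agreement depends on starting the iteration from `v`; a different starting vector would generally give iterates that are not Maclaurin truncations.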

### Citations

2730 | Indexing by Latent Semantic Analysis
- Dumais, T, et al.
- 1990
Citation Context ...wise said, it can be computed offline using only the web graph structure and then used later, as users submit queries to the search engine, typically aggregated with other, query-dependent rankings [4, 12, 16]. One suggestive way to describe the idea behind PageRank is as follows: consider a random surfer that starts from a random page, and at every time chooses the next page by clicking on one of the link...

2716 | Authoritative sources in a hyperlinked environment
- Kleinberg
- 1999
Citation Context ...wise said, it can be computed offline using only the web graph structure and then used later, as users submit queries to the search engine, typically aggregated with other, query-dependent rankings [4, 12, 16]. One suggestive way to describe the idea behind PageRank is as follows: consider a random surfer that starts from a random page, and at every time chooses the next page by clicking on one of the link...

2144 | The PageRank Citation Ranking: Bringing Order to the Web
- Page, Brin, et al.
- 1998
Citation Context ...y the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank. 1 Introduction PageRank [17] is one of the most important ranking techniques used in today’s search engines. Not only is PageRank a simple, robust and reliable way to measure the importance of web pages [3], but it is also compu...

222 | SALSA: The stochastic approach for link-structure analysis
- Lempel, Moran
- 2001
Citation Context ...wise said, it can be computed offline using only the web graph structure and then used later, as users submit queries to the search engine, typically aggregated with other, query-dependent rankings [4, 12, 16]. One suggestive way to describe the idea behind PageRank is as follows: consider a random surfer that starts from a random page, and at every time chooses the next page by clicking on one of the link...

174 | Representing Web Graphs
- Raghavan, Molina
- 2003
Citation Context ... terminal component, so the first statement of the theorem will hold. This means that most nodes x will be such that $r^*_x = 0$. In particular, this will be true of all the nodes in the core component [13]: this result is somehow surprising, because it means that many important Web pages (that are contained in the core component) will have rank 0 in the limit (see, for instance, node 0 in our example)....

161 | The WebGraph framework I: Compression techniques
- Boldi, Vigna
- 2004
Citation Context ...ve/convex) to show that the approximation is excellent in all these cases. For this experiment we used a 41 291 594-nodes snapshot of the Italian web gathered by UbiCrawler [1] and indexed by WebGraph [2]. (Footnote 6: The coefficients are vectors, because we are approximating a vector function.) [Figure 3: The convergence speed...]

144 | Deeper Inside PageRank
- Langville, Meyer
Citation Context ...in the first papers about PageRank: yet, not only PageRank changes significantly when α is modified [19, 18], but also the relative ordering of nodes determined by PageRank can be radically different [14]. The original value suggested by Brin and Page (α = 0.85) is the most common choice. Intuitively, 1 − α is an amount of ranking that we agree to give uniformly at each page. This amount will be then ...

133 | Extrapolation methods for accelerating pagerank computations
- Kamvar, Haveliwala, et al.
- 2003
Citation Context ...close to 1. The Power Method converges more and more slowly [9] as α → 1⁻, a fact that also influences the other methods used to compute PageRank (which are, after all, variants of the Power Method [17, 7, 6, 15, 11, 10]). Indeed, the number of iterations required could in general be bounded using the separation between the first and the second eigenvalue, but unfortunately the separation can be abysmally small if α ...

132 | Efficient Computation of PageRank
- Haveliwala
- 1999
Citation Context ...close to 1. The Power Method converges more and more slowly [9] as α → 1⁻, a fact that also influences the other methods used to compute PageRank (which are, after all, variants of the Power Method [17, 7, 6, 15, 11, 10]). Indeed, the number of iterations required could in general be bounded using the separation between the first and the second eigenvalue, but unfortunately the separation can be abysmally small if α ...

129 | Exploiting the block structure of the web for computing PageRank
- Kamvar, Haveliwala, et al.
- 2003
Citation Context ...close to 1. The Power Method converges more and more slowly [9] as α → 1⁻, a fact that also influences the other methods used to compute PageRank (which are, after all, variants of the Power Method [17, 7, 6, 15, 11, 10]). Indeed, the number of iterations required could in general be bounded using the separation between the first and the second eigenvalue, but unfortunately the separation can be abysmally small if α ...

99 | UbiCrawler: a scalable fully distributed web crawler. Software: Practice and Experience
- Boldi, Codenotti, et al.
- 2002
Citation Context .../decreasing, unimodal concave/convex) to show that the approximation is excellent in all these cases. For this experiment we used a 41 291 594-nodes snapshot of the Italian web gathered by UbiCrawler [1] and indexed by WebGraph [2]. (Footnote 6: The coefficients are vectors, because we are approximating a vector function.) ...

94 | Ranking the Web Frontier
- Eiron, McCurley
Citation Context ...often, which justifies the definition. However, we also allow the surfer to restart with probability 1 − α from another node chosen randomly and uniformly, instead of following a link. As remarked in [5], a significant part of the current knowledge about PageRank is scattered through the research laboratories of large search engines, and its analysis “has remained largely in the realm of trade secret...

54 | Making Eigenvector-Based Reputation Systems Robust to Collusion
- Zhang, Goel, et al.
Citation Context ...ost cases the original suggestion α = 0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in α was discovered to be useful in link-spam detection [21]. Moreover, an analytical justification of the value chosen for α is still missing. In this paper, we give the first mathematical analysis of PageRank when α changes. In particular, we show that, cont...

48 | Non-negative matrices and Markov chains, Springer Series in Statistics
- Seneta
- 2006
Citation Context ...; some of them are functions of the damping factor α ∈ [0, 1), and we will use a notation reflecting this fact. Note that Q(α) is well defined for all α ∈ [0, 1), as (I − αP) is known to be invertible [20]. [Figure 1: Basic PageRank definitions: $P = \bar{D}^{-1}\bar{G}$, $A(\alpha) = \alpha P + (1 - \alpha)\mathbf{1}^T v$, $C(\alpha) = I - \alpha P$, $Q(\alpha) = P\,C(\alpha)^{-1}$.] The PageRank vector r(α) is defined as the dominant eigenvector of A(α); more precisely, a...

33 | A Fast Two-stage Algorithm for Computing Page Rank
- Lee, H, et al.
- 2003

32 | Hypersearching the web. Scientific American
- Chakrabarti, Dom, et al.
- 1999
Citation Context ...ntroduction PageRank [17] is one of the most important ranking techniques used in today’s search engines. Not only is PageRank a simple, robust and reliable way to measure the importance of web pages [3], but it is also computationally advantageous with respect to other ranking techniques in that it is query independent, and content independent. Otherwise said, it can be computed offline using only t...

9 | Arnoldi-type Algorithms for Computing Stationary Distribution Vectors, with Application to PageRank
- Golub, Greif

6 | The PageRank Citation Ranking: Bringing Order to the Web
- Page, Motwani
- 1998
Citation Context ... the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank. 1 Introduction PageRank [17] is one of the most important ranking techniques used in today's search engines. Not only is PageRank a simple, robust and reliable way to measure the importance of web pages [3], but it is also comput...

6 | A theoretical analysis of Google’s PageRank
- Pretto

5 | Authoritative sources in a hyperlinked environment. Journal of the ACM 46(5)
- Kleinberg
- 1999
Citation Context ...erwise said, it can be computed offline using only the web graph structure and then used later, as users submit queries to the search engine, typically aggregated with other, query-dependent rankings [4, 12, 16]. One suggestive way to describe the idea behind PageRank is as follows: consider a random surfer that starts from a random page, and at every time chooses the next page by clicking on one of the links...

3 | A theoretical approach to link analysis algorithms
- Pretto
- 2002

2 | UbiCrawler: A scalable fully distributed web crawler
- Boldi, Codenotti, et al.
Citation Context ...g/decreasing, unimodal concave/convex) to show that the approximation is excellent in all these cases. For this experiment we used a 41 291 594-nodes snapshot of the Italian web gathered by UbiCrawler [1] and indexed by WebGraph [2]. (Footnote 6: The coefficients are vectors, because we are approximating a vector function.) [Figure 3: ...]

1 | The condition number of the PageRank problem. Technical Report 36, Stanford University Technical Report, June 2003. [9] Taher Haveliwala and Sepandar Kamvar. The second eigenvalue of the Google matrix
- Haveliwala, Kamvar
- 2003
Citation Context ...graph-theoretic literature. Our choice avoids the usage of ambiguous terms that have been given different meanings in different papers. (Footnote 4: All vectors in this paper are row vectors.) This is Lemma 3 of [8], albeit in the original statement of this lemma the factor 1 − α is missing, probably due to an oversight. Note that (1) can be written as $r(\alpha) = (1 - \alpha)\, v \sum_{t=0}^{\infty} (\alpha P)^t$, which makes the dependenc...

1 | A fast two-stage algorithm for computing PageRank and its extensions
- Lee, Golub, et al.
- 2004
Citation Context ...oo close to 1. The Power Method converges more and more slowly [9] as α → 1⁻, a fact that also influences the other methods used to compute PageRank (which are, after all, variants of the Power Method [17, 7, 6, 15, 11, 10]). Indeed, the number of iterations required could in general be bounded using the separation between the first and the second eigenvalue, but unfortunately the separation can be abysmally small if α ...
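
Several citation contexts above note that the Power Method converges more and more slowly as α → 1⁻. The following sketch illustrates that trend on a toy graph. It is not taken from the paper: the 4-node matrix `P`, the uniform vector `v`, the L1 stopping rule, and the tolerance are all assumptions made for illustration.

```python
def iterations_to_converge(P, v, alpha, tol=1e-10, max_iter=100000):
    """Count Power Method steps until the L1 change between iterates drops below tol."""
    n = len(P)
    x = list(v)
    for k in range(1, max_iter + 1):
        # Row vector times matrix, then mix with the preference vector v.
        xP = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        new = [alpha * a + (1.0 - alpha) * b for a, b in zip(xP, v)]
        if sum(abs(a - b) for a, b in zip(new, x)) < tol:
            return k
        x = new
    return max_iter

# Hypothetical 4-node graph; the last row is a dangling node patched uniformly.
P = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
v = [0.25, 0.25, 0.25, 0.25]

for alpha in (0.5, 0.85, 0.99):
    print(alpha, iterations_to_converge(P, v, alpha))
```

On a small, well-connected graph like this one the slowdown stays modest, because the chain itself mixes quickly. The cited passages concern web-scale graphs, where the separation between the first and second eigenvalue can be far smaller.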