## PAGERANK COMPUTATION, WITH SPECIAL ATTENTION TO DANGLING NODES

Citations: 7 (0 self)

### BibTeX

@MISC{Ipsen_pagerankcomputation,

author = {Ilse C. F. Ipsen and Teresa M. Selee},

title = {PageRank Computation, with Special Attention to Dangling Nodes},

year = {}

}

### Abstract

We present a simple algorithm for computing the PageRank (stationary distribution) of the stochastic Google matrix G. The algorithm lumps all dangling nodes into a single node. We express lumping as a similarity transformation of G, and show that the PageRank of the nondangling nodes can be computed separately from that of the dangling nodes. The algorithm applies the power method only to the smaller lumped matrix, but the convergence rate is the same as that of the power method applied to the full matrix G. The efficiency of the algorithm increases as the number of dangling nodes increases. We also extend the expression for PageRank and the algorithm to more general Google matrices that have several different dangling node vectors, when it is required to distinguish among different classes of dangling nodes. We also analyze the effect of the dangling node vector on the PageRank, and show that the PageRank of the dangling nodes depends strongly on that of the nondangling nodes but not vice versa. Finally, we present a Jordan decomposition of the Google matrix for the (theoretical) extreme case when all web pages are dangling nodes.
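The lumping idea described in the abstract can be illustrated with a minimal numerical sketch. This is not the paper's Algorithm 3.1; the 5-node web, uniform v = w, and α = 0.85 are illustrative assumptions. Because all dangling rows of G are identical, the dangling nodes can be aggregated into one state, and the stationary distribution of the smaller matrix reproduces the nondangling PageRanks exactly:

```python
import numpy as np

# Toy web: nodes 0-2 have outlinks, nodes 3-4 are dangling (zero rows in H).
k, n = 3, 5
H = np.zeros((n, n))
H[0, [1, 2]] = 0.5            # node 0 links to nodes 1 and 2
H[1, [0, 3]] = 0.5            # node 1 links to nodes 0 and 3
H[2, 4] = 1.0                 # node 2 links to node 4
alpha = 0.85
v = w = np.full(n, 1.0 / n)             # personalization = dangling node vector
d = (H.sum(axis=1) == 0).astype(float)  # dangling-node indicator

# Full Google matrix G = alpha*S + (1 - alpha)*e v^T, with S = H + d w^T.
S = H + np.outer(d, w)
G = alpha * S + (1 - alpha) * np.outer(np.ones(n), v)

# Lumped matrix: all dangling rows of G are identical, so the dangling
# nodes can be aggregated into a single state, giving a (k+1)x(k+1) matrix.
G1 = np.zeros((k + 1, k + 1))
G1[:k, :k] = G[:k, :k]
G1[:k, k] = G[:k, k:].sum(axis=1)   # aggregated links into the lumped node
G1[k, :k] = G[k, :k]                # any dangling row (they are identical)
G1[k, k] = G[k, k:].sum()

def power_method(M, iters=200):
    pi = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(iters):
        pi = pi @ M
        pi /= pi.sum()
    return pi

pi_full = power_method(G)
pi_lump = power_method(G1)
# Nondangling PageRanks agree; the lumped entry is the total dangling mass.
```

Each power-method step on G1 costs a matrix-vector product with a (k+1)-dimensional matrix rather than an n-dimensional one, which is where the claimed efficiency gain comes from when n − k (the number of dangling nodes) is large.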

### Citations

3252 | The anatomy of a large-scale hypertextual web search engine
- Brin, Page
- 1998
Citation Context ...on. 65F10, 65F50, 65C40, 15A06, 15A18, 15A21, 15A51, 68P20 1. Introduction. The order in which the search engine Google displays the web pages is determined, to a large extent, by the PageRank vector [7, 33]. The PageRank vector contains, for every web page, a ranking that reflects the importance of the web page. Mathematically, the PageRank vector π is the stationary distribution of the so-called Google ... |

2137 | The pagerank citation ranking: Bringing order to the web - Page, Brin, et al. - 1998 |

288 | Combating web spam with trustrank
- Gyöngyi, Garcia-Molina, et al.
- 2004
Citation Context ...thms in [30, 32] are special cases of our algorithm because our algorithm allows the dangling node and personalization vectors to be different, and thereby facilitates the implementation of TrustRank [18]. TrustRank is designed to diminish the harm done by link spamming and was patented by Google in March 2005 [35]. Moreover, our algorithm can be extended to a more general Google matrix that contains ... |

144 | The indexable web is more than 11.5 billion pages. In: Special interest tracks and posters
- Gulli, Signorini
Citation Context ...cts the importance of the web page. Mathematically, the PageRank vector π is the stationary distribution of the so-called Google matrix, a sparse stochastic matrix whose dimension exceeds 11.5 billion [16]. The Google matrix G is a convex combination of two stochastic matrices G = αS + (1 − α)E, 0 ≤ α < 1, where the matrix S represents the link structure of the web, and the primary purpose of the rank-... |

133 | Extrapolation methods for accelerating pagerank computations
- Kamvar, Haveliwala, et al.
- 2003
Citation Context ...of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retain... |

94 | Ranking the Web frontier
- Eiron, McCurley, et al.
- 2004
Citation Context ...ing pages [11, §2]. The rows in the matrix S corresponding to dangling nodes would be zero, if left untreated. Several ideas have been proposed to deal with the zero rows and force S to be stochastic [11]. The most popular approach adds artificial links to the dangling nodes, by replacing zero rows in the matrix with the same vector, w, so that the matrix S is stochastic. It is natural as well as effi... |
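The zero-row fix described in the excerpt above is a one-line rank-one update. A minimal sketch, assuming a hypothetical 3-node web and a uniform dangling node vector w:

```python
import numpy as np

# Hypothetical 3-node web in which node 2 is a dangling node (zero row).
H = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0]])
d = (H.sum(axis=1) == 0).astype(float)   # dangling-node indicator vector
w = np.full(3, 1.0 / 3.0)                # dangling node vector (uniform here)
S = H + np.outer(d, w)                   # replace each zero row with w^T
# Every row of S now sums to 1, i.e. S is (row-)stochastic.
```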

66 | A survey on pagerank computing
- Berkhin
- 2005
Citation Context ...uding classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retains all the advantages of the power method: It is simple to implement and requires minima... |

35 | Efficient PageRank approximation via graph aggregation
- Broder, Lempel, et al.
- 2004
Citation Context ...n proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retains all the advantages of the power method: It is simp... |

34 | A Fast Two-stage Algorithm for Computing Page Rank
- Lee, H, et al.
- 2003
Citation Context ...al as well as efficient to exclude the dangling nodes with their artificial links from the PageRank computation. This can be done, for instance, by 'lumping' all the dangling nodes into a single node [32]. In §3 we provide a rigorous justification for lumping the dangling nodes in the Google matrix G, by expressing lumping as a similarity transformation of G (Theorem 3.1). We show that the PageRank of... |

23 | Fast parallel PageRank: A linear system approach. Yahoo! Research
- Gleich, Zhukov, et al.
- 2004
Citation Context ...gorithm increases as the number of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method appli... |

20 | Fast PageRank Computation via a Sparse Linear
- Corso, Gullí, et al.
Citation Context ...only from v2, and from the PageRank of the nondangling nodes, filtered through the links H12. An expression for π when dangling node and personalization vectors are the same, i.e. w = v, was given in [10]: π^T = ((1 − α)/(1 − α v^T R^{−1} d)) v^T R^{−1}, where R ≡ I − αH. In this case the PageRank vector π is a multiple of the vector v^T (I − αH)^{−1}. 5. Only Dangling Nodes. We examine t... |
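The extracted excerpt above garbles the closed-form expression from [10]. One reading consistent with the definitions used elsewhere on this page (S = H + d w^T with w = v, G = αS + (1 − α)e v^T, R ≡ I − αH) is π^T = ((1 − α)/(1 − α v^T R^{−1} d)) v^T R^{−1}, which is indeed a multiple of v^T (I − αH)^{−1}. The sketch below checks that reading on a 2-node toy example (the concrete numbers are illustrative assumptions, not from the paper):

```python
import numpy as np

# 2-node toy web: node 0 links to node 1; node 1 is dangling.
alpha = 0.5
v = np.array([0.5, 0.5])            # personalization = dangling vector (w = v)
H = np.array([[0.0, 1.0],
              [0.0, 0.0]])
d = np.array([0.0, 1.0])            # dangling-node indicator

R = np.eye(2) - alpha * H
x = v @ np.linalg.inv(R)            # x^T = v^T R^{-1}
pi = (1 - alpha) / (1 - alpha * (x @ d)) * x   # candidate closed form

# Cross-check against the stationary distribution of the full Google matrix:
# pi should satisfy pi^T G = pi^T and sum to 1.
S = H + np.outer(d, v)
G = alpha * S + (1 - alpha) * np.outer(np.ones(2), v)
```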

14 | Computing pagerank using power extrapolation
- Haveliwala, Kamvar, et al.
- 2003
Citation Context ...of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retain... |

13 | Jordan canonical form of the Google matrix: a potential contribution to the PageRank computation
- Serra-Capizzano
Citation Context ...ination G ≡ αS + (1 − α)ev^T, 0 ≤ α < 1. Although the stochastic matrix G may not be primitive or irreducible, its eigenvalue 1 is distinct and the magnitude of all other eigenvalues is bounded by α [12, 19, 25, 26, 34]. Therefore G has a unique stationary distribution, π^T G = π^T, π ≥ 0, ‖π‖ = 1. The stationary distribution π is called PageRank. Element i of π represents the PageRank for web page i. If we partiti... |

11 | Convergence analysis of a PageRank updating algorithm by Langville and Meyer
- Ipsen, Kirkland
Citation Context ...n proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retains all the advantages of the power method: It is simp... |

8 | The eigenvalues of the Google matrix
- Eldén
- 2003
Citation Context ...ination G ≡ αS + (1 − α)ev^T, 0 ≤ α < 1. Although the stochastic matrix G may not be primitive or irreducible, its eigenvalue 1 is distinct and the magnitude of all other eigenvalues is bounded by α [12, 19, 25, 26, 34]. Therefore G has a unique stationary distribution, π^T G = π^T, π ≥ 0, ‖π‖ = 1. The stationary distribution π is called PageRank. Element i of π represents the PageRank for web page i. If we partiti... |

8 | Markov property for a function of a Markov chain: a linear algebra approach
- Gurvits, Ledoux
- 2005
Citation Context ...matrix represents a lumpable Markov chain. The concept of lumping was originally introduced for general Markov matrices, to speed up the computation of the stationary distribution or to obtain bounds [9, 17, 24, 27]. Below we paraphrase lumpability [27, Theorem 6.3.2] in matrix terms: Let P be a permutation matrix and PMP^T = [ M11 ⋯ M1,k+1 ; ⋮ ⋱ ⋮ ; Mk+1,1 ⋯ Mk+1,k+1 ] a partition of a stochastic matr... |
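The lumpability condition paraphrased in the excerpt (within each block, every row must put the same total transition probability into each other block, as in Kemeny and Snell's classical criterion) can be checked mechanically. A sketch, not code from the paper; the 4 × 4 matrix below is an illustrative assumption:

```python
import numpy as np

def is_lumpable(M, blocks):
    """Check lumpability of stochastic matrix M for a given state partition:
    within each block, every row must put the same total probability mass
    into each block of the partition."""
    for B in blocks:
        for C in blocks:
            row_sums = M[np.ix_(B, C)].sum(axis=1)  # mass from each row of B into C
            if not np.allclose(row_sums, row_sums[0]):
                return False
    return True

# A Google-style matrix whose two dangling rows (states 2 and 3) are
# identical is lumpable with {2, 3} as one block and each other state
# its own block -- exactly the partition used for dangling nodes.
G = np.array([[0.10, 0.30, 0.30, 0.30],
              [0.40, 0.20, 0.20, 0.20],
              [0.25, 0.25, 0.25, 0.25],
              [0.25, 0.25, 0.25, 0.25]])
lumpable = is_lumpable(G, [[0], [1], [2, 3]])
```

Singleton blocks satisfy the condition trivially (one row each), so only the identical dangling rows need checking; an arbitrary partition such as {0, 1}, {2, 3} generally fails.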

7 | An Arnoldi-type algorithm for computing PageRank
- Golub, Greif
- 2006
Citation Context ...gorithm increases as the number of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method appli... |

6 | The PageRank vector: Properties, computation, approximation, and acceleration
- Brezinski, Redivo-Zaglia
Citation Context ...of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retain... |

6 | Mathematical properties and analysis of Google's PageRank. Boletín de la Sociedad Española de Matemática Aplicada
- Ipsen, Wills
- 2006
Citation Context ...iply with the k × k matrix H11 as well as several vector operations. Thus the dangling nodes are excluded from the power method computation. The convergence rate of the power method applied to G is α [23]. Algorithm 3.1 has the same convergence rate, because G^(1) has the same nonzero eigenvalues as G, see Theorem 3.1, but is much faster because it operates on a smaller matrix whose dimension does not... |

6 | Adaptive methods for the computation
- Kamvar, Haveliwala, et al.

5 | Quasi Lumpability, Lower-bounding coupling Matrices, and Nearly Completely Decomposable Markov Chains
- Dayar, Stewart
- 1997
Citation Context ...matrix represents a lumpable Markov chain. The concept of lumping was originally introduced for general Markov matrices, to speed up the computation of the stationary distribution or to obtain bounds [9, 17, 24, 27]. Below we paraphrase lumpability [27, Theorem 6.3.2] in matrix terms: Let P be a permutation matrix and PMP^T = [ M11 ⋯ M1,k+1 ; ⋮ ⋱ ⋮ ; Mk+1,1 ⋯ Mk+1,k+1 ] a partition of a stochastic matr... |

5 | The second eigenvalue of the Google matrix
- Haveliwala, Kamvar
- 2003
Citation Context ...ination G ≡ αS + (1 − α)ev^T, 0 ≤ α < 1. Although the stochastic matrix G may not be primitive or irreducible, its eigenvalue 1 is distinct and the magnitude of all other eigenvalues is bounded by α [12, 19, 25, 26, 34]. Therefore G has a unique stationary distribution, π^T G = π^T, π ≥ 0, ‖π‖ = 1. The stationary distribution π is called PageRank. Element i of π represents the PageRank for web page i. If we partiti... |

4 | Google's PageRank and Beyond: The Science of Search Engine Rankings
- Langville, Meyer
Citation Context ...tive methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is faster than the power method applied to the full Google matrix G, but retains all the advantages of the power method: It is simple to implement and requires minimal storage. Unlike ... |

3 | Testing lumpability
- Baran, Jernigan
- 2002
Citation Context ...matrix represents a lumpable Markov chain. The concept of lumping was originally introduced for general Markov matrices, to speed up the computation of the stationary distribution or to obtain bounds [9, 17, 24, 27]. Below we paraphrase lumpability [27, Theorem 6.3.2] in matrix terms: Let P be a permutation matrix and PMP^T = [ M11 ⋯ M1,k+1 ; ⋮ ⋱ ⋮ ; Mk+1,1 ⋯ Mk+1,k+1 ] a partition of a stochastic matr... |

2 | PageRank computation and the structure of the web: Experiments and algorithms
- Arasu, Novak, Tomlin, et al.
- 2002
Citation Context ...ler matrix. The efficiency of the algorithm increases as the number of dangling nodes increases. Many other algorithms have been proposed for computing PageRank, including classical iterative methods [1, 4, 30], Krylov subspace methods [13, 14], extrapolation methods [5, 6, 20, 26, 25], and aggregation/disaggregation methods [8, 22, 31]; see also the survey papers [2, 28] and the book [29]. Our algorithm is ... |

2 | Extrapolation Methods for PageRank Computations, Comptes Rendus de l'Académie des Sciences de
- Brezinski, Redivo-Zaglia, et al.
- 2005
- 2005

2 | Canonical and standard forms for certain rank one perturbations and an application to the (complex) Google PageRanking problem - Horn, Serra-Capizzano - 2006 |

2 | Google’s PageRank: The math behind the search engine
- Wills
- 2006
Citation Context ...zation vectors to be different, and thereby facilitates the implementation of TrustRank [18]. TrustRank is designed to diminish the harm done by link spamming and was patented by Google in March 2005 [35]. Moreover, our algorithm can be extended to a more general Google matrix that contains several different dangling node vectors (§3.4). In §4 we examine how the PageRanks of the dangling and nondangli... |