## Efficient parallel computation of PageRank (2006)

Venue: Proc. 28th ECIR

Citations: 9 (1 self)

### BibTeX

@INPROCEEDINGS{Kohlschütter06efficientparallel,

author = {Christian Kohlschütter and Paul-Alexandru Chirita and Wolfgang Nejdl},

title = {Efficient parallel computation of PageRank},

booktitle = {Proc. 28th ECIR},

year = {2006},

pages = {241--252}

}

### Abstract

PageRank is inherently massively parallelizable and distributable, as a result of the web’s strict host-based link locality. In this paper we show that the Gauß-Seidel iterative method for solving linear systems can be successfully applied in such a parallel ranking scenario in order to improve convergence. By introducing a two-dimensional web model and by adapting PageRank to this environment, we present and evaluate efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global information.
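As a rough illustration of why the Gauß-Seidel method can help convergence, the sketch below solves the PageRank linear system (I − αPᵀ)x = (1 − α)v on a tiny graph with both Jacobi and Gauß-Seidel sweeps. It is a sequential toy, not the parallel method of the paper. The function name `pagerank_linear`, the damping factor α = 0.85, the uniform teleport vector, and the restriction to graphs without dangling pages or self-links are all assumptions made for this example.

```python
def pagerank_linear(out_links, alpha=0.85, tol=1e-10, gauss_seidel=True, max_iter=1000):
    """Solve (I - alpha * P^T) x = (1 - alpha) * v by Jacobi or Gauss-Seidel sweeps.

    out_links[i] is the list of pages that page i links to. Assumptions of this
    sketch: every page has at least one out-link, there are no self-links, and
    the teleport vector v is uniform.
    """
    n = len(out_links)
    # in_links[j] holds (i, weight) for each link i -> j, weight = 1 / outdegree(i).
    in_links = [[] for _ in range(n)]
    for i, targets in enumerate(out_links):
        for j in targets:
            in_links[j].append((i, 1.0 / len(targets)))

    base = (1.0 - alpha) / n
    x = [1.0 / n] * n
    for sweep in range(1, max_iter + 1):
        old = list(x)
        # Jacobi reads only the previous sweep's values. Gauss-Seidel reads x
        # itself, so ranks updated earlier in the same sweep are used at once.
        read = x if gauss_seidel else old
        for j in range(n):
            x[j] = base + alpha * sum(w * read[i] for i, w in in_links[j])
        # Stop when the L1 change between two sweeps falls below the tolerance.
        if sum(abs(a - b) for a, b in zip(x, old)) < tol:
            return x, sweep
    return x, max_iter


# Hypothetical 4-page graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}, 3 -> {0, 2}.
graph = [[1, 2], [2], [0], [0, 2]]
ranks_gs, sweeps_gs = pagerank_linear(graph, gauss_seidel=True)
ranks_jacobi, sweeps_jacobi = pagerank_linear(graph, gauss_seidel=False)
print("Gauss-Seidel sweeps:", sweeps_gs, "Jacobi sweeps:", sweeps_jacobi)
```

Both variants converge to the same rank vector; the printed sweep counts show how many passes each one needed on this toy graph. The in-place update is what makes a naive Gauß-Seidel sweep sequential, which is why applying it in a partitioned, parallel setting is the non-trivial part.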

### Citations

2144 | The PageRank Citation Ranking: Bringing Order to the Web
- Page, Brin, et al.
- 1998
Citation Context: ...a few pages in response to her query, some of them being much more relevant than others. This gave birth to a lot of ordering (or ranking) research, the most popular algorithm being Google’s PageRank [20], which recursively determines the importance of a web page by the importance of all the pages pointing to it. Although improvements for a centralized computation of PageRank have been researched in d...

216 | The evolution of the web and implications for an incremental crawler
- Cho, Garcia-Molina
- 2000
Citation Context: ... in a two-dimensional fashion (with the URL’s hostname as the primary criterion), thus separating it into reasonably disjunct partitions, which are then used for distributed, incremental web crawling [6] and PageRank computation. The remainder of the paper is organized as follows. After reviewing the PageRank algorithm, common web graph representation techniques and existing parallel versions of Page...

144 | Deeper Inside PageRank
- Langville, Meyer
Citation Context: ...matrix is sparse), this results in much lower memory requirements of the link structure, in the magnitude of $|L| \cdot n^{-1} \cdot c$ bytes (n = average outdegree; c = const.). Of course, compression techniques [15] or intelligent approaches to disk-based “swapping” [9, 5, 18] can improve the space requirements even further (e.g. by relying on a particular data order, or on the presence of caches). But with the ...

133 | Extrapolation methods for accelerating pagerank computations
- Kamvar, Haveliwala, et al.
- 2003
Citation Context: ...h recursively determines the importance of a web page by the importance of all the pages pointing to it. Although improvements for a centralized computation of PageRank have been researched in detail [1,5,9,11,12,16,13,18], approaches on distributing it over several computers have caught researchers’ attention only recently. In this paper we introduce a new approach to computing the exact PageRank in a parallel fashion...

132 | Efficient Computation of PageRank
- Haveliwala
- 1999
Citation Context: ...h recursively determines the importance of a web page by the importance of all the pages pointing to it. Although improvements for a centralized computation of PageRank have been researched in detail [1,5,9,11,12,16,13,18], approaches on distributing it over several computers have caught researchers’ attention only recently. In this paper we introduce a new approach to computing the exact PageRank in a parallel fashion...

129 | Exploiting the block structure of the web for computing PageRank
- Kamvar, Haveliwala, et al.
- 2003
Citation Context: ...es or directories, which is orders of magnitudes faster. The inner structure of these formations (at page level) can then be computed in an independently parallel manner (“off-line”), as in BlockRank [10], SiteRank [25], the U-Model [4], ServerRank [24] or HostRank/DirRank [7]. We will try to take the best out of both approaches: the exactness of a straight PageRank computation but the speed of an app...

94 | Ranking the Web Frontier
- Eiron, McCurley
Citation Context: ...re of these formations (at page level) can then be computed in an independently parallel manner (“off-line”), as in BlockRank [10], SiteRank [25], the U-Model [4], ServerRank [24] or HostRank/DirRank [7]. We will try to take the best out of both approaches: the exactness of a straight PageRank computation but the speed of an approximation, without any centralized re-ranking. ...

77 | Who links to whom: Mining linkage between web sites
- Bharat, Henzinger, et al.
- 2001
Citation Context: ...putation but the speed of an approximation, without any centralized re-ranking. 3 The Two-Dimensional Web. 3.1 Host-Based Link Locality. Bharat et al. [2] have shown that there are two different types of web links dominating the web structure, “intra-site” links and “inter-site” ones. A “site” can be a domain (.yahoo.com), a host (geocities.yahoo.com) ...

55 | Pagerank computation and the structure of the web: Experiments and algorithms
- Arasu, Novak, et al.
- 2002
Citation Context: ...h recursively determines the importance of a web page by the importance of all the pages pointing to it. Although improvements for a centralized computation of PageRank have been researched in detail [1,5,9,11,12,16,13,18], approaches on distributing it over several computers have caught researchers’ attention only recently. In this paper we introduce a new approach to computing the exact PageRank in a parallel fashion...

48 | Adaptive methods for the computation of pagerank
- Kamvar, Haveliwala, et al.
- 2003

42 | Computing pagerank in a distributed internet search system
- Wang, DeWitt
- 2004
Citation Context: ...faster. The inner structure of these formations (at page level) can then be computed in an independently parallel manner (“off-line”), as in BlockRank [10], SiteRank [25], the U-Model [4], ServerRank [24] or HostRank/DirRank [7]. We will try to take the best out of both approaches: the exactness of a straight PageRank computation but the speed of an approximation, without any centralized re-ranking. ...

37 | Distributed pagerank for p2p systems
- Sankaralingam, Sethumadhavan, et al.
- 2003
Citation Context: ...k parallelization can be divided into two classes: Exact Computations and Approximations. Parallel Computations. In this scenario, the web graph is initially partitioned into blocks: grouped randomly [21], lexicographically sorted by page [17, 22, 26] or balanced according to the number of links [8]. Then, standard iterative methods such as Jacobi (Equation 1) or Krylov subspace [8] are performed over...

35 | Efficient PageRank approximation via graph aggregation
- Broder, Lempel, et al.
- 2004
Citation Context: ...s of magnitudes faster. The inner structure of these formations (at page level) can then be computed in an independently parallel manner (“off-line”), as in BlockRank [10], SiteRank [25], the U-Model [4], ServerRank [24] or HostRank/DirRank [7]. We will try to take the best out of both approaches: the exactness of a straight PageRank computation but the speed of an approximation, without any centrali...

34 | A Fast Two-stage Algorithm for Computing Page Rank
- Lee, H, et al.
- 2003

31 | A uniform approach to accelerated pagerank computation
- McSherry
- 2005

23 | Fast Parallel PageRank: A Linear System Approach
- Gleich, Zhukov, et al.
- 2004
Citation Context: ...l Computations. In this scenario, the web graph is initially partitioned into blocks: grouped randomly [21], lexicographically sorted by page [17, 22, 26] or balanced according to the number of links [8]. Then, standard iterative methods such as Jacobi (Equation 1) or Krylov subspace [8] are performed over these pieces in parallel. The partitions periodically must exchange information: Depending on t...

16 | Distributed page ranking in structured p2p networks
- Shi, Yu, et al.
- 2003
Citation Context: ...wo classes: Exact Computations and Approximations. Parallel Computations. In this scenario, the web graph is initially partitioned into blocks: grouped randomly [21], lexicographically sorted by page [17, 22, 26] or balanced according to the number of links [8]. Then, standard iterative methods such as Jacobi (Equation 1) or Krylov subspace [8] are performed over these pieces in parallel. The partitions perio...

12 | A parallel gauss-seidel algorithm for sparse power system matrices
- Koester, Ranka, et al.
- 1994
Citation Context: ...ter than Jacobi but was said not to be efficiently parallelizable here [1, 5, 18]. Actually, there already are parallel Gauss-Seidel implementations for certain scenarios such as the one described in [14], using block-diagonally-bordered matrices; however, they all admit their approach was designed for a static matrix; after each modification, a specific preprocessing (sorting) step is required, which...

10 | Technology Surveys
- Web
- 2013
Citation Context: ...ain a searchable index over all retrieved web pages. Its size is currently reaching several billions of pages from about 54 million publicly accessible hosts, but these amounts are rapidly increasing [19]. When searching such huge datasets, one would usually receive quite a few pages in response to her query, some of them being much more relevant than others. This gave birth to a lot of ordering (or r...

8 | Distributed PageRank computation based on iterative aggregation-disaggregation methods
- Zhu, Ye, et al.
- 2005
Citation Context: ...wo classes: Exact Computations and Approximations. Parallel Computations. In this scenario, the web graph is initially partitioned into blocks: grouped randomly [21], lexicographically sorted by page [17, 22, 26] or balanced according to the number of links [8]. Then, standard iterative methods such as Jacobi (Equation 1) or Krylov subspace [8] are performed over these pieces in parallel. The partitions perio...

7 | An Improved Computation of the PageRank Algorithm
- Kim, Lee
- 2002

5 | I/O efficient techniques for computing pagerank
- Chen, Gan, Suel
- 2002

5 | Using SiteRank for P2P Web retrieval
- Wu, Aberer
- 2004
Citation Context: ...es, which is orders of magnitudes faster. The inner structure of these formations (at page level) can then be computed in an independently parallel manner (“off-line”), as in BlockRank [10], SiteRank [25], the U-Model [4], ServerRank [24] or HostRank/DirRank [7]. We will try to take the best out of both approaches: the exactness of a straight PageRank computation but the speed of an approximation, wit...

1 | 2001 Crawl of the WebBase project
- Haveliwala, et al.
- 2001
Citation Context: ...s are intra-host and 95.2% intra-domain [10]. This assumed block structure has been visualized by Kamvar et al. [10] using dotplots of small parts (domain-level) of the “LargeWeb” graph’s link matrix [23]. In these plots, the point (i,j) is black if there is a link from page p_i to p_j, clear otherwise. We performed such a plot under the same setting, but on whole-graph scale. The outcome is interestin...
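The host-based link locality that the abstract and several citation contexts refer to can be measured directly on a link list. The sketch below groups pages by hostname (the primary partitioning criterion mentioned above) and computes the share of intra-host links. The helper names `host_of`, `partition_by_host` and `intra_host_fraction`, and the example URLs, are hypothetical and chosen only for illustration; this is not the crawler or partitioning code of the paper.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased hostname of a URL (None if the URL has no host)."""
    return urlsplit(url).hostname


def partition_by_host(urls):
    """Group page URLs into partitions keyed by hostname, sorted within each host."""
    partitions = defaultdict(list)
    for url in sorted(urls):
        partitions[host_of(url)].append(url)
    return dict(partitions)


def intra_host_fraction(links):
    """Fraction of (source, target) links whose endpoints share a hostname."""
    if not links:
        return 0.0
    same_host = sum(1 for src, dst in links if host_of(src) == host_of(dst))
    return same_host / len(links)


# Hypothetical link list: three links stay on their host, one crosses hosts.
links = [
    ("http://a.example/x", "http://a.example/y"),
    ("http://a.example/x", "http://b.example/"),
    ("http://b.example/", "http://b.example/z"),
    ("http://b.example/z", "http://b.example/"),
]
pages = {url for link in links for url in link}
print(partition_by_host(pages))
print(intra_host_fraction(links))
```

When most links are intra-host, each host partition can be ranked largely on its own, and only the few inter-host links require communication between partitions.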