Twister: A runtime for iterative MapReduce
In The First International Workshop on MapReduce and its Applications, 2010
Cited by 23 (2 self)
The MapReduce programming model has simplified the implementation of many data parallel applications. The simplicity of the programming model and the quality of services provided by many implementations of MapReduce attract a lot of enthusiasm among distributed computing communities. From years of experience in applying MapReduce to various scientific applications, we identified a set of extensions to the programming model and improvements to its architecture that will expand the applicability of MapReduce to more classes of applications. In this paper, we present the programming model and the architecture of Twister, an enhanced MapReduce runtime that supports iterative MapReduce computations efficiently. We also show performance comparisons of Twister with other similar runtimes such as Hadoop and DryadLINQ for large scale data parallel applications.
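To make "iterative MapReduce" concrete, here is a minimal single-process sketch in Python. It is not Twister's API: the function names, the choice of 1-D k-means as the workload, and the in-memory partitions are all assumptions made for illustration. It only shows the pattern of a driver that re-runs map and reduce over the same data until a small shared state (the centers) stops changing.

```python
from collections import defaultdict

def map_phase(partition, centers):
    # Emit (index of nearest center, (point, 1)) for every point in the partition.
    for x in partition:
        nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        yield nearest, (x, 1)

def reduce_phase(pairs):
    # Accumulate the sum of points and the point count per center index.
    sums = defaultdict(lambda: [0.0, 0])
    for key, (x, n) in pairs:
        sums[key][0] += x
        sums[key][1] += n
    return sums

def iterate_kmeans(partitions, centers, tol=1e-9, max_iter=100):
    # Driver loop (hypothetical): repeat map + reduce on the same partitions
    # until no center moves by more than tol.
    for _ in range(max_iter):
        pairs = [kv for part in partitions for kv in map_phase(part, centers)]
        sums = reduce_phase(pairs)
        new_centers = [
            sums[i][0] / sums[i][1] if i in sums else centers[i]
            for i in range(len(centers))
        ]
        if max(abs(a - b) for a, b in zip(new_centers, centers)) < tol:
            return new_centers
        centers = new_centers
    return centers
```

In this sketch only the list of centers changes between iterations, while the partitions are reused unchanged; that separation of large fixed input from small per-iteration state is what distinguishes an iterative job from a one-shot MapReduce job.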
Multilevel Adaptive Aggregation for Markov Chains, with Application to Web Ranking
Cited by 9 (7 self)
A multilevel adaptive aggregation method for calculating the stationary probability vector of an irreducible stochastic matrix is described. The method is a special case of the adaptive smooth aggregation and adaptive algebraic multigrid methods for sparse linear systems, and is also closely related to certain extensively studied iterative aggregation/disaggregation methods for Markov chains. In contrast to most existing approaches, our aggregation process does not employ any explicit advance knowledge of the topology of the Markov chain. Instead, we propose adaptive agglomeration based on strength of connection in a scaled problem matrix, in which the columns of the original problem matrix at each recursive fine level are scaled with the current probability vector iterate at that level. Strength of connection is determined as in the algebraic multigrid method, and the aggregation process is fully adaptive, with optimized aggregates chosen in each step of the iteration and at all recursive levels. The multilevel method is applied to a set of stochastic matrices that provide models for web page ranking. Numerical tests illustrate for which types of stochastic matrices the multilevel adaptive method may provide significant speedup compared to standard iterative methods. The tests also provide more insight into why Google’s PageRank model is a successful model for determining a ranking of web pages.
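For orientation, the kind of "standard iterative method" such a multilevel scheme is compared against can be sketched as plain power iteration. The code below is that baseline only, not the multilevel aggregation method; the function name, the column-stochastic convention, and the tolerance are assumptions. Power iteration also needs the chain to be aperiodic to converge, which is an additional assumption beyond irreducibility.

```python
def stationary_vector(B, tol=1e-12, max_iter=10_000):
    # B is assumed column-stochastic: B[i][j] is the probability of moving
    # from state j to state i, so every column sums to 1.
    n = len(B)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One power-iteration step: y = B x.
        y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        y = [v / s for v in y]  # renormalize to guard against rounding drift
        if sum(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x
```

For the two-state chain `B = [[0.9, 0.5], [0.1, 0.5]]`, the balance equation 0.1·x0 = 0.5·x1 gives the stationary vector (5/6, 1/6), which the iteration approaches at a rate set by the second eigenvalue, 0.4.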
Efficient parallel computation of PageRank
In Proc. 28th ECIR, 2006
Cited by 5 (1 self)
PageRank is inherently massively parallelizable and distributable, as a result of the web’s strict host-based link locality. In this paper we show that the Gauß-Seidel iterative method for solving linear systems can be successfully applied in such a parallel ranking scenario in order to improve convergence. By introducing a two-dimensional web model and by adapting PageRank to this environment, we present and evaluate efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global information.
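As a rough illustration of applying Gauß-Seidel to PageRank, the sketch below solves the linear system x_i = (1 − α)/n + α Σ_{j→i} x_j / outdeg(j) on a single machine. It is not the paper's parallel, two-dimensional method; the function name, the damping factor 0.85, and the restriction to graphs without dangling pages or self-links are assumptions made to keep the example short.

```python
def pagerank_gauss_seidel(out_links, alpha=0.85, tol=1e-12, max_sweeps=1000):
    # out_links[j] lists the pages that page j links to.
    # Assumptions: every page has at least one out-link and no self-links.
    n = len(out_links)
    in_links = [[] for _ in range(n)]
    for j, targets in enumerate(out_links):
        for i in targets:
            in_links[i].append(j)
    x = [1.0 / n] * n
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(n):
            # Gauss-Seidel step: x[j] for j < i already holds this sweep's
            # updated value, unlike Jacobi/power iteration, which would use
            # only the previous sweep's vector.
            new = (1.0 - alpha) / n + alpha * sum(
                x[j] / len(out_links[j]) for j in in_links[i]
            )
            change += abs(new - x[i])
            x[i] = new
        if change < tol:
            break
    return x
```

Updating `x` in place is the whole difference from a Jacobi-style sweep: each page immediately sees the fresh ranks of the pages processed before it, which typically reduces the number of sweeps needed.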
A survey on distributed approaches to graph based reputation measures
2007
Cited by 4 (0 self)
Reputation systems are indispensable for the operation of Internet mediated services, electronic markets, document ranking systems, P2P networks and Ad Hoc networks. Here we survey available distributed approaches to graph based reputation measures. Graph based reputation measures can be viewed as random walks on directed weighted graphs whose edges represent interactions among peers. We classify the distributed approaches to graph based reputation measures into three categories. The first category is based on asynchronous methods. The second category is based on aggregation/decomposition methods. The third category is based on personalization methods, which use local information.
Decentralized Network Analysis: a Proposal
In recent years, the peer-to-peer paradigm has gained momentum in several application areas: file-sharing and VoIP applications have been able to attract millions of end users, while large-scale distributed computing frameworks, including the Grid, have proven their ability to attack large scientific problems. We believe, however, that the potential of the P2P approach has not yet been fully exploited. The goal of this position paper is to propose another scientific area where the P2P cooperation paradigm could be profitably adopted: network analysis, i.e. the mathematical characterization of the main graph-theoretic properties of a large-scale network. We discuss the potential issues that must be confronted when a decentralized approach to network analysis is taken, and we propose a preliminary research plan.

