Results 1-10 of 123
FrogWild!–Fast PageRank Approximations on Graph Engines
In NIPS Workshop on Distributed Machine Learning and Matrix Computations, 2014
"... We propose FrogWild, a novel algorithm for fast approximation of high PageRank vertices, geared towards reducing network costs of running traditional PageRank algorithms. Our algorithm can be seen as a quantized version of power iteration that performs multiple parallel random walks over a directed ..."
Cited by 1 (0 self)
Hadoop.TS: Large-Scale Time-Series Processing
2013
"... The paper describes a computational framework for time series analysis. It allows rapid prototyping of new algorithms, since all components are reusable. Generic data structures represent different types of time series, e.g. event and inter-event time series, and define reliable interfaces to exist ..."
Abstract

Cited by 4 (1 self)
to existing big data. Standalone applications, highly scalable MapReduce programs, and User Defined Functions for Hadoop-based analysis frameworks are the major modes of operation. Efficient implementations of univariate and bivariate analysis algorithms are provided for, e.g., long-term correlation
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
"... Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly ..."
Abstract

Cited by 117 (4 self)
Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the assumptions made by these abstractions, limiting performance and scalability. In this paper, we characterize the challenges of computation on natural graphs in the context of existing graph-parallel abstractions. We then introduce the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges. Leveraging the PowerGraph abstraction we introduce a new approach to distributed graph placement and representation that exploits the structure of power-law graphs. We provide a detailed analysis and experimental evaluation comparing PowerGraph to two popular graph-parallel systems. Finally, we describe three different implementation strategies for PowerGraph and discuss their relative merits with empirical evaluations on large-scale real-world problems demonstrating order of magnitude gains.
Giraphx: Parallel Yet Serializable Large-Scale Graph Processing
"... Abstract. Bulk Synchronous Parallelism (BSP) provides a good model for parallel processing of many large-scale graph applications, however it is unsuitable/inefficient for graph applications that require coordination, such as graph-coloring, subcoloring, and clustering. To address this problem, we p ..."
Abstract
, Giraphx, provides much better performance than implementing the application using dining-philosophers over Giraph. In fact, Giraphx outperforms Giraph even for embarrassingly parallel applications that do not require coordination, e.g., PageRank.
An Experimental Comparison of Pregel-like Graph Processing Systems
"... The introduction of Google’s Pregel generated much interest in the field of large-scale graph data processing, inspiring the development of Pregel-like systems such as Apache Giraph, GPS, Mizan, and GraphLab, all of which have appeared in the past two years. To gain an understanding of how Pregel ..."
Abstract

Cited by 5 (1 self)
Pregel-like systems perform, we conduct a study to experimentally compare Giraph, GPS, Mizan, and GraphLab on equal ground by considering graph and algorithm agnostic optimizations and by using several metrics. The systems are compared with four different algorithms (PageRank, single source shortest
GoFFish: A Sub-Graph Centric Framework for
"... Abstract. Vertex centric models for large scale graph processing are gaining traction due to their simple distributed programming abstraction. However, pure vertex centric algorithms underperform due to large communication overheads and slow iterative convergence. We introduce GoFFish a scalable su ..."
Abstract
subgraph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters, offering the added natural flexibility of shared memory subgraph computation. We map Connected Components, SSSP and PageRank algorithms to this model
GPS: A Graph Processing System
"... GPS (for Graph Processing System) is a complete open-source system we developed for scalable, fault-tolerant, and easy-to-program execution of algorithms on extremely large graphs. GPS is similar to Google’s proprietary Pregel system [MAB+11], with some useful additional functionality described in ..."
Abstract

Cited by 63 (3 self)
partitioning schemes. We present our experiments on the performance of GPS under different static partitioning schemes (assigning vertices to workers “intelligently” before the computation starts) and with GPS’s dynamic repartitioning feature, which reassigns vertices to different compute nodes during
Raghavan Raman Oracle Labs
"... Large-scale graph analysis has recently been drawing lots of attention from both industry and academia. Although there are already several frameworks designed for scalable graph analysis, e.g. Giraph [1], all these frameworks adopt non-traditional programming models and APIs. This can significantl ..."
From Machu Picchu to “rafting the urubamba river”: Anticipating information needs via the Entity-Query Graph
In Proc. WSDM, 2013
"... We study the problem of anticipating user search needs, based on their browsing activity. Given the current web page p that a user is visiting we want to recommend a small and diverse set of search queries that are relevant to the content of p, but also non-obvious and serendipitous. We introduce a ..."
Abstract

Cited by 9 (1 self)
query suggestions for these entities, we exploit a novel graph model that we call EQGraph (Entity-Query Graph), containing entities, queries, and transitions between entities, between queries, as well as from entities to queries. We perform Personalized PageRank computation on such a graph to expand
MOCgraph: Scalable Distributed Graph Processing Using Message Online Computing
"... Existing distributed graph processing frameworks, e.g., Pregel, Giraph, GPS and GraphLab, mainly exploit main memory to support flexible graph operations for efficiency. Due to the complexity of graph analytics, huge memory space is required especially for those graph analytics that spawn large int ..."
Abstract
port. We implement MOCgraph on top of Apache Giraph, and test it against several representative graph algorithms on large graph datasets. Experiments illustrate that MOCgraph is efficient and memory-saving, especially for graph analytics with large intermediate results.