C.: iMapReduce: a distributed computing framework for iterative computation
 In: Proceedings of the 1st International Workshop on Data Intensive Computing in the Clouds (DataCloud
, 2011
Cited by 35 (11 self)
Relational data are pervasive in many applications such as data mining or social network analysis. These relational data are typically massive, containing at least millions or hundreds of millions of relations. This poses demand for the design of distributed computing frameworks for processing these data on a large cluster. MapReduce is an example of such a framework. However, many relational-data-based applications typically require parsing the relational data iteratively and need to operate on these data through many iterations. MapReduce lacks built-in support for the iterative process. This paper presents iMapReduce, a framework that supports iterative processing. iMapReduce allows users to specify the iterative operations with map and reduce functions, while supporting the iterative processing automatically without the need of users' involvement. More importantly, iMapReduce significantly improves the performance of iterative algorithms by (1) reducing the overhead of creating a new task in every iteration, (2) eliminating the shuffling of the static data in the shuffle stage of MapReduce, and (3) allowing asynchronous execution of each iteration, i.e., an iteration can start before all tasks of a previous iteration have finished. We implement iMapReduce based on Apache Hadoop, and show that iMapReduce can achieve a factor of 1.2 to 5 speedup over those implemented on MapReduce for well-known iterative algorithms.
Parallel breadth-first search on distributed memory systems
, 2011
Cited by 34 (9 self)
Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex-based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.
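As an illustration of the level-synchronous strategy named in the abstract above, here is a minimal single-process sketch in Python. It is not the paper's distributed implementation: the function name `level_synchronous_bfs` and the dictionary-based adjacency format are assumptions made for this example, and no partitioning or communication is modeled. It only shows the core idea that each level's frontier is fully expanded before the next level begins.

```python
def level_synchronous_bfs(adjacency, source):
    """Return a dict mapping each reachable vertex to its BFS level.

    adjacency: dict of vertex -> iterable of neighbours (hypothetical format).
    """
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Expand the whole current frontier before moving to the next level.
        for u in frontier:
            for v in adjacency.get(u, ()):
                if v not in level:
                    level[v] = depth
                    next_frontier.append(v)
        # In a parallel setting, this swap is where a barrier would sit.
        frontier = next_frontier
    return level


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
levels = level_synchronous_bfs(graph, 0)
```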
A Distributed Graph Engine for Web Scale RDF Data
Cited by 32 (1 self)
Much work has been devoted to supporting RDF data. But state-of-the-art systems and methods still cannot handle web scale RDF data effectively. Furthermore, many useful and general purpose graph-based operations (e.g., random walk, reachability, community discovery) on RDF data are not supported, as most existing systems store and index data in particular ways (e.g., as relational tables or as a bitmap matrix) to maximize one particular operation on RDF data: SPARQL query processing. In this paper, we introduce Trinity.RDF, a distributed, memory-based graph engine for web scale RDF data. Instead of managing the RDF data in triple stores or as bitmap matrices, we store RDF data in its native graph form. It achieves much better (sometimes orders of magnitude better) performance for SPARQL queries than the state-of-the-art approaches. Furthermore, since the data is stored in its native graph form, the system can support other operations (e.g., random walks, reachability) on RDF graphs as well. We conduct comprehensive experimental studies on real life, web scale RDF data to demonstrate the effectiveness of our approach.
C.: PrIter: a distributed framework for prioritized iterative computations
 In: Proceedings of the 2nd ACM Symposium on Cloud Computing (SOCC ’11
, 2011
Cited by 32 (9 self)
Iterative computations are pervasive among data analysis applications in the cloud, including Web search, online social network analysis, recommendation systems, and so on. These cloud applications typically involve data sets of massive scale. Fast convergence of the iterative computation on the massive data set is essential for these applications. In this paper, we explore the opportunity for accelerating iterative computations and propose a distributed computing framework, PrIter, which enables fast iterative computation by providing the support of prioritized iteration. Instead of performing computations on all data records without discrimination, PrIter prioritizes the computations that help convergence the most, so that the convergence speed of the iterative process is significantly improved. We evaluate PrIter on a local cluster of machines as well as on Amazon EC2 Cloud. The results show that PrIter achieves up to 50x speedup over Hadoop for a series of iterative algorithms.
Efficient parallel graph exploration for multi-core CPU and GPU
 In IEEE PACT
, 2011
Cited by 31 (1 self)
Graphs are a fundamental data representation that have been used extensively in various domains. In graph-based applications, a systematic exploration of the graph such as a breadth-first search (BFS) often serves as a key component in the processing of their massive data sets. In this paper, we present a new method for implementing the parallel BFS algorithm on multi-core CPUs which exploits a fundamental property of randomly shaped real-world graph instances. By utilizing memory bandwidth more efficiently, our method shows improved performance over the current state-of-the-art implementation and increases its advantage as the size of the graph increases. We then propose a hybrid method which, for each level of the BFS algorithm, dynamically chooses the best implementation from: a sequential execution, two different methods of multi-core execution, and a GPU execution. Such a hybrid approach provides the best performance for each graph size while avoiding poor worst-case performance on high-diameter graphs. Finally, we study the effects of the underlying architecture on BFS performance by comparing multiple CPU and GPU systems; a high-end GPU system performed as well as a quad-socket high-end CPU system.
X-Stream: Edge-centric Graph Processing using Streaming Partitions
Cited by 31 (2 self)
X-Stream is a system for processing both in-memory and out-of-core graphs on a single shared-memory machine. While retaining the scatter-gather programming model with state stored in the vertices, X-Stream is novel in (i) using an edge-centric rather than a vertex-centric implementation of this model, and (ii) streaming completely unordered edge lists rather than performing random access. This design is motivated by the fact that sequential bandwidth for all storage media (main memory, SSD, and magnetic disk) is substantially larger than random access bandwidth. We demonstrate that a large number of graph algorithms can be expressed using the edge-centric scatter-gather model. The resulting implementations scale well in terms of number of cores, in terms of number of I/O devices, and across different storage media. X-Stream competes favorably with existing systems for graph processing. Besides sequential access, we identify as one of the main contributors to better performance the fact that X-Stream does not need to sort edge lists during preprocessing.
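To make the edge-centric scatter-gather idea in the abstract above concrete, the following Python sketch computes connected components by repeatedly streaming over an unordered edge list while keeping all state in the vertices. It is an illustrative toy, not the system's actual code: the function name `edge_centric_components`, the in-memory update list, and the choice of label propagation as the algorithm are assumptions for this example, and streaming partitions and out-of-core storage are not modeled.

```python
def edge_centric_components(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    edges: unordered list of (u, v) pairs, treated as undirected.
    """
    label = list(range(num_vertices))  # per-vertex state
    changed = True
    while changed:
        changed = False
        # Scatter: one sequential pass over the edges, in whatever order
        # they happen to be stored, producing (target, value) updates.
        updates = []
        for u, v in edges:
            if label[u] != label[v]:
                updates.append((v, label[u]))
                updates.append((u, label[v]))
        # Gather: one sequential pass over the updates, folding each
        # into the state of its target vertex.
        for target, candidate in updates:
            if candidate < label[target]:
                label[target] = candidate
                changed = True
    return label


components = edge_centric_components(5, [(0, 1), (1, 2), (3, 4)])
```

Note that the loop iterates over edges rather than over vertices and their adjacency lists, so the edge list never needs to be sorted or indexed by source vertex.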
Efficient subgraph matching on billion node graphs
 In PVLDB
, 2012
Cited by 30 (5 self)
The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a novel algorithm that supports efficient subgraph matching for graphs deployed on a distributed memory store. Instead of relying on super-linear indices, we use efficient graph exploration and massive parallel computing for query processing. Our experimental results demonstrate the feasibility of performing subgraph matching on web-scale graph data.
Kineograph: taking the pulse of a fast-changing and connected world
 In Proceedings of the 7th ACM european conference on Computer Systems, EuroSys ’12
, 2012
Cited by 30 (3 self)
Kineograph is a distributed system that takes a stream of incoming data to construct a continuously changing graph, which captures the relationships that exist in the data feed. As a computing platform, Kineograph further supports graph-mining algorithms to extract timely insights from the fast-changing graph structure. To accommodate graph-mining algorithms that assume a static underlying graph, Kineograph creates a series of consistent snapshots, using a novel and efficient epoch commit protocol. To keep up with continuous updates on the graph, Kineograph includes an incremental graph-computation engine. We have developed three applications on top of Kineograph to analyze Twitter data: user ranking, approximate shortest paths, and controversial topic detection. For these applications, Kineograph takes a live Twitter data feed and maintains a graph of edges between all users and hashtags. Our evaluation shows that with 40 machines processing 100K tweets per second, Kineograph is able to continuously compute global properties, such as user ranks, with less than 2.5-minute timeliness guarantees. This rate of traffic is more than 10 times the reported peak rate of Twitter as of October 2011.
Cloud computing and the DNA data race
 Nat. Biotechnol
, 2010
Cited by 28 (4 self)
In the race between DNA sequencing throughput and computer speed, sequencing is winning by a mile. Sequencing throughput has recently been improving at a rate of about 5-fold per year [1], while computer performance generally follows “Moore's Law,” doubling only every 18 or 24 months [2]. As this gap widens, the question of how to design higher-throughput analysis pipelines becomes critical. If analysis throughput fails to turn the corner, research projects will continually stall until analyses catch up. How do we close the gap? One option is to invent algorithms that make better use of a fixed amount of computing power. Unfortunately, algorithmic breakthroughs of this kind, like scientific breakthroughs, are difficult to plan or foresee. A more practical option is to concentrate on developing methods that make better use of multiple computers and processors. When many computer processors work together in parallel, a software program can often finish in significantly less time. While parallel computing has existed for decades in various forms [3–5], a recent manifestation called “cloud computing” holds particular promise. Cloud computing is a model whereby users access compute resources from a vendor over the Internet [1], such as from the
HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
, 2011
Cited by 28 (7 self)
The neighbourhood function N_G(t) of a graph G gives, for each t ∈ ℕ, the number of pairs of nodes ⟨x, y⟩ such that y is reachable from x in less than t hops. The neighbourhood function provides a wealth of information about the graph [PGF02] (e.g., it easily allows one to compute its diameter), but it is very expensive to compute it exactly. Recently, the ANF algorithm [PGF02] (approximate neighbourhood function) has been proposed with the purpose of approximating N_G(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters [FFGM07] and combines them efficiently through broadword programming [Knu07]; our implementation uses task decomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of distances between reachable nodes (that can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistics that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable.