Applying parallel computation algorithms in the design of serial algorithms
 J. ACM
, 1983
Abstract. The goal of this paper is to point out that analyses of parallelism in computational problems have practical implications even when multiprocessor machines are not available. This is true because, in many cases, a good parallel algorithm for one problem may turn out to be useful for designing an efficient serial algorithm for another problem. A unified framework for cases like this is presented. Particular cases, which are discussed in this paper, provide motivation for examining parallelism in sorting, selection, minimum-spanning-tree, shortest route, max-flow, and matrix multiplication problems, as well as in scheduling and locational problems.
Programming Parallel Algorithms
, 1996
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
The NP-completeness column: an ongoing guide
 Journal of Algorithms
, 1985
This is the nineteenth edition of a (usually) quarterly column that covers new developments in the theory of NP-completeness. The presentation is modeled on that used by M. R. Garey and myself in our book “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., New York, 1979 (hereinafter referred to as “[G&J]”; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed, and, when appropriate, cross-references will be given to that book and the list of problems (NP-complete and harder) presented there. Readers who have results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.) or open problems they would like publicized, should
WaveCluster: A multiresolution clustering approach for very large spatial databases
, 1998
Many applications require the management of spatial data. Clustering large spatial databases is an important problem which tries to find the densely populated regions in the feature space to be used in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shape. It must be insensitive to outliers (noise) and to the order of the input data. We propose WaveCluster, a novel clustering approach based on wavelet transforms, which satisfies all the above requirements. Using the multiresolution property of wavelet transforms, we can effectively identify arbitrary shape clusters at different degrees of accuracy. We also demonstrate that WaveCluster is highly efficient in terms of time complexity. Experimental results on very large data sets are presented which show the efficiency and effectiveness of the proposed approach compared to other recent clustering methods.
A new approach to the minimum cut problem
 Journal of the ACM
, 1996
Abstract. This paper presents a new approach to finding minimum cuts in undirected graphs. The fundamental principle is simple: the edges in a graph’s minimum cut form an extremely small fraction of the graph’s edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability. The algorithm runs in O(n^2 log^3 n) time, a significant improvement over the previous Õ(mn) time bounds based on maximum flows. It is simple and intuitive and uses no complex data structures. Our algorithm can be parallelized to run in RNC with n^2 processors; this gives the first proof that the minimum cut problem can be solved in RNC. The algorithm does more than find a single minimum cut; it finds all of them. With minor modifications, our algorithm solves two other problems of interest. Our algorithm finds all cuts with value within a multiplicative factor of α of the minimum cut’s in expected Õ(n^{2α}) time, or in RNC with n^{2α} processors. The problem of finding a minimum multiway cut of a graph into r pieces is solved in expected Õ(n^{2(r-1)}) time, or in RNC with n^{2(r-1)} processors. The “trace” of the algorithm’s execution on these two problems forms a new compact data structure for representing all small cuts and all multiway cuts in a graph. This data structure can be efficiently transformed into the
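The contraction idea behind this abstract can be illustrated with a minimal Python sketch. It is an assumption-laden simplification, not the algorithm of the paper: it implements only plain repeated random contraction on an unweighted graph, without the recursive speedup that yields the stated time bound. The names `contract_once` and `min_cut`, the trial count, and the seed are all hypothetical. Processing the edges in a uniformly random order and skipping edges inside an already merged group is used here as an equivalent way to pick random edges to contract.

```python
import random


def contract_once(n, edges, rng):
    """Run one random contraction down to two super-vertices; return the cut size."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    order = edges[:]
    rng.shuffle(order)  # random edge order stands in for repeated random edge picks
    for u, v in order:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:  # skip self-loops of the contracted multigraph
            parent[ru] = rv
            components -= 1
    # Edges whose endpoints ended up in different super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))


def min_cut(n, edges, trials=200, seed=0):
    """Smallest cut seen over several independent contraction runs (illustrative)."""
    rng = random.Random(seed)
    return min(contract_once(n, edges, rng) for _ in range(trials))


if __name__ == "__main__":
    # Two triangles joined by one bridge edge (2, 3).
    two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(min_cut(6, two_triangles))
```

Each run can only overestimate the minimum cut, so taking the minimum over many runs is safe; a single run succeeds with small probability because few edges belong to a minimum cut, which is the principle the abstract states.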
The Power of Reconfiguration
, 1998
This paper concerns the computational aspects of the reconfigurable network model. The computational power of the model is investigated under several network topologies and assuming several variants of the model. In particular, it is shown that there are reconfigurable machines based on simple network topologies that are capable of solving large classes of problems in constant time. These classes depend on the kinds of switches assumed for the network nodes. Reconfigurable networks are also compared with various other models of parallel computation, like PRAMs and Branching Programs. Part of this work is to be presented at the 18th International Colloquium on Automata, Languages, and Programming (ICALP), July 1991, Madrid. † Department of Computer Science, The Hebrew University, Jerusalem 91904, Israel. Email: yosi@humus.huji.ac.il. Supported by Eshcol Fellowship. ‡ Department of Applied Mathematics and Computer Science, The Weizmann Institute, Rehovot 76100, Israel. Email: p...
PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations
 IEEE INTERNATIONAL CONFERENCE ON DATA MINING
, 2009
Abstract. In this paper, we describe PEGASUS, an open source Peta Graph Mining library which performs typical graph mining tasks such as computing the diameter of the graph, computing the radius of each node and finding the connected components. As the size of graphs reaches several Giga-, Tera- or Petabytes, the necessity for such a library grows too. To the best of our knowledge, PEGASUS is the first such library, implemented on top of the HADOOP platform, the open source version of MAPREDUCE. Many graph mining operations (PageRank, spectral clustering, diameter estimation, connected components, etc.) are essentially a repeated matrix-vector multiplication. In this paper we describe a very important primitive for PEGASUS, called GIM-V (Generalized Iterated Matrix-Vector multiplication). GIM-V is highly optimized, achieving (a) good scale-up on the number of available machines, (b) linear running time on the number of edges, and (c) more than 5 times faster performance over the non-optimized version of GIM-V. Our experiments ran on M45, one of the top 50 supercomputers in the world. We report our findings on several real graphs, including one of the largest publicly available Web Graphs, thanks to Yahoo!, with ≈ 6.7 billion edges. Keywords: PEGASUS; graph mining; hadoop
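The claim that graph mining operations reduce to repeated matrix-vector multiplication can be sketched in a few lines of single-machine Python. This is only an illustration under stated assumptions: it does not use HADOOP or the PEGASUS code. The split into three user-supplied functions follows the published description of the primitive as best recalled, and the names `gimv_step`, `iterate`, and `connected_components` are hypothetical. Connected components is obtained by replacing "multiply" with "take the neighbor's label" and "sum" with "min".

```python
def gimv_step(n, entries, v, combine2, combine_all, assign):
    """One generalized matrix-vector product over sparse entries (i, j, m_ij)."""
    partial = [[] for _ in range(n)]
    for i, j, m in entries:
        partial[i].append(combine2(m, v[j]))  # generalizes m_ij * v_j
    # combine_all generalizes the row sum; assign decides how v_i is updated.
    return [assign(v[i], combine_all(partial[i])) for i in range(n)]


def iterate(n, entries, v, combine2, combine_all, assign, max_iters=100):
    """Repeat the generalized product until the vector stops changing."""
    for _ in range(max_iters):
        new_v = gimv_step(n, entries, v, combine2, combine_all, assign)
        if new_v == v:
            break
        v = new_v
    return v


def connected_components(n, undirected_edges):
    """Label each node with the smallest node id in its component."""
    # Symmetric adjacency matrix stored as sparse (row, column, value) entries.
    entries = [(u, w, 1) for u, w in undirected_edges]
    entries += [(w, u, 1) for u, w in undirected_edges]
    return iterate(
        n,
        entries,
        list(range(n)),                                   # start: own id as label
        combine2=lambda m, vj: vj,                        # pass the neighbor's label
        combine_all=lambda xs: min(xs, default=float("inf")),
        assign=lambda vi, new: min(vi, new),              # keep the smaller label
    )


if __name__ == "__main__":
    print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
```

Under the same assumptions, PageRank would fit the pattern by using a scaled product for `combine2`, a sum plus a constant teleport term for `combine_all`, and plain replacement for `assign`.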
Efficient parallel graph algorithms for coarse grained multicomputers and BSP (Extended Abstract)
 in Proc. 24th International Colloquium on Automata, Languages and Programming (ICALP'97)
, 1997
In this paper, we present deterministic parallel algorithms for the coarse grained multicomputer (CGM) and bulk-synchronous parallel computer (BSP) models which solve the following well known graph problems: (1) list ranking, (2) Euler tour construction, (3) computing the connected components and spanning forest, (4) lowest common ancestor preprocessing, (5) tree contraction and expression tree evaluation, (6) computing an ear decomposition or open ear decomposition, (7) 2-edge connectivity and biconnectivity (testing and component computation), and (8) chordal graph recognition (finding a perfect elimination ordering). The algorithms for Problems 1-7 require O(log p) communication rounds and linear sequential work per round. Our results for Problems 1 and 2 hold for arbitrary ratios n/p, i.e. they are fully scalable, and for Problems 3-8 it is assumed that n/p ≥ p^ε, ε > 0, which is true for all commercially
Improved Algorithms For Bipartite Network Flow
, 1994
In this paper, we study network flow algorithms for bipartite networks. A network G = (V, E) is called bipartite if its vertex set V can be partitioned into two subsets V_1 and V_2 such that all edges have one endpoint in V_1 and the other in V_2. Let n = |V|, n_1 = |V_1|, n_2 = |V_2|, m = |E| and assume without loss of generality that n_1 ≤ n_2. We call a bipartite network unbalanced if n_1 ≪ n_2 and balanced otherwise. (This notion is necessarily imprecise.) We show that several maximum flow algorithms can be substantially sped up when applied to unbalanced networks. The basic idea in these improvements is a two-edge push rule that allows us to "charge" most computation to vertices in V_1, and hence develop algorithms whose running times depend on n_1 rather than n. For example, we show that the two-edge push version of Goldberg and Tarjan's FIFO preflow push algorithm runs in O(n_1 m + n_1^3) time and that the analogous version of Ahuja and Orlin's excess scaling algori...
Parallel Ear Decomposition Search (EDS) And ST-Numbering In Graphs
, 1986
The [LEC67] linear time serial algorithm for testing planarity of graphs uses the linear time serial algorithm of [ET76] for st-numbering. This st-numbering algorithm is based on depth-first search (DFS). A known conjecture states that DFS, which is a key technique in designing serial algorithms, is not amenable to polylog time parallelism using "around linearly" (or even polynomially) many processors. The first contribution of this paper is a general method for searching undirected graphs efficiently in parallel, called ear-decomposition search (EDS). The second contribution demonstrates the applicability of this search method. We present an efficient parallel algorithm for st-numbering in a biconnected graph. The algorithm runs in logarithmic time using a linear number of processors on a concurrent-read concurrent-write (CRCW) PRAM. An efficient parallel algorithm for the problem did not exist before. The problem was not even known to be in NC.