Results 1-10 of 14
Residual splash for optimally parallelizing belief propagation
In Artificial Intelligence and Statistics (AISTATS), 2009
Cited by 68 (8 self)
As computer architectures move towards multicore, we must build a theoretical understanding of parallelism in machine learning. In this paper we focus on parallel inference in graphical models. We demonstrate that the natural, fully synchronous parallelization of belief propagation is highly inefficient. By bounding the achievable parallel performance in chain graphical models, we develop a theoretical understanding of the parallel limitations of belief propagation. We then provide a new parallel belief propagation algorithm which achieves optimal performance. Using two challenging real-world tasks, we empirically evaluate the performance of our algorithm on large cyclic graphical models, where we achieve near-linear parallel scaling and outperform alternative algorithms.
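To make the inefficiency claim concrete, the following is a minimal sketch (not the paper's Residual Splash algorithm) that compares the number of message updates needed by a sequential forward-backward sweep and by fully synchronous belief propagation on a chain. The function names, the shared symmetric pairwise potential `psi`, and the choice to run exactly n - 1 synchronous rounds are assumptions made for illustration.

```python
def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def send(phi_src, psi, incoming):
    """Message leaving a node: sum over its states of phi * incoming * psi.
    Assumes psi is symmetric, so the same table serves both directions."""
    k = len(phi_src)
    return normalize([
        sum(phi_src[a] * incoming[a] * psi[a][b] for a in range(k))
        for b in range(k)
    ])

def beliefs(phi, fwd, bwd):
    """Node marginals from unary potentials and the two incoming messages."""
    return [normalize([p * f * b for p, f, b in zip(phi[i], fwd[i], bwd[i])])
            for i in range(len(phi))]

def forward_backward(phi, psi):
    """Sequential schedule: one left-to-right and one right-to-left sweep."""
    n, k = len(phi), len(phi[0])
    fwd = [[1.0] * k for _ in range(n)]  # fwd[i]: message arriving at i from i-1
    bwd = [[1.0] * k for _ in range(n)]  # bwd[i]: message arriving at i from i+1
    updates = 0
    for i in range(1, n):
        fwd[i] = send(phi[i - 1], psi, fwd[i - 1])
        updates += 1
    for i in range(n - 2, -1, -1):
        bwd[i] = send(phi[i + 1], psi, bwd[i + 1])
        updates += 1
    return beliefs(phi, fwd, bwd), updates

def synchronous(phi, psi):
    """Synchronous schedule: every round recomputes all messages from the
    previous round's values; n - 1 rounds are enough on a chain of n nodes."""
    n, k = len(phi), len(phi[0])
    fwd = [[1.0] * k for _ in range(n)]
    bwd = [[1.0] * k for _ in range(n)]
    updates = 0
    for _ in range(n - 1):
        new_fwd = [fwd[0]] + [send(phi[i - 1], psi, fwd[i - 1]) for i in range(1, n)]
        new_bwd = [send(phi[i + 1], psi, bwd[i + 1]) for i in range(n - 1)] + [bwd[n - 1]]
        fwd, bwd = new_fwd, new_bwd
        updates += 2 * (n - 1)
    return beliefs(phi, fwd, bwd), updates
```

Both schedules return the same marginals on a chain, but the sequential sweep performs 2(n - 1) message updates while the synchronous schedule performs 2(n - 1)^2. For a six-node chain that is 10 updates versus 50, so in this sketch the synchronous version needs on the order of n processors just to match the running time of one sequential sweep.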
An introduction and survey of estimation of distribution algorithms
Swarm and Evolutionary Computation, 2011
Inference of Beliefs on Billion-Scale Graphs
Cited by 5 (3 self)
How do we scale up the inference of graphical models to billions of nodes and edges? How do we, or can we even, implement an inference algorithm for graphs that do not fit in the main memory? Can we easily implement such an algorithm on top of an existing framework? How would we run it? And how much time will it save us? In this paper, we tackle this collection of problems through an efficient parallel algorithm for Belief Propagation (BP) that we developed for sparse billion-scale graphs using the Hadoop platform. Inference problems on graphical models arise in many scientific domains; BP is an efficient algorithm that has successfully solved many of those problems. We have discovered and we will demonstrate that this useful algorithm can be implemented on top of an existing framework: the crucial observation in the discovery is that the message update process in BP is essentially a special case of GIM-V (Generalized Iterative Matrix-Vector multiplication) [10], a primitive for large-scale graph mining, on a line graph induced from the original graph. We show how we formulate the BP algorithm as a variant of GIM-V, and present an efficient algorithm. We experiment with our parallelized algorithm on the largest publicly available Web Graphs from Yahoo!, with about 6.7 billion edges, on M45, one of the top 50 fastest supercomputers in the world, and compare the running time with that of a single-machine, disk-based BP algorithm.
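The idea of treating a BP message update as a generalized matrix-vector product on a line graph can be sketched in a few lines. This is an in-memory illustration only, not the paper's Hadoop implementation: the helper names (`line_graph`, `gimv_step`, `bp_ops`, `run_bp`), the three operations `combine2`, `combine_all`, and `assign`, and the single shared symmetric potential `psi` are all assumptions introduced here.

```python
def line_graph(edges):
    """Directed line graph: each node (i, j) stands for the message i -> j,
    and it depends on every message (k, i) with k != j."""
    directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
    return {(i, j): [(k, l) for (k, l) in directed if l == i and k != j]
            for (i, j) in directed}

def gimv_step(deps, vec, combine2, combine_all, assign):
    """One generalized matrix-vector product: combine each dependency,
    aggregate the partial results, then write the new vector entry."""
    new = {}
    for dst, srcs in deps.items():
        partial = [combine2(dst, src, vec[src]) for src in srcs]
        new[dst] = assign(vec[dst], combine_all(dst, partial))
    return new

def bp_ops(phi, psi):
    """Sum-product BP expressed through the three generic operations."""
    k = len(psi)

    def combine2(dst, src, msg):
        return msg  # pass the incoming message through unchanged

    def combine_all(dst, partial):
        i, _ = dst
        prod_i = list(phi[i])
        for msg in partial:
            prod_i = [p * m for p, m in zip(prod_i, msg)]
        out = [sum(prod_i[a] * psi[a][b] for a in range(k)) for b in range(k)]
        total = sum(out)
        return [o / total for o in out]

    def assign(old, new):
        return new  # overwrite the previous message

    return combine2, combine_all, assign

def run_bp(edges, phi, psi, rounds):
    """Iterate the generalized product, then form node beliefs."""
    deps = line_graph(edges)
    k = len(psi)
    vec = {e: [1.0 / k] * k for e in deps}
    ops = bp_ops(phi, psi)
    for _ in range(rounds):
        vec = gimv_step(deps, vec, *ops)
    result = {}
    for i in phi:
        b = list(phi[i])
        for (_, dst), msg in vec.items():
            if dst == i:
                b = [x * m for x, m in zip(b, msg)]
        s = sum(b)
        result[i] = [x / s for x in b]
    return result
```

Here the vector being multiplied holds one entry per directed edge (one message), and each round touches every entry once, which is the access pattern that maps naturally onto a batch framework. On a tree the beliefs become exact once the number of rounds reaches the tree's diameter; on cyclic graphs the same loop gives the usual loopy approximation.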
Parallel Splash Belief Propagation
Cited by 2 (0 self)
As computer architectures transition towards exponentially increasing parallelism, we are forced to adopt parallelism at a fundamental level in the design of machine learning algorithms. In this paper we focus on parallel graphical model inference. We demonstrate that the natural, synchronous parallelization of belief propagation is highly inefficient. By bounding the achievable parallel performance in chain graphical models, we develop a theoretical understanding of the parallel limitations of belief propagation. We then provide a new parallel belief propagation algorithm which achieves optimal performance. Using several challenging real-world tasks, we empirically evaluate the performance of our algorithm on large cyclic graphical models, where we achieve near-linear parallel scaling and outperform alternative algorithms.
Parallelization of Belief Propagation Method on Embedded Multicore Processors for Stereo Vision
Cited by 1 (0 self)
Markov random field models provide a robust formulation of low-level vision problems. Among these problems, stereo vision remains the most investigated field. Belief propagation provides accurate results in stereo vision problems; however, the algorithm remains slow for practical use. In this paper we examine and extract the parallelism in the belief propagation method for stereo vision on multicore processors. The results show that, with parallelization exploration on multicore processors, the belief propagation algorithm can achieve a 13.5 times speedup compared to the single-processor implementation. The experimental results also indicate that the parallelized belief propagation algorithm on multicore processors is able to provide a frame rate of 6 frames per second.
A Model-Based Analysis to Infer the Functional Content of a Gene List
Statistical Applications in Genetics and Molecular Biology
An important challenge in statistical genomics concerns integrating experimental data with exogenous information about gene function. A number of statistical methods are available to address this challenge, but most do not accommodate complexities in the functional record. To infer activity of a functional category (e.g., a gene ontology term), most methods use gene-level data on that category, but do not use other functional properties of the same genes. Not doing so creates undue errors in inference. Recent developments in model-based category analysis aim to overcome this difficulty, but in attempting to do so they are faced with serious computational problems. This paper investigates statistical properties and the structure of posterior computation in one such model for the analysis of functional category data. We examine the graphical structures underlying posterior computation in the original parameterization and in a new parameterization aimed at leveraging elements of the model. We characterize identifiability of the underlying activation states, describe a new prior distribution, and introduce approximations that aim to support numerical methods for posterior inference.
Data Parallelism for Belief Propagation in Factor Graphs
We investigate data parallelism for belief propagation in acyclic factor graphs on multicore/manycore processors. Belief propagation is a key problem in exploring factor graphs, a probabilistic graphical model that has found applications in many domains. In this paper, we identify basic operations called node-level primitives for updating the distribution tables in a factor graph. We develop algorithms for these primitives to explore data parallelism. We also propose a complete belief propagation algorithm to perform exact inference in such graphs. We implement the proposed algorithms on state-of-the-art multicore processors and show that the proposed algorithms exhibit good scalability using a representative set of factor graphs. On a 32-core Intel Nehalem-EX-based system, we achieve a 30× speedup for the primitives and 29× for the complete algorithm using factor graphs with large distribution tables.
A Model-Based Analysis to Infer the Functional Content of a Gene List
An important challenge in statistical genomics concerns integrating experimental data with exogenous information about gene function. A number of statistical methods are available to address this challenge, but most do not accommodate complexities in the functional record. To infer activity of a functional category (e.g., a gene ontology term), most methods use gene-level data on that category, but do not use other functional properties of the same genes. Not doing so creates undue errors in inference. Recent developments in model-based category analysis aim to overcome this difficulty, but in attempting to do so they are faced with serious computational problems. This paper investigates statistical properties and the structure of posterior computation in one such model for the analysis of functional category data. We examine the graphical structures underlying posterior computation in the original parameterization and in a new parameterization aimed at leveraging elements of the model. We characterize identifiability of the underlying activation states, describe a new prior distribution, and introduce approximations that aim to support numerical methods for posterior inference.
Thesis Proposal: Parallel Learning and Inference in Probabilistic Graphical Models
2010
Probabilistic graphical models are one of the most influential and widely used techniques in machine learning. Powered by exponential gains in processor technology, graphical models have been successfully applied to a wide range of increasingly large and complex real-world problems. However, recent developments in computer architecture, large-scale computing, and data storage have shifted the focus away from sequential performance scaling and towards parallelism and large-scale distributed systems. Therefore, in order for graphical models to continue to benefit from developments in computer architecture and remain a viable option in the clouds and beyond, we must discover and exploit the parallelism of learning and inference in probabilistic graphical models. In this thesis we explore how to design efficient parallel algorithms for probabilistic graphical models by framing learning and inference as iterative adaptive asynchronous computation. We first present our work on efficient parallel algorithms for loopy belief propagation and Gibbs sampling. We then describe GraphLab, a new parallel abstraction for designing and implementing iterative adaptive asynchronous computation. Finally, we conclude with