## Convergent message-passing algorithms for inference over general graphs with convex free energy (2008)

Venue: | In The 24th Conference on Uncertainty in Artificial Intelligence (UAI |

Citations: | 15 - 5 self |

### BibTeX

@INPROCEEDINGS{Hazan08convergentmessage-passing,

author = {Tamir Hazan and Amnon Shashua},

title = {Convergent message-passing algorithms for inference over general graphs with convex free energy},

booktitle = {In The 24th Conference on Uncertainty in Artificial Intelligence (UAI},

year = {2008}

}

### Abstract

Inference problems in graphical models can be represented as a constrained optimization of a free energy function. It is known that when the Bethe free energy is used, the fixed points of the belief propagation (BP) algorithm correspond to the local minima of the free energy. However, BP fails to converge in many cases of interest. Moreover, the Bethe free energy is non-convex for graphical models with cycles, which makes it very difficult to derive efficient algorithms for finding local minima of the free energy for general graphs. In this paper we introduce two efficient BP-like algorithms, one sequential and the other parallel, that are guaranteed to converge to the global minimum, for any graph, over the class of energies known as "convex free energies". In addition, we propose an efficient heuristic for setting the parameters of the convex free energy based on the structure of the graph.

### Citations

3267 | Convex analysis - Rockafellar - 1970 |

1167 | Factor graphs and the sum-product algorithm
- Kschischang, Frey, et al.
- 2001
Citation Context ..., ψm and the function ψα(xα) has arguments xα that are some subset of {x1, ..., xn} and Z is a normalization constant. The factorization structure above is conveniently represented by a factor graph (Kschischang et al., 2001), which is a bipartite graph with one variable node for each variable xi and one factor node for each function ψα. An edge connects a variable node i with factor node α if and only if xi ∈ xα, i.e., xi... |
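The bipartite structure described in this context can be sketched in a few lines of Python. This is only an illustration: the function name `build_factor_graph`, the factor names `psi_a`, `psi_b`, and the variable names are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def build_factor_graph(factor_scopes):
    """Return the variable-side adjacency of a factor graph.

    factor_scopes maps each factor name (alpha) to the tuple of variables
    it depends on (x_alpha). An edge (i, alpha) exists iff i is in x_alpha.
    """
    var_to_factors = defaultdict(set)
    for alpha, scope in factor_scopes.items():
        for i in scope:
            var_to_factors[i].add(alpha)
    return dict(var_to_factors)


# Hypothetical model: p(x) proportional to psi_a(x1, x2) * psi_b(x2, x3)
scopes = {"psi_a": ("x1", "x2"), "psi_b": ("x2", "x3")}
neighbors = build_factor_graph(scopes)  # N(i) for every variable i
```

Here `neighbors["x2"]` contains both factors, because `x2` appears in the scope of each; the factor-side adjacency is simply `scopes` itself.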

465 | Loopy belief propagation for approximate inference: An empirical study
- Murphy, Weiss, et al.
- 1999
Citation Context ...governing the class of convex free energies. The underlying motivation is borne by the empirical observation from BP practitioners that when BP does converge, the results are often surprisingly good (Murphy et al., 1999). Since our scheme would always converge and the free energy approximation is close to Bethe’s, we would have in some sense a "convergent BP" for general graphs. 2 Terminology and Problem Setup We co... |

413 | Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms
- Yedidia, Freeman, et al.
- 2005
Citation Context ...ppen, 2005)) and the algorithm often fails to converge. It is known that the fixed-points of the BP algorithm correspond to local minima of a constrained energy function called the Bethe free energy (Yedidia et al., 2005). The free energy arises from the expansion of the KL-divergence between the input distribution and its product form. The Bethe function replaces the entropy term in the free energy by an approximati... |

154 | A new class of upper bounds on the log partition function
- Wainwright, Jaakkola, et al.
- 2005
Citation Context ...ver, a convergent message passing algorithm for the general class of convex free energies is still lacking. The existing algorithms either employ damping heuristics to ensure convergence in practice (Wainwright et al., 2005) or focus on a sub-class of free energies where the entropy term is a positive combination of joint entropies (Heskes, 2006). In this paper, we derive convergent message-passing algorithms, one seque... |

153 | Convex Analysis and Optimization. Athena Scientific
- Bertsekas, Nedić, et al.
- 2003
Citation Context ...es at a costly run-time tradeoff. Fig. 5 compares the running time of our (sequential) MP algorithm with a general convex solver performing conditional gradient descent on the primal energy function (Bertsekas et al., 2003) which uses linear programming to find feasible search directions, and to the CCCP algorithm. We ran all three algorithms on n × n grids where n = 2, 3, ..., 10. The stopping criteria for all algorit... |

108 | CCCP algorithms to minimize the Bethe and Kikuchi free energies: Convergent alternatives to belief propagation
- Yuille
- 2002
Citation Context ...epresenting validity of marginals). When the factor graph has cycles the Bethe energy is nonconvex and although it is possible to derive convergent algorithms to a local minima of the Bethe function (Yuille, 2002) the computational cost is large and thus has not gained popularity. To overcome the difficulty with the non-convexity of the Bethe approximation, several authors have introduced a class of approxima... |

62 | An algorithm for restricted least squares regression
- Dykstra
- 1983
Citation Context ... 1999) of the vector µi onto the convex set Ci. In that case, following some algebraic manipulations (such as eliminating µi among other manipulations) the scheme reduces to the well known Dykstra (Dykstra, 1983) (also goes under different names such as Hildreth, Bregman, Csiszar, Han, Tseng) successive projection algorithm which has its origins in the work of von Neumann (von Neumann, 1950). We introduce ne... |
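For readers unfamiliar with the successive projection scheme named in this context, the following is a minimal sketch of Dykstra's algorithm in its plain Euclidean form. It is not the paper's message-passing algorithm, which uses Bregman projections and general functions hi; the two convex sets, the helper names, and the iteration count are assumptions chosen only for illustration.

```python
def dykstra(x0, projections, n_iters=100):
    """Approximate the Euclidean projection of x0 onto an intersection
    of convex sets, given one projection operator per set."""
    x = list(x0)
    # One correction vector per set; this is what distinguishes Dykstra's
    # method from plain alternating projections.
    corrections = [[0.0] * len(x) for _ in projections]
    for _ in range(n_iters):
        for k, project in enumerate(projections):
            shifted = [xi + ci for xi, ci in zip(x, corrections[k])]
            x = project(shifted)
            corrections[k] = [si - xi for si, xi in zip(shifted, x)]
    return x


def project_halfspace(v):
    """Project onto the (assumed) set {v : v[0] + v[1] <= 1}."""
    excess = v[0] + v[1] - 1.0
    if excess <= 0.0:
        return list(v)
    return [v[0] - excess / 2.0, v[1] - excess / 2.0]


def project_box(v):
    """Project onto the (assumed) unit box [0, 1]^2."""
    return [min(1.0, max(0.0, vi)) for vi in v]


result = dykstra([2.0, 0.0], [project_halfspace, project_box])
```

In this toy example the intersection is the triangle with vertices (0, 0), (1, 0), (0, 1), and the iterate settles at the vertex (1, 0), the closest point of the triangle to the starting point (2, 0).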

57 | On the uniqueness of loopy belief propagation fixed points
- Heskes
- 2004
Citation Context ...: $\sum_{\alpha} \bar{c}_\alpha H(b_\alpha) + \sum_{i} \bar{c}_i H(b_i)$ (1), where $\bar{c}_i = 1 - \sum_{\alpha \in N(i)} \bar{c}_\alpha$. Thus when the coefficients $\bar{c}_\alpha = 1$ for all factor nodes we obtain the Bethe approximation. A convex free energy is based on a result of (Heskes, 2004) who derived sufficient conditions for an entropy approximation to be convex over the set of constraints. In the setting we have described, those conditions have the following form (Weiss et al., 200... |
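The relation between the factor and variable coefficients in this context, c̄i = 1 − ∑α∈N(i) c̄α, can be checked numerically. The sketch below is illustrative only: the function name and the four-variable cycle are hypothetical, and it covers just this coefficient relation, not the convexity conditions that the context goes on to cite.

```python
def variable_counting_numbers(factor_scopes, c_factor):
    """Compute c_i = 1 - sum of c_alpha over the factors alpha containing i."""
    c_var = {}
    for alpha, scope in factor_scopes.items():
        for i in scope:
            c_var[i] = c_var.get(i, 1.0) - c_factor[alpha]
    return c_var


# Hypothetical pairwise model on a cycle of four variables
scopes = {
    "a": ("x1", "x2"),
    "b": ("x2", "x3"),
    "c": ("x3", "x4"),
    "d": ("x4", "x1"),
}

# All factor coefficients equal to 1: the Bethe case, c_i = 1 - degree(i)
bethe = variable_counting_numbers(scopes, {alpha: 1.0 for alpha in scopes})

# Smaller factor coefficients change the variable coefficients accordingly
halved = variable_counting_numbers(scopes, {alpha: 0.5 for alpha in scopes})
```

On this cycle every variable touches two factors, so the Bethe setting gives c̄i = −1 for each variable, while factor coefficients of 0.5 give c̄i = 0.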

26 | Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies
- Heskes
- 2006
Citation Context ...her employ damping heuristics to ensure convergence in practice (Wainwright et al., 2005) or focus on a sub-class of free energies where the entropy term is a positive combination of joint entropies (Heskes, 2006). In this paper, we derive convergent message-passing algorithms, one sequential and the other parallel, for the general class of convex free energies. The derivation applies to general factor graphs... |

21 | Dual coordinate ascent methods for non-strictly convex minimization - Tseng - 1993 |

20 | Sufficient conditions for convergence of Loopy Belief Propagation
- Mooij, Kappen
- 2005
Citation Context ...pularity, is that it often gives surprisingly good approximate results for graphical models with cycles. However, in this context there are no convergence guarantees (except under some special cases (Mooij & Kappen, 2005)) and the algorithm often fails to converge. It is known that the fixed-points of the BP algorithm correspond to local minima of a constrained energy function called the Bethe free energy (Yedidia et... |

14 | Dykstra’s algorithm as the nonlinear extension of Bregman’s optimization method
- Bregman, Censor, et al.
- 1999
Citation Context ...e policy. For those familiar with successive projection schemes, in the particular case when hi(b) = δCi(b) (the indicator function of convex set Ci), the update step for b is a "Bregman" projection (Bregman et al., 1999) of the vector µi onto the convex set Ci. In that case, following some algebraic manipulations (such as eliminating µi among other manipulations) the scheme reduces to the well known Dykstra (Dykst... |

12 | Convergent propagation algorithms via oriented trees
- Globerson, Jaakkola
- 2007
Citation Context ...f free energies defined on spanning trees of the factor graph. It is notable that for this specific member of convex free energies a convergent message-passing algorithm has been recently introduced (Globerson & Jaakkola, 2007b). The algorithm is sequential (unlike BP which has both sequential and parallel forms) and applies to graphs with pairwise cliques only. However, a convergent message passing algorithm for the gen... |

9 | Approximate inference using conditional entropy decompositions
- Globerson, Jaakkola
- 2007
Citation Context ...f free energies defined on spanning trees of the factor graph. It is notable that for this specific member of convex free energies a convergent message-passing algorithm has been recently introduced (Globerson & Jaakkola, 2007b). The algorithm is sequential (unlike BP which has both sequential and parallel forms) and applies to graphs with pairwise cliques only. However, a convergent message passing algorithm for the gen... |

5 | Functional Operators. Vol. II. The Geometry of Orthogonal Spaces, Volume 22 - von Neumann - 1950 |

1 | Parallel Block Updates for $\min_b f(b) + \sum_i h_i(b)$. Recall that f(b) is a strictly convex real-valued function and the functions hi are convex, proper and continuous. We quote below two basic theorems from convex duality (cf. (Bertsekas et al., 2003)) which w - Sequential |