Results 1–10 of 33
Online Alternating Direction Method
 In ICML, 2012
Cited by 39 (9 self)
Abstract:
Online optimization has emerged as a powerful tool in large-scale optimization. In this paper, we introduce efficient online algorithms based on the alternating directions method (ADM). We introduce a new proof technique for ADM in the batch setting, which yields the O(1/T) convergence rate of ADM and forms the basis of the regret analysis in the online setting. We consider two scenarios in the online setting, based on whether or not the solution needs to lie in the feasible set. In both settings, we establish regret bounds for both the objective function and the constraint violation, for general and strongly convex functions. Preliminary results are presented to illustrate the performance of the proposed algorithms.
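The batch ADM the abstract refers to alternates a smooth subproblem solve, a proximal step, and a dual update. As a hedged illustration (a generic lasso instance solved with standard ADMM, not the paper's online algorithm), a minimal NumPy sketch:

```python
import numpy as np

def soft_threshold(v, k):
    # elementwise soft-thresholding: the proximal operator of k*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(A, b, lam=0.01, rho=1.0, iters=200):
    """ADMM for the lasso: min 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))  # cached for every x-update
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))          # smooth quadratic subproblem
        z = soft_threshold(x + u, lam / rho)   # l1 subproblem (prox step)
        u = u + x - z                          # (scaled) dual update
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [1.0, -2.0, 0.5]
x_hat = admm_lasso(A, A @ x_true)  # recovers the sparse signal up to a small lam-bias
```

The online variants analyzed in the paper replace the batch loss with the loss revealed at each round; the per-round update structure is the same three-step alternation.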
Hinge-loss Markov Random Fields: Convex Inference for Structured Prediction
 In Uncertainty in Artificial Intelligence, 2013
Cited by 26 (18 self)
Abstract:
Graphical models for structured domains are powerful tools, but the computational complexity of combinatorial prediction spaces can force restrictions on models, or require approximate inference in order to be tractable. Instead of working in a combinatorial space, we use hinge-loss Markov random fields (HL-MRFs), an expressive class of graphical models with log-concave density functions over continuous variables, which can represent confidences in discrete predictions. This paper demonstrates that HL-MRFs are general tools for fast and accurate structured prediction. We introduce the first inference algorithm that is both scalable and applicable to the full class of HL-MRFs, and show how to train HL-MRFs with several learning algorithms. Our experiments show that HL-MRFs match or surpass the predictive performance of state-of-the-art methods, including discrete models, in four application domains.
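Because HL-MRF densities are log-concave, MAP inference reduces to convex minimization of weighted hinge potentials over variables in [0,1]. A toy sketch with two invented potentials (illustrative only, not from the paper; the scalable algorithm the paper introduces is ADMM-based, whereas this uses plain projected subgradient descent):

```python
import numpy as np

# Toy hinge-loss MRF over y = (yA, yB) in [0,1]^2. The two potentials are
# invented for illustration: an "evidence" hinge pulling yA toward 0.9 and an
# implication-style hinge penalizing yA > yB.
def energy(y):
    return 2.0 * max(0.0, 0.9 - y[0]) + 1.0 * max(0.0, y[0] - y[1])

def subgrad(y):
    g = np.zeros(2)
    if 0.9 - y[0] > 0:
        g[0] -= 2.0                    # evidence hinge is active
    if y[0] - y[1] > 0:
        g += np.array([1.0, -1.0])     # implication hinge is active
    return g

y = np.array([0.0, 0.0])
for t in range(1, 2001):
    # projected subgradient step with diminishing step size
    y = np.clip(y - (0.5 / t) * subgrad(y), 0.0, 1.0)
# y settles in the zero-energy set {yA >= 0.9, yB >= yA}
```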
Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization
Cited by 17 (14 self)
Abstract:
Probabilistic graphical models are powerful tools for analyzing constrained, continuous domains. However, finding most-probable explanations (MPEs) in these models can be computationally expensive. In this paper, we improve the scalability of MPE inference in a class of graphical models with piecewise-linear and piecewise-quadratic dependencies and linear constraints over continuous domains. We derive algorithms based on a consensus-optimization framework and demonstrate their superior performance over the state of the art. We show empirically that, in a large-scale voter-preference modeling problem, our algorithms scale linearly in the number of dependencies and constraints.
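Consensus optimization splits a sum of local objectives into parallel proximal updates tied together by an averaging step. A minimal sketch on a simple quadratic instance (illustrative; not the paper's MPE objective):

```python
import numpy as np

def consensus_admm(a, rho=1.0, iters=100):
    """Minimize sum_i 0.5*(x - a_i)^2 via local copies x_i and a consensus z."""
    n = len(a)
    x, u, z = np.zeros(n), np.zeros(n), 0.0
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # local prox steps, parallelizable
        z = np.mean(x + u)                     # consensus (averaging) step
        u = u + x - z                          # per-copy dual updates
    return z

a = np.array([1.0, 2.0, 6.0])
z = consensus_admm(a)  # the minimizer of the sum is mean(a) = 3.0
```

Each local x-update touches only its own term of the objective, which is what lets the paper's algorithms distribute the dependencies and constraints across workers.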
Efficiently Searching for Frustrated Cycles in MAP Inference
Cited by 12 (0 self)
Abstract:
Dual decomposition provides a tractable framework for designing algorithms that find the most probable (MAP) configuration in graphical models. However, for many real-world inference problems, the typical decomposition has a large integrality gap due to frustrated cycles. One way to tighten the relaxation is to introduce additional constraints that explicitly enforce cycle consistency. Earlier work showed that cluster-pursuit algorithms, which iteratively introduce cycle and other higher-order consistency constraints, allow one to exactly solve many hard inference problems. However, these algorithms explicitly enumerate a candidate set of clusters, limiting them to triplets or other short cycles. We solve the search problem for cycle constraints, giving a nearly linear-time algorithm for finding the most frustrated cycle of arbitrary length. We show how to use this search algorithm together with the dual decomposition framework and cluster-pursuit. The new algorithm exactly solves MAP inference problems arising from relational classification and stereo vision.
Convergence rate analysis of MAP coordinate minimization algorithms
 In NIPS, 2012
Cited by 11 (3 self)
Abstract:
Finding maximum a posteriori (MAP) assignments in graphical models is an important task in many applications. Since the problem is generally hard, linear programming (LP) relaxations are often used. Solving these relaxations efficiently is thus an important practical problem. In recent years, several authors have proposed message-passing updates corresponding to coordinate descent in the dual LP. However, these are generally not guaranteed to converge to a global optimum. One approach to remedy this is to smooth the LP and perform coordinate descent on the smoothed dual. However, little is known about the convergence rate of this procedure. Here we perform a thorough rate analysis of such schemes and derive primal and dual convergence rates. We also provide a simple dual-to-primal mapping that yields feasible primal solutions with a guaranteed rate of convergence. Empirical evaluation supports our theoretical claims and shows that the method is highly competitive with state-of-the-art approaches that yield global optima.
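The smoothing step such schemes rely on replaces the max in the dual LP by a soft-max (log-sum-exp) with temperature τ, whose approximation error is bounded by τ·log n. A small numerical check of that bound (illustrative only, not the paper's algorithm):

```python
import numpy as np

def smoothed_max(theta, tau):
    # tau * log(sum(exp(theta / tau))), computed stably; tends to max(theta)
    t = theta / tau
    m = t.max()
    return tau * (m + np.log(np.exp(t - m).sum()))

theta = np.array([1.0, 3.0, 2.5])
gaps = [smoothed_max(theta, tau) - theta.max() for tau in (1.0, 0.1, 0.01)]
# each gap lies in [0, tau * log(len(theta))] and shrinks as tau -> 0
```

The rate analysis in the paper quantifies how fast coordinate descent on this smoothed dual converges, and how τ trades smoothing error against conditioning.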
Bethe-ADMM for Tree Decomposition based Parallel MAP Inference
Cited by 10 (3 self)
Abstract:
We consider the problem of maximum a posteriori (MAP) inference in discrete graphical models. We present a parallel MAP inference algorithm called Bethe-ADMM based on two ideas: tree decomposition of the graph and the alternating direction method of multipliers (ADMM). However, unlike standard ADMM, we use an inexact ADMM augmented with a Bethe-divergence-based proximal function, which makes each subproblem in ADMM easy to solve in parallel using the sum-product algorithm. We rigorously prove global convergence of Bethe-ADMM. The proposed algorithm is extensively evaluated on both synthetic and real datasets to illustrate its effectiveness. Further, the parallel Bethe-ADMM is shown to scale almost linearly with the number of cores.
Globally Convergent Dual MAP LP Relaxation Solvers using Fenchel-Young Margins
Cited by 10 (1 self)
Abstract:
While finding the exact solution to the MAP inference problem is intractable for many real-world tasks, MAP LP relaxations have been shown to be very effective in practice. However, the most efficient methods, which perform block coordinate descent, can get stuck in suboptimal points as they are not globally convergent. In this work we propose to augment these algorithms with an ε-descent approach and present a method to efficiently optimize for a descent direction in the subdifferential using a margin-based formulation of the Fenchel-Young duality theorem. Furthermore, the presented approach provides a methodology to construct a primal optimal solution from its dual optimal counterpart. We demonstrate the efficiency of the presented approach on spin-glass models and protein-interaction problems and show that our approach outperforms state-of-the-art solvers.
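The Fenchel-Young inequality states f(x) + f*(y) ≥ x·y, with equality exactly when y is a (sub)gradient of f at x; margins of the kind the paper uses measure the slack in this inequality. A scalar check with f(x) = x²/2, whose conjugate is f*(y) = y²/2 (illustrative; not the paper's solver):

```python
def fy_margin(x, y):
    # Fenchel-Young slack for f(x) = x**2 / 2 and its conjugate f*(y) = y**2 / 2;
    # algebraically equal to 0.5 * (x - y)**2, hence nonnegative, zero iff y == x.
    return 0.5 * x ** 2 + 0.5 * y ** 2 - x * y

# the inequality f(x) + f*(y) >= x*y holds on any grid of points
margins = [fy_margin(x / 2, y / 2) for x in range(-4, 5) for y in range(-4, 5)]
```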
Generalized sequential tree-reweighted message passing
 arXiv:1205.6352
Cited by 8 (3 self)
Abstract:
This paper addresses the problem of approximate MAP-MRF inference in general graphical models. Following [36], we consider a family of linear programming relaxations of the problem where each relaxation is specified by a set of nested pairs of factors for which the marginalization constraint needs to be enforced. We develop a generalization of the TRW-S algorithm [9] for this problem, in which we use a decomposition into junction chains that are monotonic w.r.t. some ordering on the nodes. This generalizes the monotonic chains in [9] in a natural way. We also show how to deal with nested factors in an efficient way. Experiments show an improvement over the min-sum diffusion, MPLP, and subgradient ascent algorithms on a number of computer vision and natural language processing problems.
Global MAP-optimality by shrinking the combinatorial search area with convex relaxation
, 2013
Cited by 5 (3 self)
Abstract:
We consider energy minimization for undirected graphical models, also known as the MAP-inference problem for Markov random fields. Although combinatorial methods, which return a provably optimal integral solution of the problem, have made significant progress in the past decade, they are still typically unable to cope with large-scale datasets. On the other hand, large-scale datasets are often defined on sparse graphs, and convex relaxation methods, such as linear programming relaxations, then provide good approximations to integral solutions. We propose a novel method that combines combinatorial and convex programming techniques to obtain a global solution of the initial combinatorial problem. Based on the information obtained from the solution of the convex relaxation, our method confines the application of the combinatorial solver to a small fraction of the initial graphical model, which makes it possible to optimally solve much larger problems. We demonstrate the efficacy of our approach on a computer vision energy minimization benchmark.
Hinge-loss Markov random fields and probabilistic soft logic
, 2015
Cited by 5 (3 self)
Abstract:
A fundamental challenge in developing high-impact machine learning technologies is balancing the ability to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this paper, we introduce two new formalisms for modeling structured data, distinguished from previous approaches by their ability to both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. We unite three approaches from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three lead to the same inference objective. We then derive HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define using a syntax based on first-order logic. We next introduce an algorithm for inferring most-probable variable assignments (MAP inference) that is much more scalable than general-purpose convex optimization software, because it uses message passing to take advantage of sparse dependency structures. We then show how to learn the parameters of HL-MRFs. The learned HL-MRFs are as accurate as analogous discrete models, but much more scalable. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
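PSL grounds weighted first-order rules into hinge-loss potentials via the Łukasiewicz relaxation: conjunction a ∧ b maps to max(0, a + b − 1), so a ground rule A(x) ∧ B(x) → C(x) contributes the hinge below as its distance to satisfaction. A minimal sketch (simplified; the predicate names are illustrative):

```python
# Lukasiewicz relaxation used by PSL: conjunction a & b maps to
# max(0, a + b - 1), so the distance to satisfaction of the ground rule
# A(x) & B(x) -> C(x) over soft truth values in [0, 1] is the hinge below.
# (Simplified sketch; the predicate names are illustrative.)
def rule_distance(y_a, y_b, y_c):
    return max(0.0, y_a + y_b - 1.0 - y_c)
```

MAP inference then minimizes the weighted sum of such hinges over all groundings, which is exactly the convex HL-MRF objective described above.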