Results 1–10 of 24
Diverse M-Best Solutions in Markov Random Fields
Abstract

Cited by 29 (2 self)
Abstract. Much effort has been directed at algorithms for obtaining the highest probability (MAP) configuration in probabilistic (random field) models. In many situations, one could benefit from additional high-probability solutions. Current methods for computing the M most probable configurations produce solutions that tend to be very similar to the MAP solution and each other. This is often an undesirable property. In this paper we propose an algorithm for the Diverse M-Best problem, which involves finding a diverse set of highly probable solutions under a discrete probabilistic model. Given a dissimilarity function measuring closeness of two solutions, our formulation involves maximizing a linear combination of the probability and dissimilarity to previous solutions. Our formulation generalizes the M-Best MAP problem and we show that for certain families of dissimilarity functions we can guarantee that these solutions can be found as easily as the MAP solution.
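A rough sketch of the greedy formulation on a toy model, using a Hamming dissimilarity. The function name, weights, and the brute-force search are illustrative assumptions of this sketch; the paper solves each dissimilarity-augmented problem as efficiently as MAP inference rather than by enumeration.

```python
import itertools

def divmbest(score, n_vars, n_labels, M, lam):
    """Greedy Diverse M-Best (sketch): solution m maximizes
    score(y) + lam * (sum of Hamming dissimilarities to the m-1
    solutions found so far). Brute force, toy sizes only."""
    solutions = []
    for _ in range(M):
        best, best_val = None, float("-inf")
        for y in itertools.product(range(n_labels), repeat=n_vars):
            val = score(y) + lam * sum(
                sum(a != b for a, b in zip(y, s)) for s in solutions)
            if val > best_val:
                best, best_val = y, val
        solutions.append(best)
    return solutions

# Toy unary-only model over 3 binary variables: label 1 is preferred
# with weights 2.0, 1.5, 1.0, so the MAP solution is (1, 1, 1).
weights = [2.0, 1.5, 1.0]
score = lambda y: sum(w * yi for w, yi in zip(weights, y))

sols = divmbest(score, n_vars=3, n_labels=2, M=3, lam=1.5)
```

With lam = 0 this reduces to repeating the MAP solution; increasing lam trades probability for diversity, which is the linear combination the abstract describes.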
Discriminative reranking of diverse segmentations
In CVPR, 2013
Abstract

Cited by 14 (1 self)
This paper introduces a two-stage approach to semantic image segmentation. In the first stage a probabilistic model generates a set of diverse plausible segmentations. In the second stage, a discriminatively trained reranking model selects the best segmentation from this set. The reranking stage can use much more complex features than what could be tractably used in the probabilistic model, allowing a better exploration of the solution space than possible by simply producing the most probable solution from the probabilistic model. While our proposed approach already achieves state-of-the-art results (48.1%) on the challenging VOC 2012 dataset, our machine and human analyses suggest that even larger gains are possible with such an approach.
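A minimal sketch of the second stage. The candidates, feature map, and weights below are illustrative placeholders, not the paper's learned reranker; the point is only that the reranker can score global properties a per-pixel model cannot use tractably.

```python
def rerank(candidates, features, w):
    """Stage two (sketch): score each candidate segmentation with a
    rich feature vector and learned weights w, and return the best."""
    return max(candidates,
               key=lambda y: sum(wi * fi for wi, fi in zip(w, features(y))))

# Toy 1-D "segmentations" over 4 pixels (1 = foreground).
candidates = [(0, 0, 1, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
# Global features: total foreground area, number of label transitions.
features = lambda y: (sum(y), sum(a != b for a, b in zip(y, y[1:])))
w = (0.5, -1.0)  # illustrative weights: reward area, penalize boundaries
best = rerank(candidates, features, w)
```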
Expectation Truncation And the Benefits of Preselection in Training Generative Models
Journal of Machine Learning Research, 2010
Abstract

Cited by 11 (2 self)
We show how a preselection of hidden variables can be used to efficiently train generative models with binary hidden variables. The approach is based on Expectation Maximization (EM) and uses an efficiently computable approximation to the sufficient statistics of a given model. The computational cost to compute the sufficient statistics is strongly reduced by selecting, for each data point, the relevant hidden causes. The approximation is applicable to a wide range of generative models and provides an interpretation of the benefits of preselection in terms of a variational EM approximation. To empirically show that the method maximizes the data likelihood, it is applied to different types of generative models including: a version of non-negative matrix factorization (NMF), a model for non-linear component extraction (MCA), and a linear generative model similar to sparse coding. The derived algorithms are applied to both artificial and realistic data, and are compared to other models in the literature. We find that the training scheme can reduce computational costs by orders of magnitude and allows for a reliable extraction of hidden causes.
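A toy sketch of the truncated E-step. Every name, the toy log-joint, and the selection function are assumptions of this illustration; the paper derives the selection functions and the variational interpretation for concrete model families.

```python
import itertools
import math

def truncated_posterior(log_joint, x, n_hidden, n_sel, select_score):
    """Truncated E-step (sketch): rather than summing over all
    2**n_hidden binary hidden vectors, preselect the n_sel hidden
    units that a cheap selection function ranks as most relevant for
    data point x, and normalize only over vectors supported on them."""
    top = sorted(range(n_hidden), key=lambda h: -select_score(h, x))[:n_sel]
    states = []
    for bits in itertools.product((0, 1), repeat=n_sel):
        s = [0] * n_hidden
        for h, b in zip(top, bits):
            s[h] = b
        states.append(tuple(s))
    logs = [log_joint(s, x) for s in states]
    m = max(logs)  # log-sum-exp shift for numerical stability
    z = sum(math.exp(v - m) for v in logs)
    return {s: math.exp(v - m) / z for s, v in zip(states, logs)}

post = truncated_posterior(
    log_joint=lambda s, x: float(sum(s)),   # toy: more active units, higher score
    x=1, n_hidden=4, n_sel=2,
    select_score=lambda h, x: -abs(h - x))  # toy relevance: units near index x
```

The cost drops from 2**n_hidden to 2**n_sel posterior evaluations per data point, which is the source of the order-of-magnitude savings the abstract reports.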
Bucket and Mini-Bucket Schemes for M Best Solutions over Graphical Models
Abstract

Cited by 10 (3 self)
The paper focuses on finding the m best solutions of a combinatorial optimization problem defined over a graphical model (e.g., the m most probable explanations for a Bayesian network). We describe elim-m-opt, a new bucket elimination algorithm for solving the m-best task, provide an efficient implementation of its defining combination and marginalization operators, analyze its worst-case performance, and compare it with that of recent related algorithms. An extension to the mini-bucket framework, yielding a collection of bounds for each of the m-best solutions, is discussed and empirically evaluated. We also formulate the m-best task as a regular reasoning task over general graphical models defined axiomatically, which makes all other inference algorithms applicable.
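The core idea, keeping a top-m list per state during elimination, can be sketched on a chain. This brute-force beam is an illustrative assumption, not elim-m-opt itself, which handles general graphical models via its combination and marginalization operators.

```python
import heapq

def m_best_chain(unary, pairwise, m):
    """m-best max-sum on a chain (sketch): eliminate variables left to
    right, but keep a top-m list of (score, config) partial solutions
    per state instead of a single best score."""
    n, K = len(unary), len(unary[0])
    beam = [[(unary[0][k], (k,))] for k in range(K)]
    for i in range(1, n):
        beam = [heapq.nlargest(m, [
                    (s + pairwise[j][k] + unary[i][k], cfg + (k,))
                    for j in range(K) for s, cfg in beam[j]])
                for k in range(K)]
    return heapq.nlargest(m, [t for b in beam for t in b])

# Toy 2-variable, 2-label chain; pairwise rewards agreeing labels.
unary = [[1.0, 0.0], [0.0, 1.0]]
pairwise = [[0.5, 0.0], [0.0, 0.5]]
top2 = m_best_chain(unary, pairwise, 2)
```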
A Delayed Column Generation Strategy for Exact k-Bounded MAP Inference in Markov Logic Networks
Abstract

Cited by 10 (8 self)
The paper introduces k-bounded MAP inference, a parameterization of MAP inference in Markov logic networks. k-Bounded MAP states are MAP states with at most k active ground atoms of hidden (non-evidence) predicates. We present a novel delayed column generation algorithm and provide empirical evidence that the algorithm efficiently computes k-bounded MAP states for meaningful real-world graph matching problems. The underlying idea is that, instead of solving one large optimization problem, it is often more efficient to tackle several small ones.
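The restricted search space can be illustrated by brute force over small atom subsets. The names and scores below are toy assumptions; the paper reaches the same restricted optimum with delayed column generation instead of enumeration.

```python
import itertools

def k_bounded_map(atoms, score, k):
    """k-bounded MAP by brute force (sketch): the best truth assignment
    with at most k active hidden ground atoms."""
    best, best_val = frozenset(), score(frozenset())
    for r in range(1, k + 1):
        for active in itertools.combinations(atoms, r):
            v = score(frozenset(active))
            if v > best_val:
                best, best_val = frozenset(active), v
    return best, best_val

# Illustrative per-atom scores for three ground atoms; "c" hurts.
weights = {"a": 2.0, "b": 1.5, "c": -1.0}
score = lambda s: sum(weights[a] for a in s)
state, value = k_bounded_map(sorted(weights), score, k=2)
```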
Multiple choice learning: Learning to produce multiple structured outputs
In NIPS, 2012
Abstract

Cited by 9 (4 self)
We address the problem of generating multiple hypotheses for structured prediction tasks that involve interaction with users or successive components in a cascaded architecture. Given a set of multiple hypotheses, such components/users typically have the ability to retrieve the best (or approximately the best) solution in this set. The standard approach for handling such a scenario is to first learn a single-output model and then produce M-Best Maximum a Posteriori (MAP) hypotheses from this model. In contrast, we learn to produce multiple outputs by formulating this task as a multiple-output structured-output prediction problem with a loss function that effectively captures the setup of the problem. We present a max-margin formulation that minimizes an upper bound on this loss function. Experimental results on image segmentation and protein side-chain prediction show that our method outperforms conventional approaches used for this type of scenario and leads to substantial improvements in prediction accuracy.
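The loss that "captures the setup" can be sketched as an oracle set loss: the set is charged only for its best member. The function name and toy Hamming loss are assumptions of this sketch; the paper minimizes a max-margin upper bound on such a loss rather than the loss directly.

```python
def oracle_set_loss(y_true, hypotheses, loss):
    """Multiple-choice set loss (sketch): a set of M hypotheses is
    charged only for its best member, modeling a downstream user or
    component that picks the best output from the set."""
    return min(loss(y_true, y) for y in hypotheses)

hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
y_true = (1, 0, 1, 1)
hyps = [(0, 0, 0, 0), (1, 0, 0, 1), (1, 0, 1, 0)]
```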
An Efficient Message-Passing Algorithm for the M-Best MAP Problem
Abstract

Cited by 9 (3 self)
Much effort has been directed at algorithms for obtaining the highest probability configuration in a probabilistic random field model – known as the maximum a posteriori (MAP) inference problem. In many situations, one could benefit from having not just a single solution, but the top M most probable solutions – known as the M-Best MAP problem. In this paper, we propose an efficient message-passing based algorithm for solving the M-Best MAP problem. Specifically, our algorithm solves the recently proposed Linear Programming (LP) formulation of M-Best MAP [7], while being orders of magnitude faster than a generic LP solver. Our approach relies on studying a particular partial Lagrangian relaxation of the M-Best MAP LP which exposes a natural combinatorial structure of the problem that we exploit.
Computing the M most probable modes of a graphical model
In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2013
Abstract

Cited by 7 (1 self)
We introduce the M-Modes problem for graphical models: predicting the M label configurations of highest probability that are at the same time local maxima of the probability landscape. M-Modes have multiple possible applications: because they are intrinsically diverse, they provide a principled alternative to non-maximum suppression techniques for structured prediction, they can act as codebook vectors for quantizing the configuration space, or they can form component centers for mixture model approximation. We present two algorithms for solving the M-Modes problem. The first algorithm solves the problem in polynomial time when the underlying graphical model is a simple chain. The second algorithm solves the problem for junction chains. On synthetic and real datasets, we demonstrate how M-Modes can improve the performance of prediction. We also use the generated modes as a tool to understand the topography of the probability distribution of configurations, for example in relation to the training set size and the amount of noise in the data.
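The definition of a mode can be illustrated by exhaustive check: a configuration no single-variable flip can improve. This brute force and the toy Potts-style score are assumptions of the sketch; the paper's algorithms solve chains in polynomial time.

```python
import itertools

def m_modes(score, n_vars, n_labels, M):
    """M-Modes by exhaustive check (sketch): keep the M best-scoring
    configurations that are local maxima under single-variable flips."""
    modes = []
    for y in itertools.product(range(n_labels), repeat=n_vars):
        s = score(y)
        if all(score(y[:i] + (lab,) + y[i + 1:]) <= s
               for i in range(n_vars)
               for lab in range(n_labels) if lab != y[i]):
            modes.append((s, y))
    return sorted(modes, reverse=True)[:M]

# Toy chain score rewarding neighboring agreement: exactly two modes,
# the all-0 and all-1 configurations.
score = lambda y: float(sum(a == b for a, b in zip(y, y[1:])))
top = m_modes(score, n_vars=3, n_labels=2, M=2)
```

Note the intrinsic diversity: the two modes differ in every variable, unlike the near-duplicates an M-Best MAP list would return here.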
Tighter Linear Program Relaxations for High Order Graphical Models
Abstract

Cited by 6 (0 self)
Graphical models with High Order Potentials (HOPs) have received considerable interest in recent years. While there are a variety of approaches to inference in these models, nearly all of them amount to solving a linear program (LP) relaxation with unary consistency constraints between the HOP and the individual variables. In many cases, the resulting relaxations are loose, and in these cases the results of inference can be poor. It is thus desirable to look for more accurate ways of performing inference. In this work, we study the LP relaxations that result from enforcing additional consistency constraints between the HOP and the rest of the model. We address theoretical questions about the strength of the resulting relaxations compared to the relaxations that arise in standard approaches, and we develop practical and efficient message passing algorithms for optimizing the LPs. Empirically, we show that the LPs with additional consistency constraints lead to more accurate inference on some challenging problems that include a combination of low order and high order terms.
DivMCuts: Faster Training of Structural SVMs with Diverse M-Best Cutting-Planes
Abstract

Cited by 6 (0 self)
Training of Structural SVMs involves solving a large Quadratic Program (QP). One popular method for solving this QP is a cutting-plane approach, where the most violated constraint is iteratively added to a working set of constraints. Unfortunately, training models with a large number of parameters remains a time-consuming process. This paper shows that significant computational savings can be achieved by adding multiple diverse and highly violated constraints at every iteration of the cutting-plane algorithm. We show that generation of such diverse cutting planes involves extracting diverse M-Best solutions from the loss-augmented score of the training instances. To find these diverse M-Best solutions, we employ a recently proposed algorithm [4]. Our experiments on image segmentation and protein side-chain prediction show that the proposed approach can lead to significant computational savings, e.g., ∼28% reduction in training time.
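The constraint-generation step can be sketched in isolation: greedily pick M labelings that maximize the loss-augmented score plus a diversity bonus against labelings already in the batch. Everything here (names, the trivial feature map, brute-force search, Hamming loss) is an assumption of this toy; the paper delegates this step to the Diverse M-Best algorithm of [4] inside a full cutting-plane QP solver.

```python
import itertools

def diverse_violated_labelings(w, feat, y_true, n_labels, n_vars, M, lam):
    """Diverse cutting-plane generation (sketch): labeling m maximizes
    w . feat(y) + Hamming(y, y_true)  (the loss-augmented score)
    + lam * (Hamming distance to labelings already chosen),
    so each added constraint is both highly violated and diverse."""
    chosen = []
    for _ in range(M):
        best, best_val = None, float("-inf")
        for y in itertools.product(range(n_labels), repeat=n_vars):
            val = (sum(a * b for a, b in zip(w, feat(y)))
                   + sum(a != b for a, b in zip(y, y_true))
                   + lam * sum(sum(a != b for a, b in zip(y, z))
                               for z in chosen))
            if val > best_val:
                best, best_val = y, val
        chosen.append(best)
    return chosen

w = (1.0, 0.5)
feat = lambda y: y  # trivial feature map, for illustration only
y_true = (1, 1)
batch = diverse_violated_labelings(w, feat, y_true, 2, 2, M=2, lam=1.0)
```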