Results 1–9 of 9
Computing the M most probable modes of a graphical model
In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2013.
Abstract

Cited by 7 (1 self)
We introduce the M-modes problem for graphical models: predicting the M label configurations of highest probability that are at the same time local maxima of the probability landscape. M-modes have multiple possible applications: because they are intrinsically diverse, they provide a principled alternative to non-maximum suppression techniques for structured prediction, they can act as codebook vectors for quantizing the configuration space, or they can form component centers for mixture model approximation. We present two algorithms for solving the M-modes problem. The first algorithm solves the problem in polynomial time when the underlying graphical model is a simple chain. The second algorithm solves the problem for junction chains. On synthetic and real datasets, we demonstrate how M-modes can improve prediction performance. We also use the generated modes as a tool to understand the topography of the probability distribution of configurations, for example in relation to the training set size and the amount of noise in the data.
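The mode notion above can be made concrete with a small brute-force sketch (an illustrative baseline, not the paper's polynomial-time chain algorithm): a labeling of a chain MRF counts as a mode if no single-variable change increases its score, and the M-modes are the M highest-scoring such labelings.

```python
import itertools

def chain_score(x, unary, pair):
    # Unnormalized log-probability of labeling x on a chain MRF.
    s = sum(unary[i][xi] for i, xi in enumerate(x))
    s += sum(pair[i][x[i]][x[i + 1]] for i in range(len(x) - 1))
    return s

def m_modes_bruteforce(unary, pair, num_labels, M):
    # A labeling is a mode if no single-variable flip improves its score;
    # return the M highest-scoring modes as (score, labeling) pairs.
    n = len(unary)
    modes = []
    for x in itertools.product(range(num_labels), repeat=n):
        s = chain_score(x, unary, pair)
        if all(chain_score(x[:i] + (l,) + x[i + 1:], unary, pair) <= s
               for i in range(n) for l in range(num_labels) if l != x[i]):
            modes.append((s, x))
    modes.sort(reverse=True)
    return modes[:M]
```

On a small binary chain with attractive pairwise terms, the two "all-equal" labelings come out as the top modes: an intrinsically diverse set of the kind the abstract describes.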
Efficiently enforcing diversity in multi-output structured prediction.
In AISTATS, 2014.
Abstract

Cited by 5 (2 self)
This paper proposes a novel method for efficiently generating multiple diverse predictions for structured prediction problems. Existing methods like SDPPs or DivMBest work by making a series of predictions, where each prediction is made after considering the predictions that came before it. Such approaches are inherently sequential and computationally expensive. In contrast, our method, Diverse Multiple Choice Learning, learns a set of models that make multiple independent, yet diverse, predictions at test time. We achieve this by including a diversity-encouraging term in the loss function used for training the models. This approach encourages diversity in the predictions while preserving computational efficiency at test time. Experimental results on a number of challenging problems show that our method learns models that not only predict more diverse results than competing methods, but also generalize better and produce results with high test accuracy.
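The diversity-encouraging loss idea can be sketched as a toy objective (hypothetical: Hamming task loss plus a pairwise-disagreement reward; the paper's actual formulation may differ):

```python
def ensemble_loss(predictions, target, lam=0.5):
    # Task loss of the best of the M predictions (oracle loss), minus a
    # reward lam * (total pairwise Hamming disagreement) for diversity.
    task = min(sum(p != t for p, t in zip(pred, target))
               for pred in predictions)
    diversity = sum(sum(a != b for a, b in zip(p, q))
                    for i, p in enumerate(predictions)
                    for q in predictions[i + 1:])
    return task - lam * diversity
```

Under such an objective, a pair of models where one is correct and the other disagrees with it scores better than two identical correct models, so training pushes the set apart while keeping at least one model accurate.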
Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets
Abstract

Cited by 3 (0 self)
To cope with the high level of ambiguity faced in domains such as Computer Vision or Natural Language Processing, robust prediction methods often search for a diverse set of high-quality candidate solutions or proposals. In structured prediction problems, this becomes a daunting task, as the solution space (image labelings, sentence parses, etc.) is exponentially large. We study greedy algorithms for finding a diverse subset of solutions in structured-output spaces by drawing new connections between submodular functions over combinatorial item sets and High-Order Potentials (HOPs) studied for graphical models. Specifically, we show via examples that when the marginal gains of submodular diversity functions allow structured representations, efficient (sublinear-time) approximate maximization becomes possible by reducing the greedy augmentation step to inference in a factor graph with appropriately constructed HOPs. We discuss benefits and trade-offs, and show that our constructions lead to significantly better proposals.
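The greedy scheme can be sketched over an explicit (small) candidate set; the paper's contribution is performing the argmax step over exponentially large structured sets via HOP inference, which this toy version replaces with a linear scan (the names and the coverage-style diversity function are illustrative):

```python
def greedy_diverse(candidates, score, coverage, k):
    # Greedily maximize quality + a coverage-style submodular diversity
    # term: at each step pick the item with the largest marginal gain.
    selected, covered = [], set()
    for _ in range(k):
        best, best_gain = None, float('-inf')
        for c in candidates:
            if c in selected:
                continue
            gain = score[c] + len(coverage[c] - covered)
            if gain > best_gain:
                best, best_gain = c, gain
        selected.append(best)
        covered |= coverage[best]
    return selected
```

For a monotone submodular objective, this greedy rule carries the classic (1 − 1/e) approximation guarantee, and the diminishing coverage gains naturally favor proposals unlike those already chosen.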
Learning with Maximum A-Posteriori Perturbation Models
Abstract

Cited by 3 (1 self)
Perturbation models are families of distributions induced from perturbations. They combine randomization of the parameters with maximization to draw unbiased samples. Unlike Gibbs distributions, a perturbation model defined on the basis of low-order statistics still gives rise to high-order dependencies. In this paper, we analyze, extend, and seek to estimate such dependencies from data. In particular, we shift the modeling focus from the parameters of the Gibbs distribution used as a base model to the space of perturbations. We estimate dependent perturbations over the parameters using a hard-EM approach, cast in the form of inverse convex programs. Each inverse program confines the randomization to the parameter polytope responsible for generating the observed answer. We illustrate the method on several computer vision problems.
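The randomize-then-maximize principle behind perturbation models rests on the standard Gumbel max-stability fact: perturbing every configuration's score with independent Gumbel noise and taking the argmax yields an exact sample from the Gibbs distribution. A minimal sketch (the paper's focus, low-order perturbations over parameters, is the harder setting this does not cover):

```python
import math, random

def gumbel():
    # Standard Gumbel(0, 1) sample via inverse transform.
    return -math.log(-math.log(random.random()))

def perturb_and_map(scores):
    # argmax_x (score[x] + g_x) with g_x i.i.d. Gumbel samples x with
    # probability proportional to exp(score[x]).
    noisy = {x: s + gumbel() for x, s in scores.items()}
    return max(noisy, key=noisy.get)
```

With scores {a: log 3, b: 0}, repeated calls return 'a' about 75% of the time, matching the Gibbs probabilities 3/4 and 1/4.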
M-Best-Diverse Labelings for Submodular Energies and Beyond
Abstract

Cited by 1 (1 self)
We consider the problem of finding the M best diverse solutions of energy minimization problems for graphical models. Contrary to the sequential method of Batra et al., which greedily finds one solution after another, we infer all M solutions jointly. It was shown recently that such jointly inferred labelings not only have smaller total energy but also qualitatively outperform the sequentially obtained ones. The only obstacle to using this new technique is the complexity of the corresponding inference problem, which is considerably slower to solve than with the method of Batra et al. In this work we show that the joint inference of the M best diverse solutions can be formulated as a submodular energy minimization if the original MAP-inference problem is submodular, hence fast inference techniques can be used. In addition to the theoretical results, we provide practical algorithms that outperform the current state of the art and can be used in both the submodular and the non-submodular case.
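For reference, the sequential scheme of Batra et al. that this paper contrasts with can be sketched by brute force over an explicit configuration list (`lam` and the Hamming-style similarity penalty are illustrative choices, not the exact diversity measure of either paper):

```python
def divmbest(configs, energy, M, lam=1.0):
    # Sequential diverse M-best: each new solution minimizes the energy
    # plus a penalty for agreeing with previously found solutions.
    solutions = []
    for _ in range(M):
        def augmented(x):
            similarity = sum(sum(a == b for a, b in zip(x, y))
                             for y in solutions)
            return energy(x) + lam * similarity
        solutions.append(min(configs, key=augmented))
    return solutions
```

The joint formulation advocated in the abstract optimizes all M labelings at once instead of running this greedy loop.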
The More the Merrier: Parameter Learning for Graphical Models with Multiple MAPs
Abstract

Cited by 1 (0 self)
Conditional random fields (CRFs) are a popular and effective approach to structured prediction. When the underlying structure does not have small treewidth, maximum likelihood estimation (MLE) is in general computationally hard. Discriminative methods such as the Perceptron or Max-Margin Markov Networks circumvent this problem by requiring only the MAP assignment, which is often more tractable, either exactly or approximately via linear programming (LP) relaxations. In this paper, we propose an approximate learning method for MLE of CRFs. We leverage LP relaxations to find multiple diverse MAP solutions and use them to approximate the intractable partition function. The proposed approach is easy to parallelize and yields competitive test accuracies on several structured prediction tasks.
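The partition-function approximation can be sketched as a log-sum-exp over the scores of the diverse MAP solutions; since it sums over only a subset of configurations, it lower-bounds the true log Z (a minimal illustration, not the paper's exact estimator):

```python
import math

def approx_log_partition(scores):
    # log sum_x exp(score[x]) over the given configurations, computed
    # with the max-shift trick for numerical stability.
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))
```

Adding further diverse high-scoring configurations can only increase the estimate, which is why diverse (rather than near-duplicate) MAP solutions tighten the bound fastest.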
Joint M-Best-Diverse Labelings as a Parametric Submodular Minimization
Abstract
We consider the problem of jointly inferring the M best diverse labelings for a binary (high-order) submodular energy of a graphical model. Recently, it was shown that this problem can be solved to a global optimum for many practically interesting diversity measures. It was noted that the labelings are, so-called, nested. This nestedness property also holds for labelings of a class of parametric submodular minimization problems, where different values of the global parameter γ give rise to different solutions. A popular example of parametric submodular minimization is the monotonic parametric max-flow problem, which is also widely used for computing multiple labelings. As the main contribution of this work, we establish a close relationship between diversity with submodular energies and parametric submodular minimization. In particular, the joint M best diverse labelings can be obtained by running a non-parametric submodular minimization (in the special case, max-flow) solver for M different values of γ in parallel, for certain diversity measures. Importantly, the values for γ can be computed in closed form in advance, prior to any optimization. These theoretical results suggest two simple yet efficient algorithms for the joint M best diverse problem, which outperform competitors in terms of runtime and quality of results. In particular, as we show in the paper, the new methods compute the exact M best diverse labelings faster than the popular method of Batra et al., which in some sense obtains only approximate solutions.
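The parametric picture and the nestedness property can be demonstrated on a toy submodular function, with brute force over subsets standing in for the max-flow solver (f(S) = sqrt of a modular sum is submodular, being a concave transform of a modular function; all names here are illustrative):

```python
import itertools, math

def param_submod_min(f, n, gammas):
    # For each gamma, minimize f(S) - gamma * |S| over all subsets S of
    # {0, ..., n-1}; for submodular f the minimizers are nested in gamma.
    solutions = []
    for g in gammas:
        best_S, best_v = frozenset(), f(frozenset())
        for r in range(1, n + 1):
            for S in itertools.combinations(range(n), r):
                S = frozenset(S)
                v = f(S) - g * len(S)
                if v < best_v:
                    best_S, best_v = S, v
        solutions.append(best_S)
    return solutions
```

Running it at increasing γ yields a chain of nested minimizers, mirroring the nested diverse labelings described above: the M parallel solves are independent once the γ values are fixed.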
TABLE OF CONTENTS
Abstract
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.