Results 1–10 of 16
Image Segmentation by Data Driven Markov Chain Monte Carlo
2001
Abstract

Cited by 269 (32 self)
This paper presents a computational paradigm called Data Driven Markov Chain Monte Carlo (DDMCMC) for image segmentation in the Bayesian statistical framework. The paper contributes to image segmentation in three aspects. Firstly, it designs effective and well-balanced Markov chain dynamics to explore the solution space and makes the split-and-merge process reversible at a middle-level vision formulation, thus achieving a globally optimal solution independent of the initial segmentation. Secondly, instead of computing a single maximum a posteriori solution, it proposes a mathematical principle for computing multiple distinct solutions to incorporate the intrinsic ambiguities in image segmentation; a k-adventurers algorithm is proposed for extracting multiple distinct solutions from the Markov chain sequence. Thirdly, it utilizes data-driven (bottom-up) techniques, such as clustering and edge detection, to compute importance proposal probabilities, which effectively drive the Markov chain dynamics and achieve a tremendous speedup in comparison to the traditional jump-diffusion method [4]. The DDMCMC paradigm thus provides a unifying framework in which the role of existing segmentation algorithms, such as edge detection, clustering, region growing, split-merge, SNAKEs, and region competition, is revealed as either realizing Markov chain dynamics or computing importance proposal probabilities. We report some results on color and grey-level image segmentation in this paper and refer to a detailed report and a web site for extensive discussion.
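DDMCMC rests on Metropolis-Hastings moves whose proposal probabilities are computed bottom-up from the data. As a minimal sketch of that underlying mechanism (not the paper's segmentation dynamics), here is an independence-proposal Metropolis-Hastings sampler on a toy 1-D target; the Gaussian target/proposal and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_informed(log_target, propose, log_proposal, x0, n_steps=10_000):
    """Independence-proposal Metropolis-Hastings: accept a draw y from q
    with probability min(1, p(y) q(x) / (p(x) q(y)))."""
    x = x0
    lp_x = log_target(x)
    samples = []
    for _ in range(n_steps):
        y = propose()                       # data-driven / bottom-up proposal q
        lp_y = log_target(y)
        log_alpha = lp_y - lp_x + log_proposal(x) - log_proposal(y)
        if np.log(rng.random()) < log_alpha:
            x, lp_x = y, lp_y
        samples.append(x)
    return np.array(samples)
```

The better the bottom-up proposal approximates the target, the higher the acceptance rate, which is the source of the speedup the paper reports over blind dynamics.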
A comment on contrastive divergence
Proc. of NIPS, 2004
Abstract

Cited by 37 (0 self)
This paper analyses the Contrastive Divergence algorithm for learning statistical parameters. We relate the algorithm to the stochastic approximation literature. This enables us to specify conditions under which the algorithm is guaranteed to converge to the optimal solution (with probability 1). This includes necessary and sufficient conditions for the solution to be unbiased.
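For readers unfamiliar with the algorithm being analysed, a minimal sketch of one CD-1 update for a restricted Boltzmann machine follows (a common setting for contrastive divergence); the RBM choice, the omission of bias terms, and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v_data, lr=0.05):
    """One CD-1 parameter update for an RBM weight matrix W (biases omitted)."""
    # positive phase: hidden units driven by the data
    h_pos = sigmoid(v_data @ W)
    h_samp = (rng.random(h_pos.shape) < h_pos).astype(float)
    # negative phase: a single step of blocked Gibbs sampling
    v_neg = (rng.random(v_data.shape) < sigmoid(h_samp @ W.T)).astype(float)
    h_neg = sigmoid(v_neg @ W)
    # CD gradient: data statistics minus one-step reconstruction statistics
    grad = (v_data.T @ h_pos - v_neg.T @ h_neg) / len(v_data)
    return W + lr * grad
```

The bias the paper studies comes from truncating the Gibbs chain after one step instead of running it to equilibrium.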
Learning in Markov random fields using tempered transitions
 In Advances in Neural Information Processing Systems
Abstract

Cited by 34 (2 self)
Markov random fields (MRF’s), or undirected graphical models, provide a powerful framework for modeling complex dependencies among random variables. Maximum likelihood learning in MRF’s is hard due to the presence of the global normalizing constant. In this paper we consider a class of stochastic approximation algorithms of the Robbins-Monro type that use Markov chain Monte Carlo to do approximate maximum likelihood learning. We show that using MCMC operators based on tempered transitions enables the stochastic approximation algorithm to better explore highly multimodal distributions, which considerably improves parameter estimates in large, densely connected MRF’s. Our results on the MNIST and NORB datasets demonstrate that we can successfully learn good generative models of high-dimensional, richly structured data that perform well on digit and object recognition tasks.
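The outer loop the paper builds on can be sketched as a Robbins-Monro update driven by a persistent MCMC chain. The sketch below uses a plain Gibbs sweep where the paper would use tempered transitions, on a tiny fully visible ±1 Boltzmann machine; the model choice and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def gibbs_sweep(x, W):
    """One Gibbs sweep over a +/-1 Boltzmann machine with weights W (zero diagonal)."""
    for i in range(len(x)):
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * (W[i] @ x - W[i, i] * x[i])))
        x[i] = 1.0 if rng.random() < p_plus else -1.0
    return x

def sap_learn(data, n_iters=500):
    """Robbins-Monro stochastic approximation with a persistent Gibbs chain."""
    n = data.shape[1]
    W = np.zeros((n, n))
    pos = data.T @ data / len(data)        # data sufficient statistics
    x = data[0].astype(float).copy()       # persistent chain state
    for t in range(1, n_iters + 1):
        x = gibbs_sweep(x, W)              # (tempered transitions would go here)
        W += (pos - np.outer(x, x)) / t    # Robbins-Monro step size 1/t
        np.fill_diagonal(W, 0.0)
    return W
```

The paper's contribution is to replace the single Gibbs sweep with tempered transitions so the persistent chain can cross between modes.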
Herding Dynamical Weights to Learn
Abstract

Cited by 26 (7 self)
A new “herding” algorithm is proposed which directly converts observed moments into a sequence of pseudo-samples. The pseudo-samples respect the moment constraints and may be used to estimate (unobserved) quantities of interest. The procedure allows us to sidestep the usual approach of first learning a joint model (which is intractable) and then sampling from that model (which can easily get stuck in a local mode). Moreover, the algorithm is fully deterministic (avoiding random number generation) and does not need expensive operations such as exponentiation.
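The herding update itself is simple: repeatedly pick the state maximising the inner product of the current weights with its features, then move the weights toward the target moments. A minimal sketch over a tiny enumerable state space (all names are illustrative assumptions):

```python
import numpy as np

def herd(moments, feats, n_steps=1000):
    """Deterministic herding over a small enumerable state space.
    feats[k] holds the feature vector of state k; no randomness anywhere."""
    w = moments.copy()                      # weights start at the target moments
    chosen = []
    for _ in range(n_steps):
        k = int(np.argmax(feats @ w))       # state maximising w . phi(s)
        chosen.append(k)
        w += moments - feats[k]             # add targets, subtract chosen features
    return chosen
```

Because the weights stay bounded when the moments are attainable, the feature averages of the pseudo-samples approach the target moments at rate O(1/T).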
Herding dynamic weights for partially observed random field models
In Proc. of the Conf. on Uncertainty in Artificial Intelligence, 2009
Abstract

Cited by 19 (5 self)
Learning the parameters of a (potentially partially observable) random field model is intractable in general. Instead of focusing on a single optimal parameter value we propose to treat parameters as dynamical quantities. We introduce an algorithm to generate complex dynamics for parameters and (both visible and hidden) state vectors. We show that under certain conditions averages computed over trajectories of the proposed dynamical system converge to averages computed over the data. Our “herding dynamics” does not require expensive operations such as exponentiation and is fully deterministic.
Learning Graphical Model Parameters with Approximate Marginal Inference
Abstract

Cited by 19 (2 self)
Likelihood-based learning of graphical models faces challenges of computational complexity and robustness to model misspecification. This paper studies methods that fit parameters directly to maximize a measure of the accuracy of predicted marginals, taking into account both model and inference approximations at training time. Experiments on imaging problems suggest that marginalization-based learning performs better than likelihood-based approximations on difficult problems where the model being fit is approximate in nature.
Statistical and computational tradeoffs in stochastic composite likelihood
In 12th International Conference on AI and Statistics, 2009
Abstract

Cited by 13 (2 self)
Maximum likelihood estimators are often of limited practical use due to the intensive computation they require. We propose a family of alternative estimators that maximize a stochastic variation of the composite likelihood function. Each of the estimators resolves the computation-accuracy tradeoff differently, and taken together they span a continuous spectrum of computation-accuracy tradeoff resolutions. We prove the consistency of the estimators and provide formulas for their asymptotic variance, statistical robustness, and computational complexity. We discuss experimental results in the context of Boltzmann machines and conditional random fields. The theoretical and experimental studies demonstrate the effectiveness of the estimators when the computational resources are insufficient. They also demonstrate that in some cases reduced computational complexity is associated with robustness, thereby increasing statistical accuracy.
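The best-known member of the composite-likelihood family is the pseudo-likelihood, which replaces the joint likelihood by a product of tractable conditionals. A minimal sketch for a fully visible ±1 pairwise model with symmetric weights (an illustrative choice, not the paper's stochastic variant; all names are assumptions):

```python
import numpy as np

def pseudo_loglik(W, data):
    """Sum over i of log p(x_i | x_rest): each conditional of a +/-1 pairwise
    model is a tractable logistic term, so no partition function is needed."""
    total = 0.0
    for x in data:
        for i in range(len(x)):
            field = W[i] @ x - W[i, i] * x[i]   # influence of the other sites
            # log p(x_i | rest) = -log(1 + exp(-2 x_i field))
            total -= np.log1p(np.exp(-2.0 * x[i] * field))
    return total
```

The paper's estimators interpolate along this tradeoff: larger conditional blocks cost more per evaluation but recover more of the full likelihood's statistical efficiency.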
A learning-based method for image super-resolution from zoomed observations
IEEE Transactions on Systems, Man, and Cybernetics, Part B, 2005
Abstract

Cited by 6 (0 self)
We propose a technique for super-resolution imaging of a scene from observations at different camera zooms. Given a sequence of images of a static scene with different zoom factors, we obtain a picture of the entire scene at a resolution corresponding to the most zoomed observation. The high-resolution image is modeled through appropriate parameterization and the parameters are learnt from the most zoomed observation. Assuming homogeneity of the high-resolution field, the learnt model is used as a prior while super-resolving the scene. We suggest the use of either an MRF or a simultaneous autoregressive (SAR) model to parameterize the field, based on the computation one can afford. We substantiate the suitability of the proposed method through a large number of experiments on both simulated and real data.
Bayesian Estimation of Latently-grouped Parameters in Undirected Graphical Models
Abstract

Cited by 1 (0 self)
In large-scale applications of undirected graphical models, such as social networks and biological networks, similar patterns occur frequently and give rise to similar parameters. In this situation, it is beneficial to group the parameters for more efficient learning. We show that even when the grouping is unknown, we can infer these parameter groups during learning via a Bayesian approach. We impose a Dirichlet process prior on the parameters. Posterior inference usually involves calculating intractable terms, and we propose two approximation algorithms, namely a Metropolis-Hastings algorithm with auxiliary variables and a Gibbs sampling algorithm with “stripped” Beta approximation (Gibbs SBA). Simulations show that both algorithms outperform conventional maximum likelihood estimation (MLE). Gibbs SBA’s performance is close to Gibbs sampling with exact likelihood calculation. Models learned with Gibbs SBA also generalize better than the models learned by MLE on real-world Senate voting data.
An efficient approach to learning inhomogeneous Gibbs model
 CVPR
Abstract

Cited by 1 (0 self)
The inhomogeneous Gibbs model (IGM) [4] is an effective maximum entropy model for characterizing complex high-dimensional distributions. However, its training process is so slow that the applicability of IGM has been greatly restricted. In this paper, we propose an approach for fast parameter learning of IGM. In IGM learning, features are incrementally constructed to constrain the learnt distribution. When a new feature is added, Markov chain Monte Carlo (MCMC) sampling is repeated to draw samples for parameter learning. In contrast, our new approach constructs a closed-form reference distribution using approximate information-gain criteria. Because our reference distribution is very close to the optimal one, importance sampling can be used to accelerate the parameter optimization process. For problems with high-dimensional distributions, our approach typically achieves a speedup of two orders of magnitude compared to the original IGM. We further demonstrate the efficiency of our approach by learning a high-dimensional joint distribution of face images and their corresponding caricatures.
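The acceleration step rests on standard self-normalised importance sampling: expectations under the target are re-weighted averages of samples drawn from the reference distribution. A minimal sketch with toy Gaussian densities (the 1-D setting and all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def importance_estimate(f, log_p, log_q, q_sampler, n=50_000):
    """Self-normalised importance sampling: E_p[f] estimated from samples of a
    close-by reference q, so neither density needs to be normalised."""
    x = q_sampler(n)
    log_w = log_p(x) - log_q(x)
    w = np.exp(log_w - log_w.max())         # stabilise before normalising
    w /= w.sum()                            # self-normalised weights
    return float(np.sum(w * f(x)))
```

The estimator's variance grows with the mismatch between p and q, which is why the closed-form reference distribution being close to the optimal one is what makes the speedup possible.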