## Continuous contour Monte Carlo for marginal density estimation with an application to a spatial statistical model (2007)

Venue: Journal of Computational and Graphical Statistics

Citations: 7 (3 self)

### BibTeX

@ARTICLE{Liang07continuouscontour,
  author = {Faming Liang},
  title = {Continuous contour Monte Carlo for marginal density estimation with an application to a spatial statistical model},
  journal = {Journal of Computational and Graphical Statistics},
  year = {2007},
  pages = {608--632}
}

### Abstract

The problem of marginal density estimation for a multivariate density function f(x) can generally be stated as a problem of density function estimation for a random vector λ(x) of dimension lower than that of x. In this article, we propose a technique, the continuous contour Monte Carlo (CCMC) algorithm, for solving this problem. CCMC can be viewed as a continuous version of the contour Monte Carlo (CMC) algorithm recently proposed in the literature. CCMC abandons the use of sample space partitioning and incorporates the techniques of kernel density estimation into its simulations. CCMC is more general than other marginal density estimation algorithms. First, it works for any density function, even one with a rugged or unbalanced energy landscape. Second, it works for any transformation λ(x), regardless of whether the inverse transformation is available in analytical form. In this article, CCMC is applied to estimate the unknown normalizing constant function for a spatial autologistic model, and the estimate is then used in a Bayesian analysis for the spatial autologistic model in place of the true normalizing constant function. Numerical results on the U.S. cancer mortality data indicate that the Bayesian method can produce much more accurate estimates than the MPLE and MCMLE methods for the parameters of the spatial autologistic model.
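To make the problem statement concrete, the sketch below shows the conventional sample-based baseline that the abstract contrasts CCMC with: draw samples of x, map them through λ, and smooth the values with a Gaussian kernel. It is not the CCMC algorithm. The function name `marginal_kde`, the choice λ(x) = x₁ + x₂, the independent normal draws standing in for MCMC output, and the rule-of-thumb bandwidth are all assumptions made for illustration.

```python
import numpy as np

def marginal_kde(samples, transform, grid, bandwidth=None):
    """Gaussian-kernel estimate of the density of y = transform(x) on a grid.

    Hypothetical helper for illustration; it is a plain kernel density
    estimate, not the CCMC procedure described in the abstract.
    """
    y = np.array([transform(x) for x in samples], dtype=float)
    n = y.size
    if bandwidth is None:
        # Silverman's rule of thumb (an assumption, not the paper's bandwidth).
        iqr = np.subtract(*np.percentile(y, [75, 25]))
        sigma = min(y.std(ddof=1), iqr / 1.34)
        bandwidth = 0.9 * sigma * n ** (-0.2)
    # Average a Gaussian kernel centred at every transformed sample.
    z = (grid[:, None] - y[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))      # stand-in for draws from f(x)
grid = np.linspace(-6.0, 6.0, 241)      # evaluation points for lambda(x)
density = marginal_kde(x, lambda v: v[0] + v[1], grid)
```

With this toy choice, λ(x) follows a normal distribution with variance 2, so the estimate can be checked against a known density. A baseline of this kind depends on the sampler visiting the whole range of λ(x), which is the limitation the abstract says CCMC is designed to address.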

### Citations

8904 | Maximum-likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...e model. Monte Carlo EM (Wei and Tanner, 1990) provides a method for finding the MLE of θ when the conditional expectations in the “E-step” are not available analytically. Otherwise, the EM algorithm (Dempster et al., 1977) can be used. CCMC can work as an alternative to Monte Carlo EM. Once the marginal f(θ|z_obs) is available numerically, finding the MLE is trivial. One remarkable advantage of this method is that it a...

4016 | Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
- Geman, Geman
- 1984
Citation Context ...igh energy barriers. In simulation from such a distribution, conventional MCMC samplers, such as the Metropolis-Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970) and the Gibbs sampler (Geman and Geman, 1984), tend to get trapped in a local energy minimum indefinitely, rendering the simulation ineffective. This difficulty can be alleviated to some extent by employing an advanced MCMC sampler, such as par...

3214 | An introduction to the bootstrap
- Efron, Tibshirani
- 1993
Citation Context ...ve correlation between neighboring counties ($\hat{\beta} \approx 0.1$). To compare the quality of the three estimates, we conduct the following experiment based on the principle of the parametric bootstrap method (Efron and Tibshirani, 1993). Let $T_1 = \sum_{i \in D} s_i / |D|$ and $T_2 = \frac{1}{2} \sum_{i \in D} s_i \bigl( \sum_{j \in N(i)} s_j \bigr) / |D|$. Thus, $T = (T_1, T_2)$ forms a sufficient statistic for the parameters $(\alpha, \beta)$. Given an estimate $(\hat{\alpha}, \hat{\beta})$, we can reversely estima...

2489 | Equation of state calculations by fast computing machines
- Metropolis, Rosenbluth, et al.
- 1953
Citation Context ...tion contains a multitude of local energy minima separated by high energy barriers. In simulation from such a distribution, conventional MCMC samplers, such as the Metropolis-Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970) and the Gibbs sampler (Geman and Geman, 1984), tend to get trapped in a local energy minimum indefinitely, rendering the simulation ineffective. This difficulty can be alleviated to ...

1344 | Monte Carlo sampling methods using Markov chains and their applications
- Hastings
- 1970
Citation Context ...e of local energy minima separated by high energy barriers. In simulation from such a distribution, conventional MCMC samplers, such as the Metropolis-Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970) and the Gibbs sampler (Geman and Geman, 1984), tend to get trapped in a local energy minimum indefinitely, rendering the simulation ineffective. This difficulty can be alleviated to some extent by e...

1238 | Spatial Interaction and the statistical analysis of lattice systems
- Besag
- 1974
Citation Context ...s between different modes of the distribution, and a rougher histogram estimate for the marginal density. 5 Bayesian Inference for Spatial Autologistic Models 5.1 Introduction The autologistic model (Besag, 1974) has been widely used for spatial data analysis, e.g., Preisler (1993), Augustin et al. (1996), Wu and Huffer (1997), and Sherman, Apanasovich and Carroll (2006). Let s = {s_i : i ∈ D} denote the obse...

919 | Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Green
- 1995
Citation Context ...specified models; that is, to estimate a marginal mass function defined on a set of points. This problem has been tackled by many authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likel...

641 | Multivariate Density Estimation: Theory, Practice, and Visualization
- Scott
- 1992
Citation Context ... be allowed to vary with the position of the grid point. However, the data-driven approaches, for example, the nearest neighbor and balloon approaches (Loftsgaarden and Quesenberry, 1965; Terrell and Scott, 1992), may not be appropriate to CCMC, because the convergence of bandwidth is out of our control in these approaches. As in conventional kernel density estimation problems, the shape of the kernel has li...

601 | A stochastic approximation method
- Robbins, Monro
- 1951
Citation Context ...he stability of the algorithm. Also, a large value of κ will enable the sampler to reach all subregions very quickly even for a large system. Based on the standard theory of stochastic approximation (Robbins and Monro, 1951; Blum, 1954), Liang (2006) proved that $P\bigl\{ \lim_{t \to \infty} \log \hat{g}_i^{(t)} = c + \log \bigl( \int_{E_i} \psi(x)\,dx \bigr) - \log(\pi_i) \bigr\} = 1$, $i = 1, \ldots, m$, (6) where c is an arbitrary constant which can be determined by imposing an...

451 | Kernel Smoothing
- Wand, Jones
- 1995
Citation Context ...f(x)dx for y ∈ Y. To this end, approximate samples can often be generated from f(x) via MCMC samplers, and the marginal density can then be estimated using the kernel density estimation method (e.g., Wand and Jones, 1995). The kernel density estimation method allows for dependent samples (Hart and Vieu, 1990; Yu, 1993; Hall, Lahiri and Truong, 1996). When independently and identically distributed samples are availab...

376 | Marginal likelihood from the Gibbs output
- Chib
- 1995
Citation Context ... and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), among others. The latter problem can be described as follows. Let f(s|θ) = ψ(s, θ)/ϕ(θ) denote the probability mass function of a spatial st...

373 | Spatial Statistics
- Ripley
- 1981
Citation Context ...odel, where s denotes a configuration of the model, and θ the parameter vector of the model. For some spatial statistical models, e.g., Ising, autologistic, autonormal, and very-soft-core models (Ripley, 1981), the normalizing constant function ϕ(θ) is not available analytically. Evaluating ϕ(θ) can be treated as a problem of marginal density estimation by viewing ψ(s, θ) as an unnormalized joint distribu...

266 | Bayesian image restoration, with two applications in spatial statistics (with discussion). Ann Inst Stat Math - Besag, York, et al. - 1991 |

221 | Constrained Monte Carlo maximum likelihood for dependent data (with discussion) - Geyer, Thompson - 1992 |

211 | Markov chain Monte Carlo maximum likelihood. Computing science and statistics
- Geyer
- 1991
Citation Context ...apped in a local energy minimum indefinitely, rendering the simulation ineffective. This difficulty can be alleviated to some extent by employing an advanced MCMC sampler, such as parallel tempering (Geyer, 1991), evolutionary Monte Carlo (Liang and Wong, 2001) and the slice sampler (Neal, 2003). In the second case, MCMC samples can only be drawn from the low energy region of X. This difficulty cannot be a...

186 | A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms
- Wei, Tanner
- 1990
Citation Context ...isance parameters is high. Missing data problems. Let z_obs and z_mis denote the observed and missing parts of the data, respectively; and let θ denote the parameter vector of the model. Monte Carlo EM (Wei and Tanner, 1990) provides a method for finding the MLE of θ when the conditional expectations in the “E-step” are not available analytically. Otherwise, the EM algorithm (Dempster et al., 1977) can be used. CCMC can w...

182 | Local Regression and Likelihood
- Loader
- 1999
Citation Context ...d Vieu, 1990; Yu, 1993; Hall, Lahiri and Truong, 1996). When independently and identically distributed samples are available, other nonparametric density estimation methods, such as local likelihood (Loader, 1999), smoothing spline (Gu, 1993; Gu and Qiu, 1993), and logspline (Kooperberg and Stone, 1991; Kooperberg, 1998), are also applicable to estimate the marginal density. The problem has also been tackled ...

149 | Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling
- Gelman, Meng
- 1998
Citation Context ...authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), among others. The latter problem can be described as follows. L...

124 | Marginal Likelihood from the Metropolis-Hastings
- Chib, Jeliazkov
- 2001
Citation Context ...son, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), among others. The latter problem can be described as follows. Let f(s|θ) = ψ(s, θ)/ϕ(θ) denote the probability mass function of a spatial statistical model, where s d...

119 | Simulating Ratios of Normalizing Constants via a Simple Identity
- Meng, Wong
- 1996
Citation Context ...been tackled by many authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), among others. The latter problem can be ...

91 | Modern Applied Statistics with S-Plus. 3rd edn
- Venables, Ripley
- 1999
Citation Context ...og2(M)) i = 1, 2, (9) where $\gamma \in (\tfrac{1}{2}, 1]$, and the second term in min{·, ·} is the default bandwidth used in conventional density estimation procedures, e.g., the procedure density(·) in S-PLUS 5.0 (Venables and Ripley, 1999, p. 135). For convenience, we link the choices of {δt} and {ht·} together in this paper, although this is not necessary in theory. With the notations defined above, one iteration of CCMC can be descr...

75 | Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling - Gelfand, Smith, et al. - 1992 |

75 | Computing Bayes factors using a generalization of the Savage-Dickey density ratio
- Verdinelli, Wasserman
- 1995
Citation Context ...or shortcoming with Chen’s methods is its strong dependence on the knowledge of the analytical form of the inverse transformation $x = \lambda^{-1}(y)$. Other parametric methods (Gelfand, Smith and Lee, 1992; Verdinelli and Wasserman, 1995) suffer from similar handicaps. The aforementioned methods have a common feature: they are all sample-based. Hence, they will fail to produce an estimate when identically distributed samples (inclu...

71 | Efficient, multiple-range random walk algorithm to calculate the density of states. Phys. Rev. Lett. 86, p. 2050 (2001); ibid. (2001) Determining the density of states for classical statistical models: A random walk algorithm to produce a flat histogram
- Wang, Landau
Citation Context ...clude the paper with a brief discussion on the use of CCMC in some statistical problems other than spatial statistical models. 2 A brief review for contour Monte Carlo The Wang-Landau (WL) algorithm (Wang and Landau, 2001) is a dynamic Monte Carlo algorithm used to calculate the spectral density for a physical system. A remarkable feature of the WL algorithm is that it is not trapped by local energy minima. This featu...

68 | Fast implementation of nonparametric curve estimators - Fan, Marron - 1994 |

64 | Multidimensional stochastic approximation methods
- Blum
- 1954
Citation Context ...ithm. Also, a large value of κ will enable the sampler to reach all subregions very quickly even for a large system. Based on the standard theory of stochastic approximation (Robbins and Monro, 1951; Blum, 1954), Liang (2006) proved that $P\bigl\{ \lim_{t \to \infty} \log \hat{g}_i^{(t)} = c + \log \bigl( \int_{E_i} \psi(x)\,dx \bigr) - \log(\pi_i) \bigr\} = 1$, $i = 1, \ldots, m$, (6) where c is an arbitrary constant which can be determined by imposing an additional ...

59 | An Autologistic Model for the Spatial Distribution of Wildlife - Augustin, Mugglestone, et al. - 1996 |

58 | Hidden Markov models and disease mapping
- Green, Richardson
- 2002
Citation Context ...els; that is, to estimate a marginal mass function defined on a set of points. This problem has been tackled by many authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and ...

55 | Stochastic Approximation and recursive estimation - Nevel'son, Has'minskii - 1973 |

54 | An Efficient Markov Chain Monte Carlo Method for Distributions with Intractable Normalising Constants - Moller, Pettitt, et al. - 2006 |

54 | Variable Kernel Density Estimation
- Terrell, Scott
- 1992
Citation Context ...idth hti may be allowed to vary with the position of the grid point. However, the data-driven approaches, for example, the nearest neighbor and balloon approaches (Loftsgaarden and Quesenberry, 1965; Terrell and Scott, 1992), may not be appropriate to CCMC, because the convergence of bandwidth is out of our control in these approaches. As in conventional kernel density estimation problems, the shape of the kernel has li...

48 | Logspline density estimation
- Stone, Koo
- 1986
Citation Context ...d identically distributed samples are available, other nonparametric density estimation methods, such as local likelihood (Loader, 1999), smoothing spline (Gu, 1993; Gu and Qiu, 1993), and logspline (Kooperberg and Stone, 1991; Kooperberg, 1998), are also applicable to estimate the marginal density. The problem has also been tackled by some authors from a different angle. For example, Chen (1994) proposed an importance sam...

46 | Real-Parameter Evolutionary Monte Carlo With Applications to Bayesian Mixture Models
- Liang, Wong
Citation Context ...tely, rendering the simulation ineffective. This difficulty can be alleviated to some extent by employing an advanced MCMC sampler, such as parallel tempering (Geyer, 1991), evolutionary Monte Carlo (Liang and Wong, 2001) and the slice sampler (Neal, 2003). In the second case, MCMC samples can only be drawn from the low energy region of X. This difficulty cannot be alleviated by employing advanced MCMC samplers as ...

45 | Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo
- Geyer
- 1993
Citation Context ...sible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), among others. The latter problem can be described as follows. Let f(s|θ) = ψ(s, θ)/ϕ(θ) denote the probabi...

43 | Simple boundary correction for kernel density estimation - Jones - 1993 |

37 | Monte Carlo Statistical Methods (2nd Edition)
- Robert, Casella
- 2004
Citation Context ... example, CCMC is used to estimate the normalizing constant function ϕ(α, β) at a set of pre-specified parameter points. It seems that the job can also be done by a hybrid MCMC sampler (Müller, 1991; Robert and Casella, 2004, p. 393) through simulation of the joint distribution $g(\alpha', \beta', s) = \frac{1}{\sum_{(\alpha', \beta') \in \Omega'} \varphi(\alpha', \beta')} \exp\Bigl\{ \alpha' \sum_{i \in D} s_i + \frac{\beta'}{2} \sum_{i \in D} s_i \Bigl( \sum_{j \in N(i)} s_j \Bigr) \Bigr\}$, $(\alpha', \beta') \in \Omega'$, (28) wher...

36 | Fast computation of multivariate kernel estimators. Journal of Computational and Graphical Statistics - Wand - 1994 |

35 | On Monte Carlo methods for estimating ratios of normalizing constants
- Chen, Shao
- 1997
Citation Context ...ed on a set of points. This problem has been tackled by many authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), ...

33 | Slice sampling (with Discussion)
- Neal
- 2003
Citation Context ...his difficulty can be alleviated to some extent by employing an advanced MCMC sampler, such as parallel tempering (Geyer, 1991), evolutionary Monte Carlo (Liang and Wong, 2001) and the slice sampler (Neal, 2003). In the second case, MCMC samples can only be drawn from the low energy region of X. This difficulty cannot be alleviated by employing advanced MCMC samplers as explained at the end of Section 5. ...

30 | A generic approach to posterior integration and Gibbs sampling
- Müller
- 1991
Citation Context ...to us. In this example, CCMC is used to estimate the normalizing constant function ϕ(α, β) at a set of pre-specified parameter points. It seems that the job can also be done by a hybrid MCMC sampler (Müller, 1991; Robert and Casella, 2004, p. 393) through simulation of the joint distribution $g(\alpha', \beta', s) = \frac{1}{\sum_{(\alpha', \beta') \in \Omega'} \varphi(\alpha', \beta')} \exp\Bigl\{ \alpha' \sum_{i \in D} s_i + \frac{\beta'}{2} \sum_{i \in D} s_i \Bigl( \sum_{j \in N(i)} s_j \Bigr) \Bigr\}$, $(\alpha', \beta') \in \Omega'$, (28) ...

27 | Smoothing spline density estimation: Theory
- Gu, Qiu
- 1993
Citation Context ...uong, 1996). When independently and identically distributed samples are available, other nonparametric density estimation methods, such as local likelihood (Loader, 1999), smoothing spline (Gu, 1993; Gu and Qiu, 1993), and logspline (Kooperberg and Stone, 1991; Kooperberg, 1998), are also applicable to estimate the marginal density. The problem has also been tackled by some authors from a different angle. For exa...

27 | Monte Carlo Statistical Methods. 2nd Ed
- Robert, Casella
- 2004
Citation Context ...is example, CCMC is used to estimate the normalizing constant function ϕ(α, β) at a set of prespecified parameter points. It seems that the job can also be done by a hybrid MCMC sampler (Müller 1991; Robert and Casella 2004, p. 393) through simulation of the joint distribution $g(\alpha', \beta', s) = \frac{1}{\sum_{(\alpha', \beta') \in \Omega'} \varphi(\alpha', \beta')} \exp\Bigl\{ \alpha' \sum_{i \in D} s_i + \frac{\beta'}{2} \sum_{i \in D} s_i \Bigl( \sum_{j \in N(i)} s_j \Bigr) \Bigr\}$, $(\alpha', \beta') \in \Omega'$, (5.10) where $\Omega'$ ...

26 | Maximum Likelihood Estimation for Spatial Models by Markov Chain Monte Carlo Stochastic Approximation - Gu, Zhu - 2001 |

22 | Smoothing spline density estimation: A dimensionless automatic algorithm
- Gu
- 1993
Citation Context ...iri and Truong, 1996). When independently and identically distributed samples are available, other nonparametric density estimation methods, such as local likelihood (Loader, 1999), smoothing spline (Gu, 1993; Gu and Qiu, 1993), and logspline (Kooperberg and Stone, 1991; Kooperberg, 1998), are also applicable to estimate the marginal density. The problem has also been tackled by some authors from a differ...

22 | Bayesian Approach to Image Restoration With an Application in Biogeography - Heikkinen, Högmander - 1994 |

21 | Importance-Weighted Marginal Bayesian Posterior Density Estimation - Chen - 1994 |

21 | Bayesian model selection in finite mixtures by marginal density decompositions - Ishwaran, James, et al. |

21 | Efficient recursions for general factorisable models - Reeves, Pettitt - 2004 |

16 | Estimating Ratios of Normalizing Constants for Densities With Different Dimensions
- Chen, Shao
- 1997
Citation Context ...ed on a set of points. This problem has been tackled by many authors using a variety of approaches including reversible jump MCMC (Green, 1995; Green and Richardson, 2002), ratio importance sampling (Chen and Shao, 1997a,b), path sampling (Meng and Wong, 1996; Gelman and Meng, 1998), reverse logistic regression (Geyer, 1994), marginal likelihood (Chib, 1995; Chib and Jeliazkov, 2001; Ishwaran, James and Sun, 2001), ...

15 | Mollié A - Besag, York - 1991 |