Results 1-10 of 158
Analysis of multivariate probit models
Biometrika, 1998
Cited by 101 (6 self)
This paper provides a practical simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods, and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm. A practical approach for the computation of Bayes factors from the simulation output is also developed. The methods are applied to a dataset with a bivariate binary response, to a four-year longitudinal dataset from the Six Cities study of the health effects of air pollution, and to a seven-variate binary response dataset on the labour supply of married women from the Panel Survey of Income Dynamics.
Penalized Maximum-Likelihood Image Reconstruction using Space-Alternating Generalized EM Algorithms
IEEE Tr. Im. Proc., 1995
Cited by 82 (30 self)
Most expectation-maximization (EM) type algorithms for penalized maximum-likelihood image reconstruction converge slowly, particularly when one incorporates additive background effects such as scatter, random coincidences, dark current, or cosmic radiation. In addition, regularizing smoothness penalties (or priors) introduce parameter coupling, rendering intractable the M-steps of most EM-type algorithms. This paper presents space-alternating generalized EM (SAGE) algorithms for image reconstruction, which update the parameters sequentially using a sequence of small "hidden" data spaces, rather than simultaneously using one large complete-data space. The sequential update decouples the M-step, so the maximization can typically be performed analytically. We introduce new hidden-data spaces that are less informative than the conventional complete-data space for Poisson data and that yield significant improvements in convergence rate. This acceleration is due to statistical considerations, not numerical over-relaxation methods, so monotonic increases in the objective function are guaranteed. We provide a general global convergence proof for SAGE methods with nonnegativity constraints.
A tutorial on MM algorithms
Amer. Statist., 2004
Cited by 66 (3 self)
Most problems in frequentist statistics involve optimization of a function such as a likelihood or a sum of squares. EM algorithms are among the most effective algorithms for maximum likelihood estimation because they consistently drive the likelihood uphill by maximizing a simple surrogate function for the log-likelihood. Iterative optimization of a surrogate function as exemplified by an EM algorithm does not necessarily require missing data. Indeed, every EM algorithm is a special case of the more general class of MM optimization algorithms, which typically exploit convexity rather than missing data in majorizing or minorizing an objective function. In our opinion, MM algorithms deserve to be part of the standard toolkit of professional statisticians. The current article explains the principle behind MM algorithms, suggests some methods for constructing them, and discusses some of their attractive features. We include numerous examples throughout the article to illustrate the concepts described. In addition to surveying previous work on MM algorithms, this article introduces some new material on constrained optimization and standard error estimation. Key words and phrases: constrained optimization, EM algorithm, majorization, minorization, Newton-Raphson.
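The majorization idea in this abstract can be illustrated with a small sketch. The example below is not taken from the abstract: it assumes the standard quadratic majorizer |r| <= r^2 / (2|r_k|) + |r_k| / 2 and uses it to minimize a sum of absolute deviations, whose minimizer is the sample median. The function name `mm_median` and the guard `eps` are hypothetical choices made for illustration.

```python
def mm_median(xs, iters=100, eps=1e-9):
    """Minimize f(theta) = sum_i |x_i - theta| with an MM iteration.

    Each |x_i - theta| is majorized at the current iterate theta_k by
    (x_i - theta)^2 / (2 c_i) + c_i / 2, with c_i = |x_i - theta_k|.
    Minimizing the resulting quadratic surrogate gives a weighted mean.
    """
    theta = sum(xs) / len(xs)  # start from the sample mean
    history = [sum(abs(x - theta) for x in xs)]
    for _ in range(iters):
        # eps keeps the weights finite when theta lands on a data point
        # (an illustrative safeguard, not part of the basic MM argument)
        weights = [1.0 / max(abs(x - theta), eps) for x in xs]
        theta = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        history.append(sum(abs(x - theta) for x in xs))
    return theta, history


if __name__ == "__main__":
    theta, history = mm_median([1.0, 2.0, 4.0, 7.0, 100.0])
    print(theta, history[0], history[-1])
```

Each update minimizes a surrogate that lies above the objective and touches it at the current iterate, so the objective cannot increase from one iteration to the next. No missing data is involved, which is the point the abstract makes about MM versus EM.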
Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm
In Advances in Neural Information Processing Systems 11, 1998
Cited by 54 (8 self)
We present the CEM (Conditional Expectation Maximization) algorithm as an extension of the EM (Expectation Maximization) algorithm to conditional density estimation under missing data. A bounding and maximization process is given to specifically optimize conditional likelihood instead of the usual joint likelihood. We apply the method to conditioned mixture models and use bounding techniques to derive the model's update rules. Monotonic convergence, computational efficiency and regression results superior to EM are demonstrated.
Multiple imputation for multivariate missing-data problems: a data analyst's perspective
Multivariate Behavioral Research, 1998
Cited by 47 (1 self)
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m > 1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from
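The "simple rules" for combining the m complete-data analyses are usually taken to be Rubin's (1987) combining rules. The sketch below assumes that reading and a scalar estimand. The function name `pool_rubin` and its inputs are hypothetical, and creating the imputations themselves is not shown.

```python
import math


def pool_rubin(estimates, variances):
    """Combine m complete-data results for a scalar quantity.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, pooled standard error, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m > 1 estimates with matching variances")
    q_bar = sum(estimates) / m                                # pooled estimate
    within = sum(variances) / m                               # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between                # total variance
    if between == 0.0:
        df = math.inf  # identical estimates: no between-imputation variability
    else:
        df = (m - 1) * (1.0 + within / ((1.0 + 1.0 / m) * between)) ** 2
    return q_bar, math.sqrt(total), df


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The between-imputation term is what carries the missing-data uncertainty into the pooled standard error. With a single imputation it cannot be estimated, which is why m > 1 is required.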
Parameter expansion to accelerate EM: The PX-EM algorithm
1998
Cited by 35 (7 self)
The EM algorithm and its extensions are popular tools for modal estimation but are often criticised for their slow convergence. We propose a new method that can often make EM much faster. The intuitive idea is to use a 'covariance adjustment' to correct the analysis of the M step, capitalising on extra information captured in the imputed complete data. The way we accomplish this is by parameter expansion; we expand the complete-data model while preserving the observed-data model and use the expanded complete-data model to generate EM. This parameter-expanded EM, PX-EM, algorithm shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis. The PX-EM algorithm is illustrated for the multivariate t distribution, a random effects model, factor analysis, probit regression and a Poisson imaging model.
Accelerating EM for large databases
Machine Learning, 2001
Cited by 35 (1 self)
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires significant computational resources and has been dismissed as impractical for large databases. We present two approaches that significantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial E-steps for which we can use the results of Neal and Hinton (1998) to obtain the standard convergence guarantees of EM. The first approach is a version of the incremental EM, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically affects the efficiency of the algorithm. We provide a method for selecting a near-optimal block size. The second approach, which we call lazy EM, will, at scheduled iterations, evaluate the significance of each data case and then proceed for several iterations actively using only the significant cases. We demonstrate that both methods can significantly reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases. Keywords: Expectation Maximization Algorithm, incremental EM, lazy EM, online EM, data blocking, mixture models, clustering.
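The block-wise partial E-step described above can be sketched for a one-dimensional Gaussian mixture. This is a minimal illustration under stated assumptions, not the authors' implementation: the block size is fixed by hand (the paper's block-size selection method is not reproduced), the initialization from data quantiles and the variance floor are arbitrary choices, and the name `incremental_em` is hypothetical.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def incremental_em(data, k=2, block_size=50, passes=10, var_floor=1e-6):
    """Incremental EM for a 1-D Gaussian mixture, cycling through blocks."""
    n = len(data)
    srt = sorted(data)
    mu = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]  # spread over quantiles
    mean = sum(data) / n
    var = [max(sum((x - mean) ** 2 for x in data) / n, var_floor)] * k
    pi = [1.0 / k] * k

    starts = list(range(0, n, block_size))
    # Per-block and running sufficient statistics per component:
    # [sum of r, sum of r * x, sum of r * x^2], r being the responsibility.
    block_stats = [[[0.0, 0.0, 0.0] for _ in range(k)] for _ in starts]
    total = [[0.0, 0.0, 0.0] for _ in range(k)]

    for _ in range(passes):
        for b, s in enumerate(starts):
            # Partial E-step: recompute responsibilities for this block only.
            new = [[0.0, 0.0, 0.0] for _ in range(k)]
            for x in data[s:s + block_size]:
                dens = [pi[j] * normal_pdf(x, mu[j], var[j]) for j in range(k)]
                z = sum(dens)
                if z == 0.0:
                    continue  # every density underflowed; skip this case
                for j in range(k):
                    r = dens[j] / z
                    new[j][0] += r
                    new[j][1] += r * x
                    new[j][2] += r * x * x
            # Swap the block's old contribution for the new one.
            for j in range(k):
                for t in range(3):
                    total[j][t] += new[j][t] - block_stats[b][j][t]
            block_stats[b] = new
            # M-step from the running totals (blocks visited so far).
            seen = sum(total[j][0] for j in range(k))
            for j in range(k):
                w = total[j][0]
                if w > 1e-12:
                    pi[j] = w / seen
                    mu[j] = total[j][1] / w
                    var[j] = max(total[j][2] / w - mu[j] ** 2, var_floor)
    return pi, mu, var


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(10, 1) for _ in range(200)]
    rng.shuffle(data)
    print(incremental_em(data))
```

Because the parameters are refreshed after every block rather than after a full pass over the data, each M-step costs only one block's worth of E-step work. The block size then trades update frequency against per-update overhead, which is the trade-off the abstract says must be tuned.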
Inpainting and zooming using sparse representations
The Computer Journal
Cited by 34 (8 self)
Representing the image to be inpainted in an appropriate sparse representation dictionary, and combining elements from Bayesian statistics and modern harmonic analysis, we introduce an expectation maximization (EM) algorithm for image inpainting and interpolation. From a statistical point of view, the inpainting/interpolation can be viewed as an estimation problem with missing data. Toward this goal, we propose the idea of using the EM mechanism in a Bayesian framework, where a sparsity-promoting prior penalty is imposed on the reconstructed coefficients. The EM framework gives a principled way to establish formally the idea that missing samples can be recovered/interpolated based on sparse representations. We first introduce an easy and efficient sparse-representation-based iterative algorithm for image inpainting. Additionally, we derive its theoretical convergence properties. Compared to its competitors, this algorithm allows a high degree of flexibility to recover different structural components in the image (piecewise smooth, curvilinear, texture, etc.). We also suggest some guidelines to automatically tune the regularization parameter.
Discriminative, Generative and Imitative Learning
2002
Cited by 34 (1 self)
I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain-specific knowledge in terms of structure and parameter priors over the joint space of variables. Bayesian networks and Bayesian statistics provide a rich and flexible language for specifying this knowledge and subsequently refining it with data and observations. The final result is a distribution that is a good generator of novel exemplars.