A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models
, 1997
Cited by 477 (4 self)
We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor. 1 Maximum-likelihood. Recall the definition of the maximum-likelihood estimation problem. We have a density function $p(\mathbf{x} \mid \Theta)$ that is governed by the set of parameters $\Theta$ (e.g., $p$ might be a set of Gaussians and $\Theta$ could be the means and covariances). We also have a data set of size $N$, supposedly drawn from this distribution, i.e., $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. That is, we assume that these data vectors are independent and ...
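As a rough illustration of the first application named in this abstract, the sketch below runs EM on a one-dimensional mixture of Gaussians. It is not taken from the tutorial itself: the function names (`normal_pdf`, `log_likelihood`, `em_step`, `fit_mixture`), the variance floor, and the fixed iteration count are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of i.i.d. data under the mixture."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total += math.log(density)
    return total


def em_step(data, weights, means, variances):
    """One EM iteration; returns updated (weights, means, variances)."""
    k = len(weights)
    # E-step: posterior probability of each component for each data point.
    resp = []
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        norm = sum(joint)
        resp.append([p / norm for p in joint])
    # M-step: responsibility-weighted estimates of each component's parameters.
    new_weights, new_means, new_variances = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean_j = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var_j = sum(r[j] * (x - mean_j) ** 2 for r, x in zip(resp, data)) / nj
        new_weights.append(nj / len(data))
        new_means.append(mean_j)
        new_variances.append(max(var_j, 1e-6))  # assumed floor to avoid collapse
    return new_weights, new_means, new_variances


def fit_mixture(data, weights, means, variances, iterations=50):
    """Run a fixed number of EM steps and record the log-likelihood after each."""
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history
```

Calling `fit_mixture(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])` on samples drawn from two well-separated Gaussians should give a log-likelihood history that never decreases, which is the basic property EM is known for.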
How many clusters? Which clustering method? Answers via model-based cluster analysis
 THE COMPUTER JOURNAL
, 1998
Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm
 IEEE Transactions on Medical Imaging
, 2001
Cited by 256 (10 self)
The finite mixture (FM) model is the most commonly used model for statistical segmentation of brain magnetic resonance (MR) images because of its simple mathematical form and the piecewise constant nature of ideal brain MR images. However, being a histogram-based model, the FM has an intrinsic limitation: no spatial information is taken into account. This causes the FM model to work only on well-defined images with low levels of noise; unfortunately, this is often not the case due to artifacts such as partial volume effect and bias field distortion. Under these conditions, FM model-based methods produce unreliable results. In this paper, we propose a novel hidden Markov random field (HMRF) model, which is a stochastic process generated by a MRF whose state sequence cannot be observed directly but which can be indirectly estimated through observations. Mathematically, it can be shown that the FM model is a degenerate version of the HMRF model. The advantage of the HMRF model derives from the way in which the spatial information is encoded through the mutual influences of neighboring sites. Although MRF modeling has been employed in MR image segmentation by other researchers, most reported methods are limited to using MRF as a general prior in an FM model-based approach. To fit the HMRF model, an EM algorithm is used. We show that by incorporating both the HMRF model and the EM algorithm into a HMRF-EM framework, an accurate and robust segmentation can be achieved. More importantly, the HMRF-EM framework can easily be combined with other techniques. As an example, we show how the bias field correction algorithm of Guillemaud and Brady (1997) can be incorporated into this framework to achieve a three-dimensional fully automated approach for brain MR image segmentation. Index Terms: Bias field correction, expectation-maximization, hidden Markov random field, MRI, segmentation.
An EM Algorithm for Wavelet-Based Image Restoration
, 2002
Cited by 236 (21 self)
This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low complexity, expressed in terms of the wavelet coefficients, taking advantage of the well known sparsity of wavelet representations. Previous works have investigated wavelet-based restoration but, except for certain special cases, the resulting criteria are solved approximately or require very demanding optimization methods. The EM algorithm herein proposed combines the efficient image representation offered by the discrete wavelet transform (DWT) with the diagonalization of the convolution operator obtained in the Fourier domain. The algorithm alternates between an E-step based on the fast Fourier transform (FFT) and a DWT-based M-step, resulting in an efficient iterative process requiring O(N log N) operations per iteration. Thus, it is the first image restoration algorithm that optimizes a wavelet-based penalized likelihood criterion and has computational complexity comparable to that of standard wavelet denoising or frequency domain deconvolution methods. The convergence behavior of the algorithm is investigated, and it is shown that under mild conditions the algorithm converges to a globally optimal restoration. Moreover, our new approach outperforms several of the best existing methods in benchmark tests, and in some cases is also much less computationally demanding.
A Multiscale Random Field Model for Bayesian Image Segmentation
, 1996
Cited by 234 (18 self)
Many approaches to Bayesian image segmentation have used maximum a posteriori (MAP) estimation in conjunction with Markov random fields (MRF). While this approach performs well, it has a number of disadvantages. In particular, exact MAP estimates cannot be computed, approximate MAP estimates are computationally expensive to compute, and unsupervised parameter estimation of the MRF is difficult. In this paper, we propose a new approach to Bayesian image segmentation which directly addresses these problems. The new method replaces the MRF model with a novel multiscale random field (MSRF), and replaces the MAP estimator with a sequential MAP (SMAP) estimator derived from a novel estimation criterion. Together, the proposed estimator and model result in a segmentation algorithm which is not iterative and can be computed in time proportional to MN, where M is the number of classes and N is the number of pixels. We also develop a computationally efficient method for unsupervised estimation of m...
On Convergence Properties of the EM Algorithm for Gaussian Mixtures
 Neural Computation
, 1995
Cited by 144 (13 self)
We build up the mathematical connection between the "Expectation-Maximization" (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix P, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of P and provide new results analyzing the effect that P has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of Gaussian mixture models.
Automated model-based tissue classification of MR images of the brain
, 1999
Cited by 143 (13 self)
We describe a fully automated method for model-based tissue classification of Magnetic Resonance (MR) images of the brain. The method interleaves classification with estimation of the model parameters, improving the classification at each iteration. The algorithm is able to segment single and multi-spectral MR images, corrects for MR signal inhomogeneities and incorporates contextual information by means of Markov Random Fields. A digital brain atlas containing prior expectations about the spatial location of tissue classes is used to initialize the algorithm. This makes the method fully automated and therefore provides objective and reproducible segmentations. We have validated the technique on simulated as well as on real MR images of the brain.
Continuous Probabilistic Transform for Voice Conversion
 IEEE Transactions on Speech and Audio Processing
, 1998
Cited by 129 (4 self)
Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic + noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.
Simulating ratios of normalizing constants via a simple identity: A theoretical exploration
 Statistica Sinica
, 1996
Cited by 111 (4 self)
Let $p_i(w)$, $i = 1, 2$, be two densities with common support where each density is known up to a normalizing constant: $p_i(w) = q_i(w)/c_i$. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, $c_1/c_2$. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in fields such as physics and genetics. Many methods proposed in statistical and other literature (e.g., computational physics) for dealing with this problem are based on various special cases of the following simple identity: $\frac{c_1}{c_2} = \frac{E_2[q_1(w)\alpha(w)]}{E_1[q_2(w)\alpha(w)]}$. Here $E_i$ denotes the expectation with respect to $p_i$ ($i = 1, 2$), and $\alpha$ is an arbitrary function such that the denominator is nonzero. A main purpose of this paper is to provide a theoretical study of the usefulness of this identity, with focus on (asymptotically) optimal and practical choices of $\alpha$. Using a simple but informative example, we demonstrate that with sensible (not necessarily optimal) choices of $\alpha$, we can reduce the simulation error by orders of magnitude when compared to the conventional importance sampling method, which corresponds to $\alpha = 1/q_2$. We also introduce several generalizations of this identity for handling more complicated settings (e.g., estimating several ratios simultaneously) and pose several open problems that appear to have practical as well as theoretical value. Furthermore, we discuss related theoretical and empirical work.
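The identity in this abstract can be checked numerically with a small Monte Carlo sketch. The setup below is an assumption chosen for illustration, not the paper's example: q1 is an unnormalized N(0, 1) density and q2 an unnormalized N(1, 2²) density, so the true ratio c1/c2 is 0.5. The helper names (`ratio_estimate`, `alpha_importance`, `alpha_geometric`) are hypothetical, and the geometric-mean choice of alpha is shown only as one alternative to the importance sampling choice alpha = 1/q2 named in the abstract.

```python
import math


def q1(w):
    """Unnormalized N(0, 1) density; its normalizing constant is sqrt(2*pi)."""
    return math.exp(-0.5 * w * w)


def q2(w):
    """Unnormalized N(1, 2^2) density; its normalizing constant is 2*sqrt(2*pi)."""
    return math.exp(-0.5 * ((w - 1.0) / 2.0) ** 2)


def ratio_estimate(draws1, draws2, alpha):
    """Estimate c1/c2 as E2[q1*alpha] / E1[q2*alpha] using sample means.

    draws1 are samples from p1 = q1/c1 and draws2 are samples from p2 = q2/c2.
    """
    numerator = sum(q1(w) * alpha(w) for w in draws2) / len(draws2)
    denominator = sum(q2(w) * alpha(w) for w in draws1) / len(draws1)
    return numerator / denominator


def alpha_importance(w):
    """alpha = 1/q2, the importance sampling special case from the abstract."""
    return 1.0 / q2(w)


def alpha_geometric(w):
    """alpha = 1/sqrt(q1*q2), an illustrative alternative choice."""
    return 1.0 / math.sqrt(q1(w) * q2(w))
```

With draws generated by `random.gauss(0.0, 1.0)` and `random.gauss(1.0, 2.0)`, both choices of `alpha` should give estimates close to 0.5 for a few thousand samples; comparing their spread over repeated runs is one way to see how much the choice of alpha matters.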
A New TwIST: Two-Step Iterative Shrinkage/Thresholding Algorithms for Image Restoration
 IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2007
Cited by 98 (19 self)
Iterative shrinkage/thresholding (IST) algorithms have been recently proposed to handle a class of convex unconstrained optimization problems arising in image restoration and other linear inverse problems. This class of problems results from combining a linear observation model with a nonquadratic regularizer (e.g., total variation or wavelet-based regularization). It happens that the convergence rate of these IST algorithms depends heavily on the linear observation operator, becoming very slow when this operator is ill-conditioned or ill-posed. In this paper, we introduce two-step IST (TwIST) algorithms, exhibiting much faster convergence rate than IST for ill-conditioned problems. For a vast class of nonquadratic convex regularizers ($\ell^p$ norms, some Besov norms, and total variation), we show that TwIST converges to a minimizer of the objective function, for a given range of values of its parameters. For noninvertible observation operators, we introduce a monotonic version of TwIST (MTwIST); although the convergence proof does not apply to this scenario, we give experimental evidence that MTwIST exhibits similar speed gains over IST. The effectiveness of the new methods is experimentally confirmed on problems of image deconvolution and of restoration with missing samples.
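For orientation, here is a minimal sketch of the basic one-step IST iteration that this abstract takes as its starting point, applied to the objective 0.5*||y - Ax||^2 + lam*||x||_1. It does not implement TwIST or MTwIST, whose two-step update and parameters are not given in the abstract. The function names, the l1 regularizer, and the step size (the reciprocal of the squared Frobenius norm of A, a conservative bound) are assumptions made for this example.

```python
def matvec(A, x):
    """Compute A @ x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T @ r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft(v, t):
    """Soft-thresholding: shrink each entry toward zero by t."""
    return [x - t if x > t else x + t if x < -t else 0.0 for x in v]


def objective(A, x, y, lam):
    """Value of 0.5*||y - Ax||^2 + lam*||x||_1."""
    residual = [yi - axi for yi, axi in zip(y, matvec(A, x))]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xi) for xi in x)


def ist(A, y, lam, iterations=500):
    """Plain IST: a gradient step on the data term, then soft-thresholding."""
    # Assumed step size: 1 / ||A||_F^2, which never exceeds 1 / ||A||_2^2.
    tau = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    history = []
    for _ in range(iterations):
        residual = [yi - axi for yi, axi in zip(y, matvec(A, x))]
        stepped = [xi + tau * g for xi, g in zip(x, rmatvec(A, residual))]
        x = soft(stepped, tau * lam)
        history.append(objective(A, x, y, lam))
    return x, history
```

With this step size the recorded objective values should be non-increasing. The convergence rate is governed by the conditioning of `A`, which is the slowdown on ill-conditioned operators that motivates the two-step variants described above.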