Results 1 - 10
of
185
A gentle tutorial on the EM algorithm and its application to parameter estimation for gaussian mixture and hidden markov models
, 1997
"... We describe the maximum-likelihood parameter estimation problem and how the Expectation-form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) fi ..."
Abstract
-
Cited by 328 (4 self)
- Add to MetaCart
We describe the maximum-likelihood parameter estimation problem and how the Expectation-form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor. ii 1 Maximum-likelihood Recall the definition of the maximum-likelihood estimation problem. We have a density function ¢¡¤£¦ ¥ §© ¨ that is governed by the set of parameters § (e.g., might be a set of Gaussians and § could be the means and covariances). We also have a data set of size � , supposedly drawn from this distribution, i.e., ���� � £�������������£��© �. That is, we assume that these data vectors are independent and
How many clusters? Which clustering method? Answers via model-based cluster analysis
- THE COMPUTER JOURNAL
, 1998
"... ..."
A Multiscale Random Field Model for Bayesian Image Segmentation
, 1996
"... Many approaches to Bayesian image segmentation have used maximum a posteriori (MAP) estimation in conjunction with Markov random fields (MRF). While this approach performs well, it has a number of disadvantages. In particular, exact MAP estimates cannot be computed, approximate MAP estimates are com ..."
Abstract
-
Cited by 199 (19 self)
- Add to MetaCart
Many approaches to Bayesian image segmentation have used maximum a posteriori (MAP) estimation in conjunction with Markov random fields (MRF). While this approach performs well, it has a number of disadvantages. In particular, exact MAP estimates cannot be computed, approximate MAP estimates are computationally expensive to compute, and unsupervised parameter estimation of the MRF is difficult. In this paper, we propose a new approach to Bayesian image segmentation which directly addresses these problems. The new method replaces the MRF model with a novel multiscale random field (MSRF), and replaces the MAP estimator with a sequential MAP (SMAP) estimator derived from a novel estimation criteria. Together, the proposed estimator and model result in a segmentation algorithm which is not iterative and can be computed in time proportional to MN where M is the number of classes and N is the number of pixels. We also develop a computationally effcient method for unsupervised estimation of m...
Segmentation of brain MR images through a hidden Markov random field model and the expectationmaximization algorithm
- IEEE Transactions on Medical. Imaging
, 2001
"... Abstract—The finite mixture (FM) model is the most commonly used model for statistical segmentation of brain magnetic resonance (MR) images because of its simple mathematical form and the piecewise constant nature of ideal brain MR images. However, being a histogram-based model, the FM has an intrin ..."
Abstract
-
Cited by 151 (8 self)
- Add to MetaCart
Abstract—The finite mixture (FM) model is the most commonly used model for statistical segmentation of brain magnetic resonance (MR) images because of its simple mathematical form and the piecewise constant nature of ideal brain MR images. However, being a histogram-based model, the FM has an intrinsic limitation—no spatial information is taken into account. This causes the FM model to work only on well-defined images with low levels of noise; unfortunately, this is often not the the case due to artifacts such as partial volume effect and bias field distortion. Under these conditions, FM model-based methods produce unreliable results. In this paper, we propose a novel hidden Markov random field (HMRF) model, which is a stochastic process generated by a MRF whose state sequence cannot be observed directly but which can be indirectly estimated through observations. Mathematically, it can be shown that the FM model is a degenerate version of the HMRF model. The advantage of the HMRF model derives from the way in which the spatial information is encoded through the mutual influences of neighboring sites. Although MRF modeling has been employed in MR image segmentation by other researchers, most reported methods are limited to using MRF as a general prior in an FM model-based approach. To fit the HMRF model, an EM algorithm is used. We show that by incorporating both the HMRF model and the EM algorithm into a HMRF-EM framework, an accurate and robust segmentation can be achieved. More importantly, the HMRF-EM framework can easily be combined with other techniques. As an example, we show how the bias field correction algorithm of Guillemaud and Brady (1997) can be incorporated into this framework to achieve a three-dimensional fully automated approach for brain MR image segmentation. Index Terms—Bias field correction, expectation-maximization, hidden Markov random field, MRI, segmentation. I.
An EM Algorithm for Wavelet-Based Image Restoration
, 2002
"... This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low-complexity, expressed in terms of the wavelet coecients, taking a ..."
Abstract
-
Cited by 149 (14 self)
- Add to MetaCart
This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low-complexity, expressed in terms of the wavelet coecients, taking advantage of the well known sparsity of wavelet representations. Previous works have investigated wavelet-based restoration but, except for certain special cases, the resulting criteria are solved approximately or require very demanding optimization methods. The EM algorithm herein proposed combines the efficient image representation oered by the discrete wavelet transform (DWT) with the diagonalization of the convolution operator obtained in the Fourier domain. The algorithm alternates between an E-step based on the fast Fourier transform (FFT) and a DWT-based M-step, resulting in an ecient iterative process requiring O(N log N) operations per iteration. Thus, it is the rst image restoration algorithm that optimizes a wavelet-based penalized likelihood criterion and has computational complexity comparable to that of standard wavelet denoising or frequency domain deconvolution methods. The convergence behavior of the algorithm is investigated, and it is shown that under mild conditions the algorithm converges to a globally optimal restoration. Moreover, our new approach outperforms several of the best existing methods in benchmark tests, and in some cases is also much less computationally demanding.
On Convergence Properties of the EM Algorithm for Gaussian Mixtures
- Neural Computation
, 1995
"... We build up the mathematical connection between the "Expectation-Maximization" (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix P,andwe provide ..."
Abstract
-
Cited by 115 (12 self)
- Add to MetaCart
We build up the mathematical connection between the "Expectation-Maximization" (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix P,andwe provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of P and provide new results analyzing the effect that P has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of Gaussian mixture models.
Convergence results for the EM Approach to Mixtures of Experts Architectures
- NEURAL NETWORKS
, 1995
"... The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architectur ..."
Abstract
-
Cited by 89 (6 self)
- Add to MetaCart
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm with its searching direction having a positive projection on the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.
Continuous Probabilistic Transform for Voice Conversion
- IEEE Transactions on Speech and Audio Processing
, 1998
"... Abstract — Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the r ..."
Abstract
-
Cited by 78 (3 self)
- Add to MetaCart
Abstract — Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic C noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previous proposed conversion methods. I.
Simulating ratios of normalizing constants via a simple identity: A theoretical exploration
- Statistica Sinica
, 1996
"... Abstract: Let pi(w),i =1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) =qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. ..."
Abstract
-
Cited by 76 (3 self)
- Add to MetaCart
Abstract: Let pi(w),i =1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) =qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in fields such as physics and genetics. Many methods proposed in statistical and other literature (e.g., computational physics) for dealing with this problem are based on various special cases of the following simple identity: c1 c2 = E2[q1(w)α(w)] E1[q2(w)α(w)]. Here Ei denotes the expectation with respect to pi (i =1, 2), and α is an arbitrary function such that the denominator is non-zero. A main purpose of this paper is to provide a theoretical study of the usefulness of this identity, with focus on (asymptotically) optimal and practical choices of α. Using a simple but informative example, we demonstrate that with sensible (not necessarily optimal) choices of α, we can reduce the simulation error by orders of magnitude when compared to the conventional importance sampling method, which corresponds to α =1/q2. We also introduce several generalizations of this identity for handling more complicated settings (e.g., estimating several ratios simultaneously) and pose several open problems that appear to have practical as well as theoretical value. Furthermore, we discuss related theoretical and empirical work.
Automated model-based bias field correction of MR images of the brain
- IEEE TRANSACTIONS ON MEDICAL IMAGING
, 1999
"... ..."

