Results 1  10
of
103
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
"... Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finite ..."
Abstract

Cited by 170 (3 self)
 Add to MetaCart
Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finitestate finitealphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximumlikelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms—Baum–Petrie algorithm, entropy ergodic theorems, finitestate channels, hidden Markov models, identifiability, Kalman filter, maximumlikelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality. I.
SpaceAlternating Generalized ExpectationMaximization Algorithm
 IEEE Trans. Signal Processing
, 1994
"... The expectationmaximization (EM) method can facilitate maximizing likelihood functions that arise in statistical estimation problems. In the classical EM paradigm, one iteratively maximizes the conditional loglikelihood of a single unobservable complete data space, rather than maximizing the intra ..."
Abstract

Cited by 127 (26 self)
 Add to MetaCart
The expectationmaximization (EM) method can facilitate maximizing likelihood functions that arise in statistical estimation problems. In the classical EM paradigm, one iteratively maximizes the conditional loglikelihood of a single unobservable complete data space, rather than maximizing the intractable likelihood function for the measured or incomplete data. EM algorithms update all parameters simultaneously, which has two drawbacks: 1) slow convergence, and 2) difficult maximization steps due to coupling when smoothness penalties are used. This paper describes the spacealternating generalized EM (SAGE) method, which updates the parameters sequentially by alternating between several small hiddendata spaces defined by the algorithm designer. We prove that the sequence of estimates monotonically increases the penalizedlikelihood objective, we derive asymptotic convergence rates, and we provide sufficient conditions for monotone convergence in norm. Two signal processing applicatio...
Convergence of a stochastic approximation version of the EM algorithm
, 1997
"... The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions... ..."
Abstract

Cited by 86 (8 self)
 Add to MetaCart
The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions...
On the Convergence of Monte Carlo Maximum Likelihood Calculations
 Journal of the Royal Statistical Society B
, 1992
"... Monte Carlo maximum likelihood for normalized families of distributions (Geyer and Thompson, 1992) can be used for an extremely broad class of models. Given any family f h ` : ` 2 \Theta g of nonnegative integrable functions, maximum likelihood estimates in the family obtained by normalizing the the ..."
Abstract

Cited by 59 (3 self)
 Add to MetaCart
Monte Carlo maximum likelihood for normalized families of distributions (Geyer and Thompson, 1992) can be used for an extremely broad class of models. Given any family f h ` : ` 2 \Theta g of nonnegative integrable functions, maximum likelihood estimates in the family obtained by normalizing the the functions to integrate to one can be approximated by Monte Carlo, the only regularity conditions being a compactification of the parameter space such that the the evaluation maps ` 7! h ` (x) remain continuous. Then with probability one the Monte Carlo approximant to the log likelihood hypoconverges to the exact log likelihood, its maximizer converges to the exact maximum likelihood estimate, approximations to profile likelihoods hypoconverge to the exact profile, and level sets of the approximate likelihood (support regions) converge to the exact sets (in Painlev'eKuratowski set convergence). The same results hold when there are missing data (Thompson and Guo, 1991, Gelfand and Carlin, 19...
Exact and computationally efficient likelihoodbased estimation for discretely observed diffusion processes
 Journal of the Royal Statistical Society, Series B: Statistical Methodology
, 2006
"... The objective of this paper is to present a novel methodology for likelihoodbased inference for discretely observed diffusions. We propose Monte Carlo methods, which build on recent advances on the exact simulation of diffusions, for performing maximum likelihood and Bayesian estimation. ..."
Abstract

Cited by 58 (12 self)
 Add to MetaCart
The objective of this paper is to present a novel methodology for likelihoodbased inference for discretely observed diffusions. We propose Monte Carlo methods, which build on recent advances on the exact simulation of diffusions, for performing maximum likelihood and Bayesian estimation.
Mixtures of gpriors for Bayesian variable selection
 Journal of the American Statistical Association
, 2008
"... Zellner’s gprior remains a popular conventional prior for use in Bayesian variable selection, despite several undesirable consistency issues. In this paper, we study mixtures of gpriors as an alternative to default gpriors that resolve many of the problems with the original formulation, while mai ..."
Abstract

Cited by 36 (4 self)
 Add to MetaCart
Zellner’s gprior remains a popular conventional prior for use in Bayesian variable selection, despite several undesirable consistency issues. In this paper, we study mixtures of gpriors as an alternative to default gpriors that resolve many of the problems with the original formulation, while maintaining the computational tractability that has made the gprior so popular. We present theoretical properties of the mixture gpriors and provide real and simulated examples to compare the mixture formulation with fixed gpriors, Empirical Bayes approaches and other default procedures.
Accelerating EM for large databases
 Machine Learning
, 2001
"... The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires signi cant computational resources and has been dismissed as impractical for large databases. We presenttwo approaches that signi cantly reduce the ..."
Abstract

Cited by 35 (1 self)
 Add to MetaCart
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires signi cant computational resources and has been dismissed as impractical for large databases. We presenttwo approaches that signi cantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial Esteps for which we can use the results of Neal and Hinton (1998) to obtain the standard convergence guarantees of EM. The rst approach is a version of the incremental EM, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically e ects the e ciency of the algorithm. We provide a method for selecting a near optimal block size. The second approach, which we call lazy EM, will, at scheduled iterations, evaluate the signi cance of each data case and then proceed for several iterations actively using only the signi cant cases. We demonstrate that both methods can signi cantly reduce computational costs through their application to highdimensional realworld and synthetic mixture modeling problems for large databases. Keywords: Expectation Maximization Algorithm, incremental EM, lazy EM, online EM, data blocking, mixture models, clustering.
Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime
 ANN. STATIST
, 2004
"... An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this ..."
Abstract

Cited by 34 (6 self)
 Add to MetaCart
An autoregressive process with Markov regime is an autoregressive process for which the regression function at each time point is given by a nonobservable Markov chain. In this paper we consider the asymptotic properties of the maximum likelihood estimator in a possibly nonstationary process of this kind for which the hidden state space is compact but not necessarily finite. Consistency and asymptotic normality are shown to follow from uniform exponential forgetting of the initial distribution for the hidden Markov chain conditional on the observations.
A Componentwise EM Algorithm for Mixtures
, 1999
"... In some situations, EM algorithm shows slow convergence problems. One possible reason is that standard procedures update the parameters simultaneously. In this paper we focus on nite mixture estimation. In this framework, we propose a componentwise EM, which updates the parameters sequentially. We ..."
Abstract

Cited by 31 (2 self)
 Add to MetaCart
In some situations, EM algorithm shows slow convergence problems. One possible reason is that standard procedures update the parameters simultaneously. In this paper we focus on nite mixture estimation. In this framework, we propose a componentwise EM, which updates the parameters sequentially. We give an interpretation of this procedure as a proximal point algorithm and use it to prove the convergence. Illustrative numerical experiments show how our algorithm compares to EM and a version of the SAGE algorithm.
Accelerated Quantification of Bayesian Networks with Incomplete Data
 In Proceedings of First International Conference on Knowledge Discovery and Data Mining
, 1995
"... Probabilistic expert systems based on Bayesian networks (BNs) require initial specification of both a qualitative graphical structure and quantitative assessment of conditional probability tables. This paper considers statistical batch learning of the probability tables on the basis of incomple ..."
Abstract

Cited by 28 (2 self)
 Add to MetaCart
Probabilistic expert systems based on Bayesian networks (BNs) require initial specification of both a qualitative graphical structure and quantitative assessment of conditional probability tables. This paper considers statistical batch learning of the probability tables on the basis of incomplete data and expert knowledge. The EM algorithm with a generalized conjugate gradient acceleration method has been dedicated to quantification of BNs by maximum posterior likelihood estimation for a superclass of the recursive graphical models. This new class of models allows a great variety of local functional restrictions to be imposed on the statistical model, which hereby extents the control and applicability of the constructed method for quantifying BNs. Introduction The construction of probabilistic expert systems (Pearl 1988, Andreassen et al. 1989) based on Bayesian networks (BNs) is often a challenging process. It is typically divided into two parts: First the constructi...