Results 1-10 of 25
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
Cited by 174 (3 self)
An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms: Baum–Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
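The definition in this abstract (a finite-state Markov chain observed through a memoryless channel) can be illustrated with a minimal sampling sketch. The function name `sample_hmp` and the parameter names `init`, `trans`, and `emit` are hypothetical and do not come from the paper; the sketch assumes a finite observation alphabet indexed by integers.

```python
import random

def sample_hmp(n, init, trans, emit, seed=0):
    """Draw n (state, observation) pairs from a finite-state, finite-alphabet HMP.

    init[i]     : P(S_1 = i)                 (assumed parameter name)
    trans[i][j] : P(S_{t+1} = j | S_t = i)   (homogeneous Markov chain)
    emit[i][y]  : P(Y_t = y | S_t = i)       (memoryless, time-invariant channel)
    """
    rng = random.Random(seed)
    states, obs = [], []
    s = rng.choices(range(len(init)), weights=init)[0]
    for _ in range(n):
        states.append(s)
        # The observation depends only on the current hidden state.
        obs.append(rng.choices(range(len(emit[s])), weights=emit[s])[0])
        # The hidden state evolves as a first-order Markov chain.
        s = rng.choices(range(len(trans[s])), weights=trans[s])[0]
    return states, obs
```

With an identity emission matrix the channel is noiseless and the observations coincide with the hidden states, which is a quick sanity check on the sketch.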
The consistency of the BIC Markov order estimator.
Cited by 56 (3 self)
The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2, \ldots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$th-order maximum likelihood and the penalty term $\frac{|A|^k(|A|-1)}{2}\log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as $n \to \infty$, thereby strengthening earlier consistency results that assumed an a priori bound on the order. A key tool is a strong ratio-typicality result for Markov sample paths. We also show that the Bayesian estimator or minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. process. AMS 1991 subject classification: Primary 62F12, 62M05; Secondary 62F13, 60J10. Key words and phrases: Bayesian Information Criterion, order estimation, ratio-typicality, Markov chains. Supported in part by a joint N...
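The criterion described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, the likelihood conditions on the first k symbols (a simplifying assumption), and the candidate range `max_k` is only a computational convenience, whereas the paper's result does not assume a prior bound on the order.

```python
import math
from collections import Counter

def max_log_likelihood(x, k):
    """Maximized log-likelihood of x under a k-th order Markov model,
    conditioning on the first k symbols (simplifying assumption)."""
    ctx = Counter()    # counts of each length-k context
    trans = Counter()  # counts of (context, next symbol) pairs
    for i in range(k, len(x)):
        c = tuple(x[i - k:i])
        ctx[c] += 1
        trans[(c, x[i])] += 1
    # The ML transition probabilities are the empirical conditional frequencies.
    return sum(m * math.log(m / ctx[c]) for (c, _), m in trans.items())

def bic_order(x, alphabet_size, max_k):
    """Return the k in 0..max_k minimizing -log ML_k + |A|^k (|A|-1)/2 * log n."""
    n = len(x)

    def score(k):
        penalty = alphabet_size ** k * (alphabet_size - 1) / 2 * math.log(n)
        return -max_log_likelihood(x, k) + penalty

    return min(range(max_k + 1), key=score)
```

For example, a strictly alternating binary sequence is perfectly predicted by a first-order model, so the penalty makes order 1 preferable to both order 0 and higher orders.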
Nonparametric Statistical Inference for Ergodic Processes
Cited by 14 (14 self)
In this work a method for statistical analysis of time series is proposed, which is used to obtain solutions to some classical problems of mathematical statistics under the only assumption that the process generating the data is stationary ergodic. Namely, three problems are considered: goodness-of-fit (or identity) testing, process classification, and the change point problem. For each of the problems a test is constructed that is asymptotically accurate for the case when the data is generated by stationary ergodic processes. The tests are based on empirical estimates of distributional distance. Index Terms: Nonparametric hypothesis testing, stationary ergodic processes, goodness-of-fit test, process classification, change point problem.
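The abstract does not define the distributional distance, so the following is only a rough sketch under stated assumptions: the distance is taken to be a weighted sum of absolute differences between the probabilities of finite words, the weights are assumed to be 2^(-k) for words of length k, and the sum is truncated at a hypothetical `max_len`. The empirical estimate replaces probabilities with sliding-window frequencies.

```python
from collections import Counter

def word_freqs(x, k):
    """Empirical frequencies of all length-k words in x (sliding window)."""
    n = len(x) - k + 1
    counts = Counter(tuple(x[i:i + k]) for i in range(n))
    return {w: c / n for w, c in counts.items()}

def empirical_distance(x, y, max_len):
    """Truncated, weighted sum of frequency differences between two samples.

    Weights 2**(-k) and the cutoff max_len are illustrative assumptions;
    both samples must be at least max_len symbols long.
    """
    total = 0.0
    for k in range(1, max_len + 1):
        fx, fy = word_freqs(x, k), word_freqs(y, k)
        diff = sum(abs(fx.get(w, 0.0) - fy.get(w, 0.0)) for w in set(fx) | set(fy))
        total += 2.0 ** (-k) * diff
    return total
```

A test in this spirit would compare such an estimate against a threshold; the choice of threshold is not specified in the abstract and is left out of the sketch.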
Testing composite hypotheses about discrete-valued stationary processes
In ITW: 291–295
"... processes ..."
On hypotheses testing for ergodic processes
In Proceedings of Information Theory Workshop (2008
, 1998
Cited by 11 (11 self)
We propose a method for statistical analysis of time series that allows us to obtain solutions to some classical problems of mathematical statistics under the only assumption that the process generating the data is stationary ergodic. Namely, we consider three problems: goodness-of-fit (or identity) testing, process classification, and the change point problem. For each of the problems we construct a test that is asymptotically accurate for the case when the data is generated by stationary ergodic processes. The tests are based on empirical estimates of distributional distance.
NUMBER OF HIDDEN STATES AND MEMORY: A JOINT ORDER ESTIMATION PROBLEM FOR MARKOV CHAINS WITH MARKOV REGIME
Cited by 4 (3 self)
Abstract. This paper deals with order identification for Markov chains with Markov regime (MCMR) in the context of finite alphabets. We define the joint order of an MCMR process in terms of the number k of states of the hidden Markov chain and the memory m of the conditional Markov chain. We study the properties of penalized maximum likelihood estimators for the unknown order (k, m) of an observed MCMR process, relying on information-theoretic arguments. The novelty of our work lies in the joint estimation of two structural parameters. Furthermore, the different models in competition are not nested. In an asymptotic framework, we prove that a penalized maximum likelihood estimator is strongly consistent without prior bounds on k and m. We complement our theoretical work with a simulation study of its behaviour. We also study numerically the behaviour of the BIC criterion. A theoretical proof of its consistency seems to us presently out of reach for MCMR, as such a result does not yet exist in the simpler case where m = 0 (that is, for hidden Markov models). Résumé (translated from French). This work deals with identifying the order of a Markov chain with Markov regime (MCMR) over a finite alphabet. The order of an MCMR is defined as the pair (k, m), where k is the number of states of the hidden chain and m is the memory of the conditional Markov chain. We study penalized maximum likelihood estimators using techniques from ...
Discrete Universal Filtering via Hidden Markov Modelling
Cited by 3 (2 self)
We consider the discrete universal filtering problem, where the components of a discrete signal emitted by an unknown source and corrupted by a known DMC are to be causally estimated. We derive a family of filters which we show to be universally asymptotically optimal in the sense of achieving the optimum filtering performance when the clean signal is stationary, ergodic, and satisfies an additional mild positivity condition. Our schemes are based on approximating the noisy signal by a hidden Markov process (HMP) via maximum-likelihood (ML) estimation, followed by use of the well-known forward recursions for HMP state estimation. We show that as the data length increases, and as the number of states in the HMP approximation increases, our family of filters attains the performance of the optimal distribution-dependent filter.
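The forward recursions mentioned in this abstract can be sketched for a finite-state, finite-alphabet HMP as below. This shows only the generic state-filtering step, not the paper's universal filter; the function and parameter names are hypothetical, the model parameters are assumed known (in the paper they come from ML estimation), and every observation is assumed to have positive probability under the model.

```python
def forward_filter(obs, init, trans, emit):
    """Return the filtering distributions P(S_t = i | y_1..y_t) for each t.

    init[i]     : P(S_1 = i)
    trans[i][j] : P(S_{t+1} = j | S_t = i)
    emit[i][y]  : P(Y_t = y | S_t = i)
    """
    k = len(init)
    filtered = []
    pred = list(init)  # one-step prediction P(S_t | y_1..y_{t-1})
    for y in obs:
        # Measurement update: weight the prediction by the observation likelihood.
        unnorm = [pred[i] * emit[i][y] for i in range(k)]
        total = sum(unnorm)  # assumed positive (observation possible under model)
        post = [u / total for u in unnorm]
        filtered.append(post)
        # Time update: propagate the posterior through the transition matrix.
        pred = [sum(post[i] * trans[i][j] for i in range(k)) for j in range(k)]
    return filtered
```

Normalizing at every step keeps the recursion numerically stable on long sequences, which is why the sketch tracks conditional distributions rather than joint probabilities.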
ON THE MINIMAL PENALTY FOR MARKOV ORDER ESTIMATION
, 908
Cited by 1 (0 self)
We show that large-scale typicality of Markov sample paths implies that the likelihood ratio statistic satisfies a law of iterated logarithm uniformly to the same scale. As a consequence, the penalized likelihood Markov order estimator is strongly consistent for penalties growing as slowly as log log n when an upper bound is imposed on the order which may grow as rapidly as log n. Our method of proof, using techniques from empirical process theory, does not rely on the explicit expression for the maximum likelihood estimator in the Markov case and could therefore be applicable in other settings.
Testing statistical hypotheses about ergodic processes
 Eprint, arxiv.org
, 2008
Cited by 1 (0 self)
We propose a method for statistical analysis of time series that allows us to obtain solutions to some classical problems of mathematical statistics under the only assumption that the process generating the data is stationary ergodic. Namely, we consider three problems: goodness-of-fit (or identity) testing, process classification, and the change point problem. For each of the problems we construct a test that is asymptotically accurate for the case when the data is generated by stationary ergodic processes. The tests are based on empirical estimates of distributional distance.
Joint universal lossy coding and identification of stationary mixing sources with general alphabets
IEEE Trans. Inform. Theory
Cited by 1 (1 self)
We consider the problem of joint universal variable-rate lossy coding and identification for parametric classes of stationary β-mixing sources with general (Polish) alphabets. Compression performance is measured in terms of Lagrangians, while identification performance is measured by the variational distance between the true source and the estimated source. Provided that the sources are mixing at a sufficiently fast rate and satisfy certain smoothness and Vapnik–Chervonenkis learnability conditions, it is shown that, for bounded metric distortions, there exist universal schemes for joint lossy compression and identification whose Lagrangian redundancies converge to zero as $\sqrt{V_n \log n / n}$ as the block length $n$ tends to infinity, where $V_n$ is the Vapnik–Chervonenkis dimension of a certain class of decision regions defined by the $n$-dimensional marginal distributions of the sources; furthermore, for each $n$, the decoder can identify the $n$-dimensional marginal of the active source up to a ball of radius $O(\sqrt{V_n \log n / n})$ in variational distance, eventually with probability one. The results are supplemented by several examples of parametric sources satisfying the regularity conditions. Index Terms: Learning, minimum-distance density estimation, two-stage codes, universal vector quantization, Vapnik–Chervonenkis dimension.