## Wavelet-Based Statistical Signal Processing Using Hidden Markov Models (1998)

Citations: 325 (52 self)

### BibTeX

@MISC{Crouse98wavelet-basedstatistical,

author = {Matthew Crouse and Robert Nowak and Richard Baraniuk},

title = {Wavelet-Based Statistical Signal Processing Using Hidden Markov Models},

year = {1998}

}

### Abstract

Wavelet-based statistical signal processing techniques such as denoising and detection typically model the wavelet coefficients as independent or jointly Gaussian. These models are unrealistic for many real-world signals. In this paper, we develop a new framework based on wavelet-domain hidden Markov models (HMMs). The framework enables us to concisely model the statistical dependencies and non-Gaussian statistics often encountered in practice. Wavelet-domain HMMs are designed with the intrinsic properties of the wavelet transform in mind and provide powerful yet tractable probabilistic signal models. Efficient expectation-maximization (EM) algorithms are developed for fitting the HMMs to observational signal data. The new framework is suitable for a wide range of applications, including signal estimation, detection, classification, prediction, and even synthesis. To demonstrate the utility of wavelet-domain HMMs, we develop novel algorithms for signal denoising, classification, and detectio...
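The abstract refers to EM fitting of non-Gaussian wavelet-domain models. As a rough illustration of the simplest ingredient only, the sketch below uses EM to fit a two-state, zero-mean Gaussian mixture to a list of coefficients that are treated as independent. It is not the paper's hidden Markov tree algorithm, which also links the hidden states across scales. The function names, the initialization, the numerical floors, and the iteration count are all assumptions made for this example.

```python
import math
import random


def gaussian_pdf(w, var):
    """Zero-mean Gaussian density with variance `var`, evaluated at `w`."""
    return math.exp(-0.5 * w * w / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_state_mixture(coeffs, n_iter=100):
    """EM fit of p(w) = p_small * N(0, var_small) + (1 - p_small) * N(0, var_large).

    Returns (p_small, var_small, var_large). Coefficients are treated as
    independent, so no cross-scale dependencies are modeled here.
    """
    n = len(coeffs)
    energy = max(sum(w * w for w in coeffs) / n, 1e-12)
    # Assumed initialization: one state below and one above the average energy.
    p_small, var_small, var_large = 0.5, 0.25 * energy, 4.0 * energy
    for _ in range(n_iter):
        # E step: posterior probability that each coefficient is in the "small" state.
        resp = []
        for w in coeffs:
            a = p_small * gaussian_pdf(w, var_small)
            b = (1.0 - p_small) * gaussian_pdf(w, var_large)
            resp.append(a / (a + b) if a + b > 0.0 else 0.0)
        # M step: re-estimate the state probability and both variances.
        mass_small = max(sum(resp), 1e-12)
        mass_large = max(n - sum(resp), 1e-12)
        p_small = min(max(mass_small / n, 1e-6), 1.0 - 1e-6)
        var_small = max(
            sum(r * w * w for r, w in zip(resp, coeffs)) / mass_small, 1e-12)
        var_large = max(
            sum((1.0 - r) * w * w for r, w in zip(resp, coeffs)) / mass_large, 1e-12)
    return p_small, var_small, var_large


def sample_mixture(n, p_small, std_small, std_large, seed=0):
    """Draw synthetic 'coefficients' from a known two-state mixture (demo data only)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std_small if rng.random() < p_small else std_large)
            for _ in range(n)]


if __name__ == "__main__":
    data = sample_mixture(4000, p_small=0.7, std_small=0.1, std_large=2.0)
    print(fit_two_state_mixture(data))
```

On synthetic data drawn from a well-separated mixture, the fitted variances should land near the generating ones. Real wavelet coefficients would replace `sample_mixture`, which exists only to make the sketch runnable.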

### Citations

8134 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...en). Yet, given the values of the states, ML estimation of the model parameters is simple (merely ML estimation of Gaussian means and variances). Therefore, we employ an iterative expectation maximization (EM) approach [31], which jointly estimates both the model parameters and probabilities for the hidden states, given the observed wavelet coefficients. In the context of HMM’s, the EM algorithm is also known as the B... |

7067 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ...est. Hence, very accurate and practical models can be obtained with probabilistic links between the states of only neighboring wavelet coefficients. We will now apply probabilistic graph theory [20], [27], [28] to develop these models. 2) Graph Models for Wavelet Transforms: Probabilistic graphs are useful tools for modeling the local dependencies between a set of random variables [20], [27], [28]. Ro... |

4286 | A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner
- 1989
Citation Context ...r tree [see Fig. 3(b)]. Models of this type, which are commonly referred to as hidden Markov models (HMM’s), have proved tremendously useful in a variety of applications, including speech recognition [18], [19] and artificial intelligence [20]. We will investigate three simple probabilistic graphs with state-to-state connectivities shown in Fig. 3(b). The indepen- 888 IEEE TRANSACTIONS ON SIGNAL PROCE... |

1724 | Ten Lectures on Wavelets
- Daubechies
- 1992
Citation Context ...mic decomposition that represents a one-dimensional (1-D) signal in terms of shifted and dilated versions of a prototype bandpass wavelet function, and shifted versions of a lowpass scaling function [8], [9]. For special choices of the wavelet and scaling functions, the atoms form an orthonormal basis, and we have the signal representation [8], [9] with , and . In this representation, indexes the sc... |

1251 | Embedded image coding using zerotrees of wavelet coefficients
- Shapiro
- 1993
Citation Context ...al and image processing. The wavelet domain provides a natural setting for many applications involving real-world signals, including estimation [1]–[3], detection [4], classification [4], compression [5], prediction and filtering [6], and synthesis [7]. The remarkable properties of the wavelet transform have led to powerful signal processing methods based on simple scalar transformations of individua... |

681 | Adapting to unknown smoothness via wavelet shrinkage
- Donoho, Johnstone
- 1995
Citation Context ...has emerged as an exciting new tool for statistical signal and image processing. The wavelet domain provides a natural setting for many applications involving real-world signals, including estimation [1]–[3], detection [4], classification [4], compression [5], prediction and filtering [6], and synthesis [7]. The remarkable properties of the wavelet transform have led to powerful signal processing met... |

508 | Mixture Densities, Maximum Likelihood and the EM Algorithm
- Redner, Walker
- 1984
Citation Context ...rbitrarily close for densities with a finite number of discontinuities [25]. We can even mix non-Gaussian densities, such as conditional densities belonging to the exponential family of distributions [26]. However, the two-state, zero-mean Gaussian mixture model is simple, robust, and easy-to-use—attractive features for many applications. For purposes of instruction, we will focus on the simple two-st... |

496 | Characterization of signals from multiscale edges
- Mallat, Zhong
- 1992
Citation Context ...lar wavelet coefficient is large/small, then adjacent coefficients are very likely to also be large/small [14]. Persistence: Large/small values of wavelet coefficients tend to propagate across scales [15], [16]. As we see in Fig. 2, these are striking features of the wavelet transform. They have been exploited with great success by the compression community [5], [14]. Our goal is to do the same for si... |

474 | Wavelets and subband coding
- Vetterli, Kovačević
- 1995
Citation Context ...ecomposition that represents a one-dimensional (1-D) signal in terms of shifted and dilated versions of a prototype bandpass wavelet function, and shifted versions of a lowpass scaling function [8], [9]. For special choices of the wavelet and scaling functions, the atoms form an orthonormal basis, and we have the signal representation [8], [9] with , and . In this representation, indexes the scale o... |

391 | Singularity detection and processing with wavelets
- Mallat, Hwang
- 1992
Citation Context ...velet coefficient is large/small, then adjacent coefficients are very likely to also be large/small [14]. Persistence: Large/small values of wavelet coefficients tend to propagate across scales [15], [16]. As we see in Fig. 2, these are striking features of the wavelet transform. They have been exploited with great success by the compression community [5], [14]. Our goal is to do the same for signal p... |

271 | Discrete-Time Processing of Speech Signals
- Deller, Proakis, et al.
- 1994
Citation Context ...ypical autoregressive (AR) signals used in nonlinear classification experiment. (a) Linear AR process (Class I). (b) Linear AR process passed through a mild cubic nonlinearity (Class II). recognition [19], where each signal class is a specific word or utterance. A slightly different approach developed for time-domain HMM’s has been shown to be asymptotically optimal in the Neyman–Pearson sense for two-... |

234 | A multiscale random field model for Bayesian image segmentation
- Bouman, Shapiro
- 1994
Citation Context ...sseville et al. emphasize linear Gaussian models [6]. Wavelet-domain HMM’s are nonlinear and non-Gaussian and do not constrain the wavelet coefficients to be strictly Markov. The multiscale models of [23], which are used for segmentation, have a Markov tree of state variables similar to that of the HMT model. However, these models are applied directly to the signal (which is not tree-structured), rath... |

221 | Signal Processing: Detection, Estimation and Time Series Analysis
- Scharf
- 1991
Citation Context ...alization.) For comparison, we constructed a minimum-probability-of-error quadratic detector under the assumption that the two classes have Gaussian distributions with different means and covariances [36], with the means and covariances estimated from the training data. The quadratic detector is not optimal since the second class is non-Gaussian. In cases where the number of training observations was ... |

201 | Wavelet thresholding via a Bayesian approach
- Abramovich, Sapatinas, et al.
- 1998
Citation Context ...t for practical application in real-world problems. Until recently, wavelet coefficients have been modeled either as jointly Gaussian [4], [6], [10], [11] or as non-Gaussian but independent [2], [3], [12], [13]. Jointly Gaussian models can efficiently capture linear correlations between wavelet coefficients. However, Gaussian models are in conflict with the compression property, which implies that the... |

201 | Noise removal via Bayesian wavelet coring
- Simoncelli, Adelson
- 1996
Citation Context ...practical application in real-world problems. Until recently, wavelet coefficients have been modeled either as jointly Gaussian [4], [6], [10], [11] or as non-Gaussian but independent [2], [3], [12], [13]. Jointly Gaussian models can efficiently capture linear correlations between wavelet coefficients. However, Gaussian models are in conflict with the compression property, which implies that the wavel... |

181 | Wavelet analysis and synthesis of fractional Brownian motion
- Flandrin
- 1992
Citation Context ...des a natural setting for many applications involving real-world signals, including estimation [1]–[3], detection [4], classification [4], compression [5], prediction and filtering [6], and synthesis [7]. The remarkable properties of the wavelet transform have led to powerful signal processing methods based on simple scalar transformations of individual wavelet coefficients. These methods implicitly ... |

173 | Adaptive Bayesian wavelet shrinkage
- Chipman, Kolaczyk, et al.
- 1997
Citation Context ...emerged as an exciting new tool for statistical signal and image processing. The wavelet domain provides a natural setting for many applications involving real-world signals, including estimation [1]–[3], detection [4], classification [4], compression [5], prediction and filtering [6], and synthesis [7]. The remarkable properties of the wavelet transform have led to powerful signal processing methods... |

167 | Probabilistic independence networks for hidden Markov probability models
- Smyth, Heckerman, et al.
- 1997
Citation Context ...type, which are commonly referred to as hidden Markov models (HMM’s), have proved tremendously useful in a variety of applications, including speech recognition [18], [19] and artificial intelligence [20]. We will investigate three simple probabilistic graphs with state-to-state connectivities shown in Fig. 3(b). The indepen- 888 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 46, NO. 4, APRIL 1998 (a) (... |

165 | Universal approximation using radial-basis-function networks
- Park, Sandberg
- 1991
Citation Context ...s, and we can realize any weighted mixture of the M² bivariate Gaussians we desire. Appealing to the approximation capabilities of Gaussian mixtures [25] (analogous to radial basis function networks [30]), as M increases, the bivariate Gaussian mixture is capable of approximating any bivariate parent–child pdf with a finite number of discontinuities to arbitrary precision. Conversely, given the child... |

152 | Image coding based on mixture modeling of wavelet coefficients and a fast estimation-quantization framework - LoPresto, Ramchandran, et al. - 1997 |

94 | Modeling and estimation of multiresolution stochastic processes - Basseville, Benveniste, et al. - 1992 |

92 | Recursive Bayesian estimation using Gaussian sums
- Sorenson, Alspach
- 1971
Citation Context ...zero means in the Gaussian mixing densities. By increasing the number of states and allowing nonzero means, we can make the fit arbitrarily close for densities with a finite number of discontinuities [25]. We can even mix non-Gaussian densities, such as conditional densities belonging to the exponential family of distributions [26]. However, the two-state, zero-mean Gaussian mixture model is simple, r... |

84 | Multiscale representations of Markov random fields
- Luettgen, Karl, et al.
- 1993
Citation Context ...ide variety of data yet concise, tractable, and efficient for practical application in real-world problems. Until recently, wavelet coefficients have been modeled either as jointly Gaussian [4], [6], [10], [11] or as non-Gaussian but independent [2], [3], [12], [13]. Jointly Gaussian models can efficiently capture linear correlations between wavelet coefficients. However, Gaussian models are in confli... |

65 | Detection of Abrupt Changes - Basseville, Nikiforov - 1993 |

65 | The mean field theory in EM procedures for Markov random fields - Zhang - 1992 |

49 | Local discriminant bases
- Saito, Coifman
- 1994
Citation Context ... a binary tree, each node that is not itself a leaf has two children. When viewed in the time–frequency plane as in Fig. 1, a wavelet transform has a natural organization as a forest of binary trees [24]. The tree(s) are rooted at the wavelet coefficients in the coarsest scale (lowest frequency) band; a single scaling coefficient sits above each root. Depending on the length of the signal and the n... |

32 | Jump and sharp cusp detection by wavelets
- Wang
- 1995
Citation Context ...M’s has been shown to be asymptotically optimal in the Neyman–Pearson sense for two-class problems [33]. Several other wavelet-based detection and classification schemes have been proposed [4], [24], [34], [35]. Our purpose is not to provide a comprehensive review of wavelet-based detection algorithms but, rather, to demonstrate the potential of the new wavelet-domain HMM framework for signal detectio... |

29 | Bayesian approach to best basis selection - Pesquet, Krim, et al. - 1996 |

22 | Progressive wavelet image coding based on a conditional probability model
- Buccigrossi, Simoncelli
- 1997
Citation Context ...es between wavelet coefficients. Recently, new compression algorithms have been developed that combine the idea of exploiting dependencies with probabilistic models for the wavelet coefficients [21], [22]. Although similar in spirit to the probability models presented in this paper, none of these new compression algorithms use an HMM framework. Wavelet-domain HMM’s also differ considerably from the mu... |

18 | Data dependent wavelet thresholding in nonparametric regression with change-point applications
- Ogden, Parzen
- 1996
Citation Context ...s been shown to be asymptotically optimal in the Neyman–Pearson sense for two-class problems [33]. Several other wavelet-based detection and classification schemes have been proposed [4], [24], [34], [35]. Our purpose is not to provide a comprehensive review of wavelet-based detection algorithms but, rather, to demonstrate the potential of the new wavelet-domain HMM framework for signal detection and ... |

15 | An investigation of wavelet-based image coding using an entropy-constrained quantization framework
- Orchard, Ramchandran
- 1994
Citation Context ... we have the following secondary properties of the wavelet transform. Clustering: If a particular wavelet coefficient is large/small, then adjacent coefficients are very likely to also be large/small [14]. Persistence: Large/small values of wavelet coefficients tend to propagate across scales [15], [16]. As we see in Fig. 2, these are striking features of the wavelet transform. They have been exploite... |

14 | Parameter estimation of dependence tree models using the EM algorithm
- Ronen, Rohlicek, et al.
- 1995
Citation Context ...v chain models have been developed thoroughly in [18] and [26], so we do not include them in this paper. For more general tree models, Ronen et al. provide specific EM steps for discrete variables in [32]. Since the observed wavelet data in the HMT model is continuous valued, we provide a new EM algorithm for this model in the Appendix. B. Likelihood Determination The E step of the EM algorithm is use... |

12 | Which stochastic models allow Baum-Welch training? (Signal Processing)
- Lucke
- 1996
Citation Context ...ence, very accurate and practical models can be obtained with probabilistic links between the states of only neighboring wavelet coefficients. We will now apply probabilistic graph theory [20], [27], [28] to develop these models. 2) Graph Models for Wavelet Transforms: Probabilistic graphs are useful tools for modeling the local dependencies between a set of random variables [20], [27], [28]. Roughly ... |

8 | A multiscale stochastic modeling approach to the monitoring of mechanical systems
- Chou, Heck
- 1994
Citation Context ...riety of data yet concise, tractable, and efficient for practical application in real-world problems. Until recently, wavelet coefficients have been modeled either as jointly Gaussian [4], [6], [10], [11] or as non-Gaussian but independent [2], [3], [12], [13]. Jointly Gaussian models can efficiently capture linear correlations between wavelet coefficients. However, Gaussian models are in conflict wit... |

7 | Simplified wavelet-domain hidden Markov models using contexts
- Crouse, Baraniuk
- 1997
Citation Context ... computationally intensive (still linear complexity but with a large constant factor), and the algorithm may take longer to converge. Hence, it is important to keep the HMM as simple as possible. See [29] for a “context-based” approach that reduces complexity in HMM’s, yet still models key wavelet-domain dependencies. The specific EM steps for the IM and hidden Markov chain models have been developed ... |

6 | Modeling and estimation of multiresolution stochastic processes
- Basseville, et al.
- 1992
Citation Context ...s of the wavelet coefficients and not on the coefficients themselves. This is an important distinction between our model and other multiscale Markov signal representations such as those considered in [6] and [10]. Because the states are never known exactly, our HMM framework does not place a Markov structure on the wavelet coefficients directly. Let denote that the scale of (and ) and assume that the... |

4 | Signal denoising using adaptive Bayesian wavelet shrinkage - Chipman, Kolaczyk, et al. - 1996 |

3 | New methods of linear time-frequency analysis for signal detection
- Lee, Huynh, et al.
- 1996
Citation Context ...medomain HMM’s has been shown to be asymptotically optimal in the Neyman–Pearson sense for two-class problems [33]. Several other wavelet-based detection and classification schemes have been proposed [4], [24], [34], [35]. Our purpose is not to provide a comprehensive review of wavelet-based detection algorithms but, rather, to demonstrate the potential of the new wavelet-domain HMM framework for sig... |

3 | Hidden Markov models for wavelet-based signal processing
- Crouse, Nowak, et al.
- 1996
Citation Context ...ependencies within scale (horizontally in Fig. 1). In this paper, we introduce a new modeling framework that neatly summarizes the probabilistic structure of the coefficients of the wavelet transform [17]. Our models owe their richness and flexibility to the following features: Mixture Densities: To match the non-Gaussian nature of the wavelet coefficients, we model the marginal probability of each co... |

2 | Universal classification for hidden Markov models
- Merhav
- 1991
Citation Context ...gnal class is a specific word or utterance. A slightly different approach developed for time-domain HMM’s has been shown to be asymptotically optimal in the Neyman–Pearson sense for two-class problems [33]. Several other wavelet-based detection and classification schemes have been proposed [4], [24], [34], [35]. Our purpose is not to provide a comprehensive review of wavelet-based detection algorithms ... |

2 | Bayesian approach to wavelet-based image processing - Malfait, Jansen, et al. - 1996 |

1 | Modeling and estimation of multiresolution stochastic processes - Basseville, Benveniste, et al. - 1992 |

1 | Detection of Abrupt Changes. Englewood Cliffs, NJ: Prentice-Hall
- Basseville, Nikiforov
- 1993
Citation Context ...oint is uniformly distributed over the integers . Examples of signals from each class are shown in Fig. 13. An excellent treatment of classical methods for the detection of abrupt changes is given in [37]. In addition, other wavelet-based approaches to the change point problem have been discussed in the literature [34], [35]. The purpose of this example is not to make an exhaustive comparison between o... |
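
Several of the citation contexts above describe the wavelet coefficients as organized into binary trees, with large or small values tending to persist from parent to child across scales. The sketch below makes that structure concrete with an orthonormal Haar transform of a step signal. The Haar wavelet is chosen here only for simplicity and is not named in the excerpts, and the list layout (coarsest scale first) and the function names are assumptions of this example rather than the paper's indexing.

```python
import math


def haar_dwt(signal):
    """Orthonormal Haar transform of a list whose length is a power of two.

    Returns (scaling_coefficient, details), where details[0] holds the single
    coarsest-scale wavelet coefficient and details[-1] the finest scale.
    """
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two and at least 2")
    approx = list(signal)
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences give wavelet coefficients, sums give the next approximation.
        details.append([(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    details.reverse()  # Store coarsest scale first so that index 0 is the tree root.
    return approx[0], details


def children(scale, k):
    """(scale, position) indices of the two children of coefficient k at `scale`."""
    return [(scale + 1, 2 * k), (scale + 1, 2 * k + 1)]


if __name__ == "__main__":
    # A step edge: the nonzero coefficients follow one parent-to-child path.
    scaling, details = haar_dwt([0, 0, 0, 0, 0, 1, 1, 1])
    for scale, row in enumerate(details):
        print(scale, [round(w, 3) for w in row])
```

For the step signal, the only nonzero wavelet coefficients lie along a single root-to-leaf path, which is the kind of cross-scale persistence that a tree-structured state model is meant to capture.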