## Multiresolution Markov models for signal and image processing (2002)

### Download Links

- [sensorweb.mit.edu]
- [ssg.mit.edu]
- [www.iro.umontreal.ca]

### Other Repositories/Bibliography

Venue: Proceedings of the IEEE

Citations: 123 (17 self)

### BibTeX

@ARTICLE{Willsky02multiresolutionmarkov,
  author  = {Alan S. Willsky},
  title   = {Multiresolution {M}arkov models for signal and image processing},
  journal = {Proceedings of the IEEE},
  volume  = {90},
  number  = {8},
  pages   = {1396--1458},
  year    = {2002}
}


### Abstract

This paper reviews a significant component of the rich field of statistical multiresolution (MR) modeling and processing. These MR methods have found application in, and permeated the literature of, a widely scattered set of disciplines, and one of our principal objectives is to present a single, coherent picture of this framework. A second goal is to describe how this topic fits into the even larger field of MR methods and concepts, in particular making ties to topics such as wavelets and multigrid methods. A third is to provide several alternate viewpoints for this body of work, as the methods and concepts we describe intersect with a number of other fields. The principal focus of our presentation is the class of MR Markov processes defined on pyramidally organized trees. The attractiveness of these models stems from both the very efficient algorithms they admit and their expressive power and broad applicability. We show how a variety of methods and models relate to this framework, including models for self-similar and 1/f processes. We also illustrate how these methods have been used in practice. We discuss the construction of MR models on trees and show how questions that arise in this context make contact with wavelets, state space modeling of time series, system and parameter identification, and hidden ...

### Citations

8919 | Maximum likelihood from incomplete data via the EM algorithm (with discussion)
- Dempster, Laird, et al.
- 1977
Citation Context ...dden Markov tree models illustrated in Example 6 and developed in detail in [59], [80], [261], [281], and [282], a very effective approach to parameter estimation involves the use of the EM algorithm [97]. The employment of EM requires the specification of the so-called complete data, which includes not only the actual measured data but also some additional hidden variables, which, if available, make ...
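The complete-data idea in this context can be illustrated with a minimal sketch, not the hidden-Markov-tree estimator itself: EM for a two-component 1-D Gaussian mixture, where the hidden component labels play the role of the complete data. The function name and initialization scheme are illustrative choices, not from the paper.

```python
import math
import random

def em_gaussian_mixture(data, iters=50):
    """EM for a two-component 1-D Gaussian mixture. The hidden component
    labels are the 'complete data': if they were observed, maximum-likelihood
    estimation would reduce to simple weighted averages (the M-step)."""
    mu = [min(data), max(data)]        # crude but adequate initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in data:
            w = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: ML estimates as if the (soft) labels were observed
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
            pi[k] = nk / len(data)
    return mu, var, pi
```

Each iteration is guaranteed not to decrease the observed-data likelihood, which is the property that makes EM attractive for the tree models discussed above.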

7440 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ...e class of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C. Getting Oriented A fair question to ask is: fo...

4567 | A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner
- 1989
Citation Context ...e role of capturing the intrinsic memory in the signals that are observed or of primary interest. The models we describe also have close ties to hidden Markov models (HMMs) [80], [222], [261], [265], [272], [281], [302], in which the hidden variables may represent higher level descriptors which we wish to estimate, as in speech analysis, image segmentation, and higher level vision problems [42], [53], ...

4017 | Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
- Geman, Geman
- 1984
Citation Context ...m such features while preserving those features with minimal distortion. Included in the literature are methods based on explicit modeling of edges and other boundary-like features (see, for example, [132] and [234]), approaches that use non-Gaussian models in order to better capture the “heavy tail” nature of imagery (for example, the generalized Gaussian models studied in depth in [41]) and an array ...

2286 | A Wavelet Tour of Signal Processing
- Mallat
- 1999
Citation Context ... effects of distant parts of a random field with coarser aggregate values, providing substantial computational gains for many problems. Similarly, wavelet-based methods [37], [38], [89], [95], [215], [228], [247], [264], [276], [286], [329], [335], [361] provide potentially significant speed-ups for a variety of computationally intensive problems. B. Our Starting Point A key characteristic of MR me...

2082 | Matrix Computations
- Golub, Van Loan
- 1989
Citation Context ...llowed by backsubstitution (the RTS smoothing step), yielding the complexity discussed previously. To be sure, other methods of numerical linear algebra (e.g., conjugate gradient or multipole methods [138], [256], [280]) could be used to solve this equation with this same order complexity. However, what is particularly important about the MR algorithm are both its noniterative nature and especially the...

1909 | Ten Lectures on Wavelets - Daubechies - 1992

1778 | Atomic decomposition by basis pursuit - Chen, Donoho, et al. - 1998 |

1516 | Exactly Solved Models in Statistical Mechanics
- Baxter
- 1982
Citation Context ...pecified in terms of the distribution at the root node and the parent–child transition distributions for every node . Such models have a long history, extending back to studies in statistical physics [26], dynamic programming [32], artificial intelligence and other investigations of graphical models [7], [89], [128], [169], [267], [294], [295], and signal and image processing [42], [58], [59], [80], [...

1483 | Robot Vision
- Horn
- 1986
Citation Context ...vision, namely that of reconstructing surfaces from regular or irregularly sampled measurements of surface height and/or of the normal to the surface (as in the shape-from-shading problem [34], [53], [152]). One well-known approach to reconstruction problems such as this involves the use of a variational formulation. In particular, let denote the 2-D planar region over which the surface is defined, and...

1431 | System Identification: Theory for the User - Ljung - 1980

1313 | Embedded image coding using zerotrees of wavelet coefficients
- Shapiro
- 1993
Citation Context ...ively smooth regions of an image and thus are quite small, while others, corresponding to locations of edges, can be very large. Furthermore, as is also discussed in the literature (e.g., [46], [80], [296], [299], [300], and [333]), large wavelet coefficients generally form cascades that are localized in space and propagate across scale, reflecting the presence of edges. Indeed, so-called embedded zero...

1239 | Spatial Interaction and the Statistical Analysis of Lattice Systems (with Discussion)
- Besag
- 1974
Citation Context ...ard Markov processes in time, with Markov random fields (MRFs) and with the large class of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficie...

1155 | Graphical Models
- Lauritzen
- 1996
Citation Context ... with the large class of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C. Getting Oriented A fair question...

1090 | The Laplacian pyramid as a compact image code
- Burt, Adelson
- 1983
Citation Context ...t ranges of spatial or spatio–temporal scales [31], [112], [198], [219], [220], [352], [354]. Studies of large classes of natural imagery also show characteristic variability at multiple scales [46], [47], [140], [157], [218], [243], [250], [261], [268], [281], [297]–[300], [333], as do mathematical models of self-similar or fractal processes [288] such as fractional Brownian motion (fBm) [30], [83], ...

990 | Optimal approximations by piecewise smooth functions and associated variational problems
- Mumford, Shah
- 1989
Citation Context ...sume that this structure is fixed and given. (Footnote 47: The formulation in [292] represents a relaxed version of the widely studied Mumford–Shah functional for image denoising and segmentation [11], [252].) A typical example, to which we refer on occasion, is that shown in Fig. 18. Here we begin with a zero-mean Gaussian process , , whose second-order statistics are given and which we wish to realize, either ...

987 | On the statistical analysis of dirty pictures - Besag - 1986 |

892 | Fundamentals of Statistical Signal Processing: Estimation Theory - Kay - 1993 |

866 | An introduction to variational methods for graphical models. Learning in Graphical Models
- Jordan
- 1999
Citation Context ... brief glimpse at inference algorithms for models on more general graphs, a topic which has been and remains the subject of numerous investigations (see, e.g., [35], [36], [128], [132], [164], [169], [170], [200], [204], [208], [222], [267], [269], and [339]) and to which we return again in Section VII. To begin, consider the estimation of the state (assumed to be zero-mean for simplicity) of a Gaussia...

860 | Probability, Random Variables, and Stochastic Processes, 3rd ed.
- Papoulis
- 1991
Citation Context ...omputations of marginals at all nodes can be computed by a coarse-to-fine tree recursion generalizing the usual Chapman–Kolmogorov equation for recursive computation of distributions in Markov chains [266]. Similarly, joints for one node and several of its descendants can be calculated efficiently, and then by averaging over that ancestor node we can obtain joints for any set of nodes [yielding the cou...
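The coarse-to-fine recursion mentioned in this context can be sketched concretely for a discrete-state Markov tree: starting from the root distribution, each child's marginal is obtained by one Chapman–Kolmogorov-style matrix–vector step. The data layout (dicts of lists, node 0 as root) is an illustrative choice, not the paper's notation.

```python
def tree_marginals(root_dist, children, trans):
    """Coarse-to-fine recursion for node marginals on a Markov tree:
    p(x_v = j) = sum_i p(x_parent = i) * P_v[i][j], applied root-to-leaves.
    `children` maps each node to its child list; `trans[v]` is the
    parent-to-child transition matrix for node v; node 0 is the root."""
    marg = {0: list(root_dist)}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            P = trans[v]
            marg[v] = [sum(marg[u][i] * P[i][j] for i in range(len(P)))
                       for j in range(len(P[0]))]
            stack.append(v)
    return marg

# Tiny binary tree: root 0 with children 1, 2; node 1 with children 3, 4.
P = [[0.9, 0.1], [0.2, 0.8]]
marginals = tree_marginals([0.6, 0.4],
                           {0: [1, 2], 1: [3, 4]},
                           {1: P, 2: P, 3: P, 4: P})
```

One sweep visits every node exactly once, which is the source of the linear-in-tree-size complexity the survey emphasizes.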

800 | The Viterbi Algorithm
- Forney
- 1973
Citation Context ...e structure and spirit, something that has been emphasized in several investigations [7], [169], [294], [295]. Computing the MAP estimate involves a generalization of the well-known Viterbi algorithm [118], one that can be traced at least back to the study of so-called “nonserial dynamic programming” [32] and to the work of others in artificial intelligence and graphical models [7], [89], [169], ...
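For reference, here is a minimal sketch of the chain-form Viterbi algorithm that the tree MAP computation generalizes; the state/observation encoding is an illustrative choice, not taken from the paper.

```python
def viterbi(init, trans, emit, obs):
    """Classical max-product (Viterbi) recursion on a Markov chain.
    init[s]: prior; trans[r][s]: transition prob; emit[s][o]: emission prob.
    Returns the MAP state sequence for the observation sequence `obs`."""
    n = len(init)
    # delta[s] = max prob over state paths ending in s, given obs so far
    delta = [init[s] * emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, ptr, delta = delta, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] * trans[r][s])
            ptr.append(best)
            delta.append(prev[best] * trans[best][s] * emit[s][o])
        back.append(ptr)
    # backtrack from the best terminal state to recover the MAP path
    path = [max(range(n), key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

On a tree the same max-product recursion runs leaves-to-root with backtracking root-to-leaves, which is why the per-node cost is unchanged.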

782 | Statistics for Long-Memory Processes
- Beran
- 1994
Citation Context ...s [46], [47], [140], [157], [218], [243], [250], [261], [268], [281], [297]–[300], [333], as do mathematical models of self-similar or fractal processes [288] such as fractional Brownian motion (fBm) [30], [83], [116], [232], [313], motivating examinations of the properties of the wavelet transforms of such signals and images [69], [83], [102], [114], [117], [154], [176], [191], [235], [273], [293], [...

777 | Optimal Filtering
- Anderson, Moore
- 1979
Citation Context ...on itself as follows: (16) which is nothing more than the generalization of the usual Lyapunov equation for the evolution of the state covariance of temporal state space systems driven by white noise [12], [174], [182]. Note that this computation directly produces the diagonal blocks of the overall covariance matrix for , and the total complexity of this calculation is . The quadratic dependence on th...
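The "usual Lyapunov equation" referenced here is the discrete-time covariance recursion P_{t+1} = A P_t Aᵀ + Q for a state-space model x_{t+1} = A x_t + w_t with Cov(w_t) = Q. A minimal pure-Python sketch (helper names are illustrative):

```python
def matmul(A, B):
    """Naive matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def lyapunov_step(A, P, Q):
    """One step of the discrete-time Lyapunov recursion
    P_{t+1} = A P_t A^T + Q, which propagates the state covariance of
    x_{t+1} = A x_t + w_t, Cov(w_t) = Q, driven by white noise."""
    APAt = matmul(matmul(A, P), transpose(A))
    return [[x + q for x, q in zip(row, qrow)] for row, qrow in zip(APAt, Q)]
```

For a stable A the iteration converges to the steady-state covariance; in the scalar case with A = [[a]] and Q = [[q]], the fixed point is q / (1 - a²).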

746 | Adapting to unknown smoothness via wavelet shrinkage
- Donoho, Johnstone
- 1995
Citation Context ...pture the “heavy tail” nature of imagery (for example, the generalized Gaussian models studied in depth in [41]) and an array of procedures using wavelet transforms (e.g., [2], [57]–[59], [68], [80], [104], [192], [193], [261], [281], [301], [330], and [333]). For this latter set of methods, the general idea is to exploit the localization properties of wavelets to allow much easier and more transparent...
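The wavelet-shrinkage idea behind this family of methods can be sketched with a one-level Haar transform plus soft thresholding; this is a minimal illustration of the principle (sparse detail coefficients are shrunk toward zero), not the estimators of [104] or the paper. It assumes an even-length signal.

```python
import math

def haar_denoise(signal, thresh):
    """One-level Haar wavelet soft-thresholding sketch.
    Transform to (approximation, detail) pairs, shrink the detail
    coefficients toward zero by `thresh`, and invert the transform.
    With thresh = 0 this is a perfect-reconstruction identity."""
    s2 = math.sqrt(2.0)
    approx = [(a + b) / s2 for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) / s2 for a, b in zip(signal[::2], signal[1::2])]
    # soft threshold: sign(d) * max(|d| - thresh, 0)
    soft = [math.copysign(max(abs(d) - thresh, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, soft):
        out.extend([(a + d) / s2, (a - d) / s2])
    return out
```

Real wavelet denoisers use deeper multilevel transforms and data-driven thresholds, but the shrink-in-the-transform-domain structure is the same.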

682 | Approximating discrete probability distributions with dependence trees
- Chow, Liu
- 1968
Citation Context ...ated to the representation of phenomena at different scales and spatial locations. Nevertheless, it is worth noting that the topic of identifying the structure of the tree has received some attention [64], [177], [230], [238], [239], [303], mostly in fields other than signal and image processing. Perhaps the best known work in this area is that of Chow and Liu [64]. The idea in this work is that we ar...
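The Chow–Liu construction referenced here selects the maximum-weight spanning tree under pairwise empirical mutual information, which is the KL-optimal tree-structured approximation to the joint. A small sketch under the assumption of discrete samples given as tuples (function names are illustrative):

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def chow_liu_tree(samples):
    """Chow-Liu tree: maximum-weight spanning tree over pairwise mutual
    information, built with Kruskal's algorithm and union-find."""
    d = len(samples[0])
    cols = list(zip(*samples))
    edges = sorted(((mutual_information(cols[i], cols[j]), i, j)
                    for i, j in combinations(range(d), 2)), reverse=True)
    parent = list(range(d))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # add edge unless it would close a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

As the quoted context notes, this addresses tree *structure* identification, which is largely orthogonal to the pyramidal trees fixed in advance in most MR models.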

650 | Learning in Graphical Models
- Jordan
- 1998
Citation Context ...om fields (MRFs) and with the large class of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C. Getting Orie...

604 | The computational complexity of probabilistic inference using Bayesian belief networks
- Cooper
- 1990
Citation Context ...ze , i.e., that grows exponentially with , and explicit computation of projections of this distribution corresponding to particular marginals or joints has been shown to be NP-hard for general graphs [76]. However, for an MR process on a tree, represents a generalization of a Markov chain, and computations of marginals at all nodes can be computed by a coarse-to-fine tree recursion generalizing the us...

576 | Fractional Brownian motions, fractional noises and applications
- Mandelbrot, Van Ness
- 1968
Citation Context ..., [157], [218], [243], [250], [261], [268], [281], [297]–[300], [333], as do mathematical models of self-similar or fractal processes [288] such as fractional Brownian motion (fBm) [30], [83], [116], [232], [313], motivating examinations of the properties of the wavelet transforms of such signals and images [69], [83], [102], [114], [117], [154], [176], [191], [235], [273], [293], [320], [346]–[350], [...

565 | Direct Methods for Sparse Matrices
- Duff, Erisman, et al.
- 1986
Citation Context ...of a covariance matrix (see, e.g., the references in the following discussion of DBNs). The method just described has close relationships both to well-known methods for the numerical solution of PDEs [106], [138] and to algorithms and ideas for space–time processes and DBNs. In particular, as described in [81] (see also [181]), suppose that, instead of beginning with the middle red row in Fig. 14, we b...

556 | A Multigrid Tutorial
- Briggs
- 1987
Citation Context ...ion of large systems of equations [e.g., representing discretizations of partial differential equations (PDEs)]. Multigrid methods [44], [45], [109], [190], [319] represent one class of examples in which coarser (and hence computationally simpler) versions of a problem are used to guide (and thus accelerate) the solution of finer versions,...

531 | Entropy-based algorithms for best-basis selection
- Coifman, Wickerhauser
- 1992
Citation Context ...els first introduced in Example 1 and discussed further later in this section and in Section VI-B. Finally, there is also a substantial body of work on so-called adaptive representations (e.g., [52], [73], [192], [193], [228], and [229]) using entire families or “dictionaries” of bases, which taken together generally form vastly overcomplete sets. The objective in each of these methods is to select on...

510 | Factorial hidden Markov models - Ghahramani, Jordan - 1997 |

499 | Wavelets and Subband Coding
- Vetterli, Kovacevic
- 1995
Citation Context ...om field with coarser aggregate values, providing substantial computational gains for many problems. Similarly, wavelet-based methods [37], [38], [89], [95], [215], [228], [247], [264], [276], [286], [329], [335], [361] provide potentially significant speed-ups for a variety of computationally intensive problems. B. Our Starting Point A key characteristic of MR methods or models is that they introd...

465 | Graphical Models in Applied Multivariate Statistics
- Whittaker
- 1990
Citation Context ...of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C. Getting Oriented A fair question to ask is: for whom is this paper written? A rep...

462 | Shiftable multi-scale transforms
- Simoncelli, Freeman, et al.
- 1992
Citation Context ...roaches to dealing with this, including those described in Section VI and also in Section VII. What was used to produce the results in Fig. 9 is the same simple method used by others [72], [270], and [298], namely, averaging the estimation results using several different tree models, each of which is shifted slightly with respect to the others, so that the overall average smoothes out these artifacts. ...

434 | Fast wavelet transforms and numerical algorithms
- Beylkin, Coifman, et al.
- 1991
Citation Context ...], [256], [280] approximate the effects of distant parts of a random field with coarser aggregate values, providing substantial computational gains for many problems. Similarly, wavelet-based methods [37], [38], [89], [95], [215], [228], [247], [264], [276], [286], [329], [335], [361] provide potentially significant speed-ups for a variety of computationally intensive problems. B. Our Starting Poi...

427 | Theory and Practice of Recursive Identification
- Ljung, Söderström
- 1986
Citation Context ...ses. In particular, one of the keys to the computation of likelihood functions for temporal state models—and, in fact, one of the key concepts more generally for temporal models of other forms [216], [217]—is the concept of whitening the measurements, i.e., of recursively producing predictions of each successive measurement (using a temporal Kalman filter), which, when subtracted from the actual measur...
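The whitening idea in this context can be sketched for a scalar linear-Gaussian model: the Kalman filter's one-step prediction errors (innovations) are independent Gaussians, so the log-likelihood is just the sum of their individual Gaussian log-densities. A minimal sketch, with the model and parameter names as illustrative assumptions:

```python
import math

def kalman_loglik(obs, a, q, r, x0=0.0, p0=1.0):
    """Log-likelihood of a scalar state-space model
        x_{t+1} = a * x_t + w_t,  Var(w_t) = q
        y_t     = x_t + v_t,      Var(v_t) = r
    computed by whitening: each innovation nu_t = y_t - y_hat_t is an
    independent zero-mean Gaussian with variance s_t, so the total
    log-likelihood is the sum of the innovation log-densities."""
    x, p, ll = x0, p0, 0.0
    for y in obs:
        s = p + r                        # innovation variance
        nu = y - x                       # innovation (whitened measurement)
        ll += -0.5 * (math.log(2 * math.pi * s) + nu * nu / s)
        k = p / s                        # Kalman gain
        x, p = x + k * nu, (1 - k) * p   # measurement update
        x, p = a * x, a * a * p + q      # time update
    return ll
```

This is exactly the structure that generalizes to MR trees: a fine-to-coarse sweep produces whitened residuals whose likelihoods add.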

425 | Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance - Samorodnitsky, Taqqu - 1994

410 | Generalized belief propagation
- Yedidia, Freeman, et al.
- 2000
Citation Context ...of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C. Getting Oriented A fair question to ask is: for whom is this paper written? A reply that...

339 | Turbo decoding as an instance of Pearl’s ’belief propagation’ algorithm - McEliece, MacKay, et al. - 1998 |

337 | Wavelet-Based Statistical Signal Processing Using Hidden Markov Models
- Crouse, Nowak, et al.
- 1998
Citation Context ...ariables may simply play the role of capturing the intrinsic memory in the signals that are observed or of primary interest. The models we describe also have close ties to hidden Markov models (HMMs) [80], [222], [261], [265], [272], [281], [302], in which the hidden variables may represent higher level descriptors which we wish to estimate, as in speech analysis, image segmentation, and higher level ...

308 |
The generalized distributive law
- McEliece, Aji
Citation Context ...for every node . Such models have a long history, extending back to studies in statistical physics [26], dynamic programming [32], artificial intelligence and other investigations of graphical models [7], [89], [128], [169], [267], [294], [295], and signal and image processing [42], [58], [59], [80], [175], [199], [213], [261], [281], [283]. Later in this paper we will illustrate examples of such mod...

302 | Introduction to Spectral Analysis - Stoica, Moses - 1997 |

275 | Graphical Models for Machine Learning and Digital Communication
- Frey
- 1998
Citation Context ...th Markov random fields (MRFs) and with the large class of Bayes’ nets, belief networks, and graphical models [35], [36], [89], [108], [123], [128], [143], [168]–[170], [197], [204], [236], [267], [294], [295], [302], [337], [339], [357]. It is the exploitation of this Markovian property that leads to the efficient algorithms that we describe. C...

269 | Tractable inference for complex stochastic processes
- Boyen, Koller
- 1998
Citation Context ...., processes on grids such as in Fig. 14 in which one of the two independent variables is time. The idea of propagating approximate graphical models in time is a topic of significant current interest [43], [134], [254], and we refer the readers to these references for details. We note in particular that in [43] the authors confront a problem of considerable concern not only for DBNs but for the approx...

253 | A multiscale random field model for Bayesian image segmentation
- Bouman, Shapiro
- 1994
Citation Context ...[265], [272], [281], [302], in which the hidden variables may represent higher level descriptors which we wish to estimate, as in speech analysis, image segmentation, and higher level vision problems [42], [53], [59], [175], [179], [180], [183], [199], [283], [323]. Whatever the nature of the variables defined on such a tree, there is one critical property that they must satisfy, namely, that collecti...

248 | A generalized Gaussian image model for edge-preserving MAP estimation
- Bouman, Sauer
- 1993
Citation Context ... for example, [132] and [234]), approaches that use non-Gaussian models in order to better capture the “heavy tail” nature of imagery (for example, the generalized Gaussian models studied in depth in [41]) and an array of procedures using wavelet transforms (e.g., [2], [57]–[59], [68], [80], [104], [192], [193], [261], [281], [301], [330], and [333]). For this latter set of methods, the general idea i...

247 | Simple Linear-Time Algorithms to Test Chordality of Graphs, Test Acyclicity of Hypergraphs, and Selectively Reduce Acyclic Hypergraphs
- Tarjan, Yannakakis
- 1984
Citation Context ...ent ways in which to triangulate a graph, and finding a triangulation that leads to a junction tree with small maximal cliques can be a challenging graph-theoretic problem [169], [204], [207], [314], [315]. Moreover, for some graphs, all triangulations have maximal cliques that are quite large. For example, the triangulation of the regular 2-D graph in Fig. 14 is nontrivial, requiring much more than...

242 | Multiresolution sampling procedure for analysis and synthesis of texture images
- De Bonet
- 1997
Citation Context ...omplex graph that does not yield a junction tree or cutset tree model with acceptably small state dimension. Another very interesting approach to MR modeling for image processing is that developed in [90]–[92]. The basic idea behind this approach is quite simple. Given a sample image, we form an MR pyramid by performing an MR decomposition of the image—the specific decomposition used in [91] is an ove...

234 | Correctness of belief propagation in Gaussian graphical models of arbitrary topology
- Weiss, Freeman
Citation Context ...algorithm in Section IV-B. In [334], examples are given demonstrating that such embedded tree (ET) algorithms can lead to very efficient methods for computing the optimal estimates. As with BP [285], [338], if an ET algorithm converges, it does so to the optimal estimate. Moreover, while BP does not yield the correct error covariances, it is shown in [334] that the computations performed in an ET algor...