Results 1–10 of 36
Image denoising by sparse 3D transform-domain collaborative filtering
 IEEE TRANS. IMAGE PROCESS
, 2007
"... We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call “groups.” Collaborative filtering is a special procedure d ..."
Abstract

Cited by 218 (29 self)
 Add to MetaCart
We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call “groups.” Collaborative filtering is a special procedure developed to deal with these 3D groups. We realize it using the three successive steps: 3D transformation of a group, shrinkage of the transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to color-image denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
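The three collaborative-filtering steps described above (3D transformation of a group, shrinkage of the transform spectrum, inverse 3D transformation) can be sketched on a single group of similar blocks. This is a minimal illustration with hypothetical choices (an orthonormal FFT and hard thresholding); the block matching, aggregation, and collaborative Wiener stages mentioned in the abstract are omitted:

```python
import numpy as np

def collaborative_filter(group, threshold):
    """Jointly denoise a stack of similar 2D blocks (shape: n_blocks x B x B).

    Sketch of one collaborative-filtering stage: 3D transform of the group,
    hard-threshold shrinkage of the spectrum, inverse 3D transform. The FFT
    and the threshold value are illustrative choices, not the paper's.
    """
    spectrum = np.fft.fftn(group, norm="ortho")       # 3D transform
    spectrum[np.abs(spectrum) < threshold] = 0.0      # shrinkage
    # Thresholding by magnitude keeps conjugate symmetry, so the
    # inverse transform is real up to numerical error.
    return np.fft.ifftn(spectrum, norm="ortho").real  # inverse 3D transform

# Toy group: 8 similar 8x8 blocks sharing structure, plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 8), (8, 8, 1))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = collaborative_filter(noisy, threshold=0.3)
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
```

Because the blocks share structure, the group's energy concentrates in a few transform coefficients, so thresholding removes most of the noise while keeping the shared detail.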
A Theory of Networks for Approximation and Learning
 Laboratory, Massachusetts Institute of Technology
, 1989
"... Learning an inputoutput mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, t ..."
Abstract

Cited by 194 (24 self)
 Add to MetaCart
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
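The radial-basis-function idea underlying GRBF networks, approximating a mapping with a small number of radially symmetric basis functions whose output weights are fit from examples, can be illustrated in one dimension. This is a minimal sketch with hypothetical choices (Gaussian basis, fixed centers and width, plain least squares), not the paper's regularization-derived GRBF construction:

```python
import numpy as np

def fit_rbf(x_train, y_train, centers, width):
    """Fit the output weights of a Gaussian radial-basis-function
    approximator by linear least squares (fewer centers than examples)."""
    phi = np.exp(-((x_train[:, None] - centers[None, :]) ** 2)
                 / (2.0 * width ** 2))           # design matrix, n x m
    weights, *_ = np.linalg.lstsq(phi, y_train, rcond=None)
    return weights

def predict_rbf(x, centers, width, weights):
    phi = np.exp(-((x[:, None] - centers[None, :]) ** 2)
                 / (2.0 * width ** 2))
    return phi @ weights

# Approximate sin(x) from 40 samples using only 8 centers.
x = np.linspace(0.0, 2.0 * np.pi, 40)
y = np.sin(x)
centers = np.linspace(0.0, 2.0 * np.pi, 8)
w = fit_rbf(x, y, centers, width=0.8)
max_err = np.max(np.abs(predict_rbf(x, centers, width=0.8, weights=w) - y))
```

The fitted centers-plus-weights view matches the abstract's "prototypes that are synthesized and optimally combined" interpretation.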
Signal modeling techniques in speech recognition
 PROCEEDINGS OF THE IEEE
, 1993
"... We have seen three important trends develop in the last five years in speech recognition. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or timederivative, spectral information, have become common. Second, similariry transform techniques, often used to norm ..."
Abstract

Cited by 126 (5 self)
 Add to MetaCart
We have seen three important trends develop in the last five years in speech recognition. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or time-derivative, spectral information, have become common. Second, similarity transform techniques, often used to normalize and decorrelate parameters in some computationally inexpensive way, have become popular. Third, the signal parameter estimation problem has merged with the speech recognition process so that more sophisticated statistical models of the signal’s spectrum can be estimated in a closed-loop manner. In this paper, we review the signal processing components of these algorithms. These algorithms are presented as part of a unified view of the signal parameterization problem in which there are three major tasks: measurement, transformation, and statistical modeling. This paper is by no means a comprehensive survey of all possible techniques of signal modeling in speech recognition. There are far too many algorithms in use today to make an exhaustive survey feasible (and cohesive). Instead, this paper is meant to serve as a tutorial on signal processing in state-of-the-art speech recognition systems and to review those techniques most commonly used. In keeping with this goal, a complete mathematical description of each algorithm has been included in the paper.
An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering
, 1997
"... Assignment methods are at the heart of many algorithms for unsupervised learning and clustering  in particular, the wellknown Kmeans and ExpectationMaximization (EM) algorithms. In this work, we study several different methods of assignment, including the "hard" assignments used by Kmeans an ..."
Abstract

Cited by 89 (0 self)
 Add to MetaCart
Assignment methods are at the heart of many algorithms for unsupervised learning and clustering, in particular, the well-known K-means and Expectation-Maximization (EM) algorithms. In this work, we study several different methods of assignment, including the "hard" assignments used by K-means and the "soft" assignments used by EM. While it is known that K-means minimizes the distortion on the data and EM maximizes the likelihood, little is known about the systematic differences of behavior between the two algorithms. Here we shed light on these differences via an information-theoretic analysis. The cornerstone of our results is a simple decomposition of the expected distortion, showing that K-means (and its extension for inferring general parametric densities from unlabeled sample data) must implicitly manage a trade-off between how similar the data assigned to each cluster are, and how the data are balanced among the clusters. How well the data are balanced is measured by the en...
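The contrast between hard assignments (K-means) and soft assignments (EM responsibilities) can be made concrete in a single assignment step. A minimal sketch, assuming isotropic Gaussian components with equal mixing weights; the `beta` inverse-variance parameter is a hypothetical simplification, not the paper's notation:

```python
import numpy as np

def hard_assign(x, centers):
    """K-means style: each point goes entirely to its nearest center."""
    d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def soft_assign(x, centers, beta=1.0):
    """EM style: each point is fractionally assigned to every center, with
    responsibilities proportional to exp(-beta * squared distance)."""
    d2 = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)      # rows sum to 1

x = np.array([[0.0, 0.0], [0.9, 0.0], [5.0, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
hard = hard_assign(x, centers)   # winner-take-all labels
soft = soft_assign(x, centers)   # fractional responsibilities
```

The middle point, almost equidistant from the two centers, is assigned wholly to one cluster by `hard_assign` but split nearly evenly by `soft_assign`; this all-or-nothing versus fractional split is exactly the behavioral difference the paper analyzes.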
Online, Interactive Learning of Gestures for Human/Robot Interfaces
 In IEEE International Conference on Robotics and Automation
, 1996
"... We have developed a gesture recognition system, based on Hidden Markov Models, which can interactively recognize gestures and perform online learning of new gestures. In addition, it is able to update its model of a gesture iteratively with each example it recognizes. This system has demonstrated re ..."
Abstract

Cited by 55 (0 self)
 Add to MetaCart
We have developed a gesture recognition system, based on Hidden Markov Models, which can interactively recognize gestures and perform online learning of new gestures. In addition, it is able to update its model of a gesture iteratively with each example it recognizes. This system has demonstrated reliable recognition of 14 different gestures after only one or two examples of each. The system is currently interfaced to a Cyberglove for use in recognition of gestures from the sign language alphabet. The system is being implemented as part of an interactive interface for robot teleoperation and programming by example.

1 Introduction

If we are to fully harness the potential of robotic technology, we will have to move beyond simple keyboard/mouse/teach-pendant style robot programming and create comprehensive frameworks for productive real-time interaction between robots and humans. The motivations behind this kind of interaction include increasing the effectiveness of teleoperation, enablin...
On Universal Quantization by Randomized Uniform/Lattice Quantizers
 IEEE Trans. Inform. Theory
, 1992
"... Uniform quantization with dither, or lattice quantization with dither in the vector case, followed by a universal lossless source encoder (entropy coder), is a simple procedure for universal coding with distortion of a source that may take continuously many values. The rate of this universal codi ..."
Abstract

Cited by 48 (15 self)
 Add to MetaCart
Uniform quantization with dither, or lattice quantization with dither in the vector case, followed by a universal lossless source encoder (entropy coder), is a simple procedure for universal coding with distortion of a source that may take continuously many values. The rate of this universal coding scheme is examined, and we derive a general expression for it. An upper bound for the redundancy of this scheme, defined as the difference between its rate and the minimal possible rate, given by the rate distortion function of the source, is derived. This bound holds for all distortion levels. Furthermore, we present a composite upper bound on the redundancy as a function of the quantizer resolution which leads to a tighter bound in the high rate (low distortion) case. Key Words: Uniform and Lattice Quantization, Randomized Quantization, Universal Coding, Rate-Distortion Performance. Meir Feder was also supported by The Andrew W. Mellon Foundation, Woods Hole Oceanographic Institu...
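The first stage of the scheme, uniform quantization with subtractive dither, can be sketched in the scalar case. The entropy-coding stage and the lattice generalization are omitted, and the function name is hypothetical:

```python
import numpy as np

def dithered_quantize(x, step, rng):
    """Uniform scalar quantization with subtractive dither: add a dither
    uniform on [-step/2, step/2), round to the grid step*Z, then subtract
    the same dither at the decoder. With subtractive dither the
    reconstruction error is uniform on [-step/2, step/2] and statistically
    independent of the source value (a classical dithering property)."""
    dither = rng.uniform(-step / 2.0, step / 2.0, size=x.shape)
    indices = np.round((x + dither) / step)   # symbols sent to entropy coder
    return step * indices - dither            # decoder output

rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)
step = 0.5
x_hat = dithered_quantize(x, step, rng)
err = x_hat - x   # uniform error: mean ~0, variance ~ step**2 / 12
```

In the full scheme the rounded `indices` stream, not `x_hat`, would be fed to a universal lossless entropy coder; the decoder needs the same dither sequence, e.g. from a shared seed.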
Rates of convergence in the source coding theorem, in empirical quantizer design and in universal lossy source coding
 IEEE Transactions on Information Theory
, 1994
"... AbstructRate of convergence results are established for vector quantization. Convergence rates are given for an increasing vector dimension and/or an increasing training set size. In particular, the following results are shown for memoryless realvalued sources with bounded support at transmission r ..."
Abstract

Cited by 46 (9 self)
 Add to MetaCart
Rate of convergence results are established for vector quantization. Convergence rates are given for an increasing vector dimension and/or an increasing training set size. In particular, the following results are shown for memoryless real-valued sources with bounded support at transmission rate R: (1) If a vector quantizer with fixed dimension k is designed to minimize the empirical mean-square error (MSE) with respect to m training vectors, then its MSE for the true source converges in expectation and almost surely to the minimum possible MSE as O(sqrt(log m / m)); (2) The MSE of an optimal k-dimensional vector quantizer for the true source converges, as the dimension grows, to the distortion-rate function D(R) as O(sqrt(log k / k)); (3) There exists a fixed-rate universal lossy source coding scheme whose per-letter MSE on n real-valued source samples converges in expectation and almost surely to the ...
Soft decoding techniques for codes and lattices, including the Golay code and the Leech lattice
 IEEE Trans. Inform. Theory
, 1986
"... AbstrtiTwo kinds of a&orithms are considered. 1) ff 59 is a binary code of length n, a “soft decision ” decodhg afgorithm for Q changes ao arbitrary point of R ” into a nearest codeword (nearest in Euclideao distance). 2) Similarly, a deco&g afgorithm for a lattice A in R ” changes an arbitraq poin ..."
Abstract

Cited by 33 (3 self)
 Add to MetaCart
Two kinds of algorithms are considered. 1) If C is a binary code of length n, a "soft decision" decoding algorithm for C changes an arbitrary point of R^n into a nearest codeword (nearest in Euclidean distance). 2) Similarly, a decoding algorithm for a lattice A in R^n changes an arbitrary point of R^n into a closest lattice point. Some general methods are given for constructing such algorithms, and are used to obtain new and faster decoding algorithms for the Gosset lattice E8, the Golay code and the Leech lattice.
Some Extensions of the K-Means Algorithm for Image Segmentation and Pattern Classification
 Technical Report, MIT Artificial Intelligence Laboratory
, 1993
"... In this paper we present some extensions to the kmeans algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorit ..."
Abstract

Cited by 25 (0 self)
 Add to MetaCart
In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers of the lower dimensional manifolds that define the boundaries between classes, for clouds of multidimensional, multiclass data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the application of these extensions are also given. Copyright (c) Massachusetts Institute of Technology, 1993. This rep...
Adaptive Detection and Localization of Moving Objects in Image Sequences
 SIGNAL PROCESSING: IMAGE COMMUNICATION
, 1999
"... In this paper we address two important problems in motion analysis: the detection of moving objects and their localization. A statistical approach is adopted in order to formulate these problems. For the first, the interframe difference is modelized by a mixture of two zeromean generalized Gauss ..."
Abstract

Cited by 20 (4 self)
 Add to MetaCart
In this paper we address two important problems in motion analysis: the detection of moving objects and their localization. A statistical approach is adopted in order to formulate these problems. For the first, the interframe difference is modeled by a mixture of two zero-mean generalized Gaussian distributions, and a Gibbs random field is used for describing the label set. A new method to determine the regularization parameter is proposed, based on a voting technique. This method is also modeled using a statistical framework. The solution of the second problem is based on the observation of only two successive frames. Using the results of change detection, an adaptive statistical model for the couple of image intensities is identified. For each problem two different multiscale algorithms are evaluated, and the labeling problem is solved using either ICM (Iterated Conditional Modes) or HCF (Highest Confidence First) algorithms. For illustrating the efficiency of the proposed approach, experimental results are provided using synthetic and real video sequences.