Results 1-10 of 58
A shape-based approach to the segmentation of medical imagery using level sets
 IEEE Trans. Med. Imag
, 2003
Cited by 130 (11 self)
Abstract: We propose a shape-based approach to curve evolution for the segmentation of medical images containing known object types. In particular, motivated by the work of Leventon, Grimson, and Faugeras [15], we derive a parametric model for an implicit representation of the segmenting curve by applying principal component analysis to a collection of signed distance representations of the training data. The parameters of this representation are then manipulated to minimize an objective function for segmentation. The resulting algorithm is able to handle multidimensional data, can deal with topological changes of the curve, is robust to noise and initial contour placements, and is computationally efficient. At the same time, it avoids the need for point correspondences during the training phase of the algorithm. We demonstrate this technique by applying it to two medical applications: two-dimensional segmentation of cardiac magnetic resonance imaging (MRI) and three-dimensional segmentation of prostate MRI. Index Terms: Active contours, binary image alignment, cardiac MRI segmentation, curve evolution, deformable model, distance transforms, eigenshapes, implicit shape representation, medical image segmentation, parametric shape model, principal component analysis, prostate segmentation, shape prior, statistical shape model.
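One natural reading of the parametric model described in this abstract, written out as a formula. The notation is an assumption made here for illustration and is not taken from the abstract: \bar{\Phi} is the mean of the training signed distance functions, \Phi_i are the leading k principal components (eigenshapes), and w_i are the weights that the segmentation objective would adjust.

\[
\Phi[\mathbf{w}] = \bar{\Phi} + \sum_{i=1}^{k} w_i \, \Phi_i
\]

Under this reading, the segmenting curve is the zero level set of \Phi[\mathbf{w}], which is why no point correspondences are needed and why the curve can change topology as the weights vary.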
Learning from one example through shared densities on transforms
 In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, 2000
Cited by 97 (7 self)
We define a process called congealing in which elements of a dataset (images) are brought into correspondence with each other jointly, producing a data-defined model. It is based upon minimizing the summed component-wise (pixel-wise) entropies over a continuous set of transforms on the data. One of the byproducts of this minimization is a set of transforms, one associated with each original training sample. We then demonstrate a procedure for effectively bringing test data into correspondence with the data-defined model produced in the congealing process. Subsequently, we develop a probability density over the set of transforms that arose from the congealing process. We suggest that this density over transforms may be shared by many classes, and demonstrate how this density can be used as “prior knowledge” to develop a classifier based on only a single training example for each class.
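The entropy-minimization idea in this abstract can be sketched on toy data. The following is a minimal illustration under simplifying assumptions that are not in the abstract: the "images" are binary 1-D lists, the transforms are a discrete set of cyclic shifts rather than a continuous family, and the optimizer is plain coordinate descent. All function names (binary_entropy, shift, summed_entropy, congeal) are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; zero when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def shift(signal, s):
    """Cyclically shift a list right by s positions."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


def summed_entropy(signals):
    """Sum, over pixel positions, of the entropy of the stack of pixel values."""
    count = len(signals)
    return sum(binary_entropy(sum(col) / count) for col in zip(*signals))


def congeal(signals, max_sweeps=10):
    """Coordinate descent over per-signal shifts (an assumed optimizer).

    Each signal in turn is given the shift that minimizes the summed
    pixelwise entropy of the whole set, holding the other shifts fixed.
    Returns the chosen shifts and the aligned signals.
    """
    n = len(signals[0])
    shifts = [0] * len(signals)

    def cost(k, s):
        # Cost if signal k used shift s and all others kept their current shift.
        trial = shifts[:k] + [s] + shifts[k + 1:]
        aligned = [shift(sig, t) for sig, t in zip(signals, trial)]
        # Rounding makes ties deterministic despite floating-point noise.
        return round(summed_entropy(aligned), 9)

    for _ in range(max_sweeps):
        changed = False
        for k in range(len(signals)):
            best = min(range(n), key=lambda s: cost(k, s))
            if cost(k, best) < cost(k, shifts[k]):
                shifts[k] = best
                changed = True
        if not changed:
            break
    return shifts, [shift(sig, t) for sig, t in zip(signals, shifts)]


# Usage: three shifted copies of one pattern.
pattern = [1, 1, 1, 0, 0, 0, 0, 0]
signals = [shift(pattern, 0), shift(pattern, 2), shift(pattern, 5)]
shifts, aligned = congeal(signals)
```

The per-signal shifts returned by congeal play the role of the "set of transforms, one associated with each original training sample" mentioned in the abstract; a density over them would be estimated in a later step that this sketch does not cover.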
Data driven image models through continuous joint alignment
 PAMI
, 2006
Cited by 65 (4 self)
This paper presents a family of techniques that we call congealing for modeling image classes from data. The idea is to start with a set of images and make them appear as similar as possible by removing variability along the known axes of variation. This technique can be used to eliminate “nuisance” variables such as affine deformations from handwritten digits or unwanted bias fields from magnetic resonance images. In addition to separating and modeling the latent images—i.e., the images without the nuisance variables—we can model the nuisance variables themselves, leading to factorized generative image models. When nuisance variable distributions are shared between classes, one can share the knowledge learned in one task with another task, leading to efficient learning. We demonstrate this process by building a handwritten digit classifier from just a single example of each class. In addition to applications in handwritten character recognition, we describe in detail the application of bias removal from magnetic resonance images. Unlike previous methods, we use a separate, nonparametric model for the intensity values at each pixel. This allows us to leverage the data from the MR images of different patients to remove bias from each other. Only very weak assumptions are made about the distributions of intensity values in the images. In addition to the digit and MR applications, we discuss a number of other uses of congealing and describe experiments about the robustness and consistency of the method.
Transformation-invariant clustering using the EM algorithm
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2003
Cited by 60 (12 self)
Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation-maximization (EM) algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose, and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
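A rough sketch of one EM iteration for a transformation-invariant mixture, in the spirit of this abstract. It is an illustration under assumptions not stated in the abstract: inputs are 1-D lists, the discrete transformation set is all cyclic shifts, the noise is isotropic Gaussian with a fixed variance, and priors over clusters and shifts are uniform. The names shift, e_step and m_step are hypothetical.

```python
import math


def shift(v, s):
    """Cyclically shift list v right by s positions (negative s shifts left)."""
    n = len(v)
    return [v[(i - s) % n] for i in range(n)]


def e_step(data, means, var):
    """Posterior over (cluster, shift) pairs for every input.

    Assumes x = shift(mean[c], s) + isotropic Gaussian noise with variance
    var, and uniform priors over clusters and shifts.
    """
    n_pix = len(means[0])
    posts = []
    for x in data:
        logp = {}
        for c, mu in enumerate(means):
            for s in range(n_pix):
                sq = sum((a - b) ** 2 for a, b in zip(x, shift(mu, s)))
                logp[(c, s)] = -sq / (2.0 * var)
        top = max(logp.values())  # subtract the max for numerical stability
        weights = {k: math.exp(v - top) for k, v in logp.items()}
        total = sum(weights.values())
        posts.append({k: w / total for k, w in weights.items()})
    return posts


def m_step(data, posts, n_clusters):
    """Re-estimate each cluster mean from inputs with their shifts undone."""
    n_pix = len(data[0])
    means = []
    for c in range(n_clusters):
        num, den = [0.0] * n_pix, 0.0
        for x, post in zip(data, posts):
            for (cluster, s), r in post.items():
                if cluster == c:
                    undone = shift(x, -s)  # map the input back to the mean's frame
                    num = [a + r * b for a, b in zip(num, undone)]
                    den += r
        means.append([a / max(den, 1e-12) for a in num])
    return means


# Usage: two prototypes observed under different shifts.
proto_a = [3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
proto_b = [2.0, 2.0, 2.0, 0.0, 0.0, 0.0]
data = [shift(proto_a, s) for s in (0, 2, 4)] + [shift(proto_b, s) for s in (1, 3, 5)]
posts = e_step(data, [proto_a, proto_b], var=0.1)
means = m_step(data, posts, n_clusters=2)
```

Because the shift is treated as a latent variable, the learned means describe the prototypes themselves rather than one mean per position, which is the behaviour the abstract argues for. Mixing proportions and variance updates are left out to keep the sketch short.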
Automatic Construction of Active Appearance Models as an Image Coding Problem
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 56 (3 self)
The automatic construction of Active Appearance Models (AAMs) is usually posed as finding the location of the base mesh vertices in the input training images. In this paper, we repose the problem as an energy-minimizing image coding problem and propose an efficient gradient-descent algorithm to solve it.
A comparison of algorithms for inference and learning in probabilistic graphical models
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2005
Cited by 54 (4 self)
Computer vision is currently one of the most exciting areas of artificial intelligence research, largely because it has recently become possible to record, store and process large amounts of visual data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition and face detection, it is even more exciting that researchers may be on the verge of introducing computer vision systems that perform scene analysis, decomposing image input into its constituent objects, lighting conditions, motion patterns, and so on. Two of the main challenges in computer vision are finding efficient models of the physics of visual scenes and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms for computer vision and scene analysis. We review exact techniques and various approximate, computationally efficient techniques, including iterative conditional modes, the expectation maximization (EM) algorithm, the mean field method, variational techniques, structured variational techniques, Gibbs sampling, the sum-product algorithm and “loopy” belief propagation. We describe how each technique can be applied in a model of multiple, occluding objects, and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
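For orientation, the standard variational free energy can be written as follows. The notation is an assumption made here, not taken from the abstract: v denotes the observed variables, h the hidden variables, P the model, and Q an approximating distribution over h.

\[
F(Q, P) = \sum_{h} Q(h) \ln \frac{Q(h)}{P(h, v)} = \mathrm{KL}\big(Q(h) \,\|\, P(h \mid v)\big) - \ln P(v)
\]

Since the KL term is non-negative, F is an upper bound on -\ln P(v) and is tight when Q(h) = P(h \mid v). Different restrictions on the form of Q give different approximate inference schemes, which is presumably what lets a single cost function compare them.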
Stochastic rigidity: Image registration for nowhere-static scenes
, 2001
Cited by 47 (0 self)
We consider the registration of sequences of images where the observed scene is entirely nonrigid; for example a camera flying over water, a panning shot of a field of sunflowers in the wind, or footage of a crowd applauding at a sports event. In these cases, it is not possible to impose the constraint that world points have similar colour in successive views, so existing registration techniques [1, 5, 9, 11] cannot be applied. Indeed the relationship between a point's colours in successive frames is essentially a random process. However, by treating the sequence of images as a set of samples from a multidimensional stochastic time-series, we can learn a stochastic model (e.g. an AR model [16, 23]) of the random process which generated the sequence of images. With a static camera, this stochastic model can be used to extend the sequence arbitrarily in time: driving the model with random noise results in an infinitely varying sequence of images which always looks like the short input sequence. In this way, we can create "videotextures" [21, 24] which can play forever without repetition. With a moving camera, the image generation process comprises two components: a stochastic component generated by the videotexture, and a parametric component due to the camera motion. For example, a camera rotation induces a relationship between successive images which is modelled by a 4-point perspective transformation, or homography. Human observers can easily separate the camera motion from the stochastic element. The key observation for an automatic implementation is that without image registration, the time-series analysis must work harder to model the combined stochastic and parametric image generation. Specifically, the learned model will require more components, or more coeffi...
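The "learn an AR model, then drive it with random noise" step can be illustrated on a scalar series. This is a deliberately reduced sketch: the abstract concerns multidimensional image sequences, whereas the code below fits a first-order, one-dimensional AR model by least squares. The function names fit_ar1 and extend are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x[t] = a * x[t-1] + noise.

    Returns the coefficient a and the standard deviation of the residuals.
    """
    prev, nxt = series[:-1], series[1:]
    a = sum(p * n for p, n in zip(prev, nxt)) / sum(p * p for p in prev)
    residuals = [n - a * p for p, n in zip(prev, nxt)]
    sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return a, sigma


def extend(series, a, sigma, steps, rng):
    """Extend the series by driving the fitted model with Gaussian noise."""
    out = list(series)
    for _ in range(steps):
        out.append(a * out[-1] + rng.gauss(0.0, sigma))
    return out


# Usage: fit a short synthetic series, then continue it for 20 more steps.
rng = random.Random(0)
observed = [0.0]
for _ in range(200):
    observed.append(0.8 * observed[-1] + rng.gauss(0.0, 1.0))
a, sigma = fit_ar1(observed)
longer = extend(observed, a, sigma, steps=20, rng=rng)
```

In the moving-camera setting the abstract describes, frames that are not registered add camera-induced variation that a model like this would have to absorb, which is the stated reason the unregistered model needs more components.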
Modeling, clustering, and segmenting video with mixtures of dynamic textures
 PAMI
, 2008
Cited by 45 (13 self)
A dynamic texture is a spatiotemporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (for example, fire, steam, water, vehicle and pedestrian traffic, and so forth). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (for example, optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
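As a reminder of what "observations from a linear dynamical system" means, the generic state-space form is shown below. The symbols are assumptions chosen here for illustration and are not taken from the abstract: x_t is a hidden state, y_t is the observed video frame (as a vector), A is the state transition matrix, C is the observation matrix, and v_t, w_t are Gaussian noise terms with covariances Q and R.

\[
x_{t+1} = A x_t + v_t, \quad v_t \sim \mathcal{N}(0, Q), \qquad
y_t = C x_t + w_t, \quad w_t \sim \mathcal{N}(0, R)
\]

In a mixture of such models, each video sequence would be generated by one of several parameter sets of this kind, chosen by a latent component label, which is the quantity an EM procedure would infer.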
Transformed Hidden Markov Models: Estimating Mixture Models of Images and Inferring Spatial Transformations in Video Sequences
 In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, 2000
Cited by 42 (10 self)
Submitted to the IEEE Conference on Computer Vision and Pattern Recognition, 2000. In this paper we describe a novel generative model for video analysis called the transformed hidden Markov model (THMM). The video sequence is modeled as a set of frames generated by transforming a small number of class images that summarize the sequence. For each frame, the transformation and the class are discrete latent variables that depend on the previous class and transformation in the sequence. The set of possible transformations is defined in advance, and it can include a variety of transformations such as translation, rotation and shearing. In each stage of such a Markov model, a new frame is generated from a transformed Gaussian distribution based on the class/transformation combination generated by the Markov chain. This model can be viewed as an extension of a transformed mixture of Gaussians [1] through time. We use this model to cluster unlabeled video segments and form a video summary in ...