## Learning models of object structure

Citations: 2 (1 self)

### BibTeX

@MISC{Schlecht_learningmodels,
    author = {Joseph Schlecht and Kobus Barnard},
    title = {Learning models of object structure},
    year = {}
}

### Abstract

We present an approach for learning stochastic geometric models of object categories from single-view images. We focus here on models expressible as a spatially contiguous assemblage of blocks. Model topologies are learned across groups of images, and one or more such topologies are linked to an object category (e.g., chairs). Fitting learned topologies to an image can be used to identify the object class as well as detail its geometry. The latter goes beyond labeling objects, as it provides the geometric structure of particular instances. We learn the models using joint statistical inference over category parameters, camera parameters, and instance parameters. These produce an image likelihood through a statistical imaging model. We use trans-dimensional sampling to explore topology hypotheses, and alternate between Metropolis-Hastings and stochastic dynamics to explore instance parameters. Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held-out single-view images.
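The Metropolis-Hastings step named in the abstract can be illustrated with a minimal sketch. The Python below is not the authors' sampler: it assumes a one-dimensional toy target (a standard normal) and a symmetric Gaussian random-walk proposal, and the names `metropolis_hastings`, `log_target`, and `step` are hypothetical. It leaves out the trans-dimensional moves and the stochastic dynamics that the paper combines with Metropolis-Hastings.

```python
import math
import random


def standard_normal_log_density(x):
    """Unnormalized log density of a standard normal (toy target, an assumption)."""
    return -0.5 * x * x


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings in one dimension.

    log_target: function returning the unnormalized log density at a point.
    x0:         starting point (assumed to have finite log density).
    step:       standard deviation of the symmetric Gaussian proposal.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move from a distribution that is easy to sample.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability
        # reduces to min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis_hastings(standard_normal_log_density, 0.0, 20000)
    mean = sum(draws) / len(draws)
    print(f"sample mean: {mean:.3f}")
```

Working with log densities avoids numerical underflow when the target is a product of many likelihood terms, which is the usual situation for an image likelihood.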

### Citations

5483 | Distinctive image features from scale-invariant keypoints
- Lowe
Citation Context ...pearance characteristics and/or part configuration statistics (e.g., [4, 5, 6, 12, 13, 24]). These approaches typically rely on effective descriptors that are somewhat resilient to pose change (e.g., [16]). A second force favoring learning 2D representations is the explosion of readily available images compared with that for 3D structure, and thus treating category learning as statistical pattern reco...

2385 | Equation of state calculations by fast computing machines
- Metropolis, Rosenbluth, et al.
- 1953
Citation Context ...n. For this we use the trans-dimensional sampling framework [8, 9]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [21, 10, 1, 18] with stochastic dynamics [23]. As developed further below, these two methods have complementary strengths for our problem. Importantly, we arrange the sampling so that the hybrid of samplers are gua...

1298 | Monte Carlo sampling methods using Markov chains and their applications
- Hastings
- 1970
Citation Context ...n. For this we use the trans-dimensional sampling framework [8, 9]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [21, 10, 1, 18] with stochastic dynamics [23]. As developed further below, these two methods have complementary strengths for our problem. Importantly, we arrange the sampling so that the hybrid of samplers are gua...

900 | Object class recognition by unsupervised scale-invariant learning
- Fergus, Perona, et al.
- 2003
Citation Context ...explored, given enough time. Related work. Most work on learning representations for object categories has focused on image-based appearance characteristics and/or part configuration statistics (e.g., [4, 5, 6, 12, 13, 24]). These approaches typically rely on effective descriptors that are somewhat resilient to pose change (e.g., [16]). A second force favoring learning 2D representations is the explosion of readily ava...

868 | Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82
- Green
- 1995
Citation Context ...hood of the data from multiple examples. Because we are searching for model topologies, we need to search among models with varying dimension. For this we use the trans-dimensional sampling framework [7, 8]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [1, 14] with stochastic dynamics [18]. As developed further below...

782 | Recognition-by-components: A theory of human image understanding
- Biederman
- 1987
Citation Context ...ct topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable model...

584 | Probabilistic Inference Using Markov Chain Monte Carlo Methods, technical report CRG-TR-93-1, University of Toronto, Dept. of Computer Science. (Available from the author’s home page, at http://www.cs.toronto.edu/~radford
- Neal
- 1993
Citation Context ...sional sampling framework [7, 8]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [1, 14] with stochastic dynamics [18]. As developed further below, these two methods have complementary strengths for our problem. Importantly, we arrange the sampling so that the hybrid of samplers are guaranteed to converge to the post...

503 | Comparing images using the Hausdorff distance
- Huttenlocher, Klanderman
- 1993
Citation Context ...6) where $N_{bg}$ and $N_{miss}$ are the number of background and missing detection responses in the image, and $N_{bg} + N_{miss} = \sum_{i=1}^{N_x} (1 - E_i)$. Our approach has some similarities to standard edge matching (e.g., [3, 14]), but we explain the edge points as the result of a generative statistical process that accounts for both distance and gradient direction. Using the Hausdorff distance for edges in our approach, for ...

492 | Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories
- Fei-Fei, Fergus, et al.
- 2004
Citation Context ...explored, given enough time. Related work. Most work on learning representations for object categories has focused on image-based appearance characteristics and/or part configuration statistics (e.g., [4, 5, 6, 12, 13, 24]). These approaches typically rely on effective descriptors that are somewhat resilient to pose change (e.g., [16]). A second force favoring learning 2D representations is the explosion of readily ava...

458 | Pattern recognition and machine learning
- Bishop
- 2006
Citation Context ...e uniformly distributed. 3.1. Transdimensional sampling. The Metropolis-Hastings (MH) algorithm is an MCMC sampling technique to generate unbiased and representative samples from a target distribution [21, 10, 23, 7, 2]. The central concept of the algorithm is to propose samples from a distribution q(θ′ | θ), which can be easily sampled, and accept or reject the samples with probability α(θ̃^(n)) = min{1, p(˜ ...

423 | Monte Carlo Strategies in Scientific Computing
- Liu
- 2001
Citation Context ...n. For this we use the trans-dimensional sampling framework [7, 8]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [1, 14] with stochastic dynamics [18]. As developed further below, these two methods have complementary strengths for our problem. Importantly, we arrange the sampling so that the hybrid of samplers are gua...

317 | Learning structural descriptions from examples
- Winston
- 1975
Citation Context ...ct topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable model...

302 | Fitting parameterized three-dimensional models to images
- Lowe
- 1991
Citation Context ... constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation ...

259 | Recognizing solid objects by alignment with an image
- Huttenlocher, Ullman
- 1990
Citation Context ... constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation ...

258 | Hierarchical chamfer matching: A parametric edge matching algorithm
- Borgefors
- 1988
Citation Context ...6) where $N_{bg}$ and $N_{miss}$ are the number of background and missing detection responses in the image, and $N_{bg} + N_{miss} = \sum_{i=1}^{N_x} (1 - E_i)$. Our approach has some similarities to standard edge matching (e.g., [3, 14]), but we explain the edge points as the result of a generative statistical process that accounts for both distance and gradient direction. Using the Hausdorff distance for edges in our approach, for ...

242 | An introduction to MCMC for machine learning
- Andrieu, Freitas, et al.
- 2003
Citation Context ...n. For this we use the trans-dimensional sampling framework [7, 8]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [1, 14] with stochastic dynamics [18]. As developed further below, these two methods have complementary strengths for our problem. Importantly, we arrange the sampling so that the hybrid of samplers are gua...

228 | Image segmentation by data-driven Markov chain Monte Carlo
- Tu, Zhu
- 2002
Citation Context ...ximizes it. Since the number of parameters in the sampling space is unknown, some proposals must change the model dimension. In particular, these jump moves (following the terminology of Tu and Zhu [27]) arise from changes in topology. Diffusion moves make changes to parameters within a given topology. We cycle between the two kinds of moves. Diffusion moves for sampling within topology. We found th...

199 | Putting objects in perspective
- Hoiem, Efros, et al.
- 2008
Citation Context ...driven largely by appearance cues mapped onto the model. Their choice of modeling in 3D simplifies a number of issues, and provides for more natural integration with work in understanding scene geometry [11], as is the case for us. However, our modeling approach is different in that we focus on learning topologies for assemblages of parameterized parts, instead of working with deformation of a single ...

184 | On seeing things
- Clowes
- 1971
Citation Context ...ct topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable model...

151 | Learning hierarchical models of scenes, objects, and parts
- Sudderth, Torralba, et al.
- 2005
Citation Context ...explored, given enough time. Related work. Most work on learning representations for object categories has focused on image-based appearance characteristics and/or part configuration statistics (e.g., [4, 5, 6, 12, 13, 24]). These approaches typically rely on effective descriptors that are somewhat resilient to pose change (e.g., [16]). A second force favoring learning 2D representations is the explosion of readily ava...

108 | Kinematic jump processes for monocular 3D human tracking
- Sminchisescu, Triggs
Citation Context ...as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation (e.g., [17, 22, 23]). 2 Modeling object category structure. We use a generative model for image features corresponding to examples from object categories (Fig. 1). A category is associated with a sampling from category l...

92 | Estimating articulated human motion with covariance scaled sampling
- Sminchisescu, Triggs
- 2003
Citation Context ...as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation (e.g., [17, 22, 23]). 2 Modeling object category structure. We use a generative model for image features corresponding to examples from object categories (Fig. 1). A category is associated with a sampling from category l...

89 | Theory-based Bayesian models of inductive learning and reasoning
- Tenenbaum, Griffiths, et al.
- 2006
Citation Context ...y proposed by Hoiem et al. [9] is to fit a deformable 3D blob to cars, driven largely by appearance cues mapped onto the model. Our work also relates to recent efforts in learning abstract topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecti...

87 | 3D generic object categorization, localization and pose estimation
- Savarese, Fei-Fei
- 2007
Citation Context ...tern recognition is more convenient in the data domain (2D images). However, some researchers have started imposing more projective geometry into the spatial models. For example, Savarese and Fei-Fei [19, 20] build a model where arranged parts are linked by a fundamental matrix. Their training process is helped by multiple examples of the same objects, but notably they are able to use training data with c...

84 | A stochastic grammar of images
- Zhu, Mumford
Citation Context ...rance cues mapped onto the model. Our work also relates to recent efforts in learning abstract topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in ...

68 | The discovery of structural form
- Kemp, Tenenbaum
- 2008
Citation Context ...y proposed by Hoiem et al. [9] is to fit a deformable 3D blob to cars, driven largely by appearance cues mapped onto the model. Our work also relates to recent efforts in learning abstract topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecti...

59 | Trans-dimensional Markov chain Monte Carlo, In: Highly Structured Stochastic Systems (Ed-s: P.J
- Green
- 2005
Citation Context ...hood of the data from multiple examples. Because we are searching for model topologies, we need to search among models with varying dimension. For this we use the trans-dimensional sampling framework [7, 8]. We explore the posterior space within a given probability space of a particular dimension by combining standard Metropolis-Hastings [1, 14] with stochastic dynamics [18]. As developed further below...

53 | Beyond local appearance: Category recognition from pairwise interactions of simple features
- Leordeanu, Hebert, et al.

53 | Recovering 3D human body configurations using shape contexts
- Mori, Malik
- 2006
Citation Context ...as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation (e.g., [17, 22, 23]). 2 Modeling object category structure. We use a generative model for image features corresponding to examples from object categories (Fig. 1). A category is associated with a sampling from category l...

47 | 3D layout CRF for multi-view object class recognition and segmentation
- Hoiem, Rother, et al.
- 2007
Citation Context ... Their approach is different than ours in that models are built more bottom up, and this process is somewhat reliant on the presence of surface textures. A different strategy proposed by Hoiem et al. [9] is to fit a deformable 3D blob to cars, driven largely by appearance cues mapped onto the model. Our work also relates to recent efforts in learning abstract topologies [11, 26] and structure models ...

39 | Unsupervised learning of a probabilistic grammar for object detection and parsing. Advances in Neural Information Processing Systems
- Zhu, Chen, et al.
Citation Context ...rance cues mapped onto the model. Our work also relates to recent efforts in learning abstract topologies [11, 26] and structure models for 2D images of objects constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in ...

38 | Flexible object models for category-level 3D object recognition
- Kushal, Schmid, et al.
- 2007

17 | A Necessary and Sufficient Condition for a Picture to Represent a Polyhedral Scene
- Sugihara
- 1984
Citation Context ... constrained by grammar representations [29, 30]. Also relevant is a large body of older work on representing objects with 3D parts [2, 3, 28] and detecting objects in images given a precise 3D model [10, 15, 25], such as one for machined parts in an industrial setting. Finally, we have also been inspired by work on fitting deformable models of known topology to 2D images in the case of human pose estimation ...

16 | The Joy of Sampling
- Forsyth, Haddon, et al.
Citation Context ...e uniformly distributed. 3.1. Transdimensional sampling. The Metropolis-Hastings (MH) algorithm is an MCMC sampling technique to generate unbiased and representative samples from a target distribution [21, 10, 23, 7, 2]. The central concept of the algorithm is to propose samples from a distribution q(θ′ | θ), which can be easily sampled, and accept or reject the samples with probability α(θ̃^(n)) = min{1, p(˜ ...

13 | View synthesis for recognizing unseen poses of object classes
- Savarese, Fei-Fei
Citation Context ...tern recognition is more convenient in the data domain (2D images). However, some researchers have started imposing more projective geometry into the spatial models. For example, Savarese and Fei-Fei [19, 20] build a model where arranged parts are linked by a fundamental matrix. Their training process is helped by multiple examples of the same objects, but notably they are able to use training data with c...

3 | Weakly-supervised learning of part-based spatial models for visual object recognition. ECCV
- Crandall, Huttenlocher
- 2006