Results 1-10 of 48
Putting objects in perspective
In CVPR, 2006
Cited by 200 (13 self)
Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach.
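As a rough illustration of how viewpoint constrains object scale, the sketch below uses the standard similar-triangles relation for a level pinhole camera above a flat ground plane. The relation, the function name, and the coordinate convention (v increasing upward) are assumptions made for this example; they are not taken from the abstract.

```python
def expected_image_height(world_height, camera_height, horizon_v, bottom_v):
    """Image height of an upright object standing on the ground plane.

    Assumes a level pinhole camera and flat ground (illustrative only).
    v coordinates increase upward, so an object standing on the ground
    has its base below the horizon: bottom_v < horizon_v.
    """
    if camera_height <= 0:
        raise ValueError("camera_height must be positive")
    drop = horizon_v - bottom_v  # distance of the object's base below the horizon
    if drop <= 0:
        raise ValueError("object base must lie below the horizon")
    # Similar triangles: image_height / drop == world_height / camera_height
    return world_height * drop / camera_height


# An object as tall as the camera is high reaches exactly up to the horizon.
h = expected_image_height(world_height=1.7, camera_height=1.7,
                          horizon_v=0.5, bottom_v=0.3)
```

Under these assumptions, a detector hypothesis whose image height is far from the expected value at its image position can be treated as less likely, which is the kind of scale and location prior the abstract describes.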
Geometric Context from a Single Image
In ICCV, 2005
Cited by 180 (33 self)
Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.
A Stochastic Grammar of Images
Foundations and Trends in Computer Graphics and Vision, 2006
Cited by 85 (20 self)
This exploratory paper quests for a stochastic and context-sensitive grammar of images. The grammar should achieve the following four objectives and thus serves as a unified framework of representation, learning, and recognition for a large number of object categories. (i) The grammar represents both the hierarchical decompositions from scenes, to objects, parts, primitives and pixels by terminal and nonterminal nodes and the contexts for spatial and functional relations by horizontal links between the nodes. It formulates each object category as the set of all possible valid configurations produced by the grammar. (ii) The grammar is embodied in a simple And-Or graph representation where each Or-node points to alternative subconfigurations and an And-node is decomposed into a number of components. This representation supports recursive top-down/bottom-up procedures for image parsing under the Bayesian framework and makes it convenient to scale
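The idea of an object category as the set of configurations generated by an And-Or graph can be sketched as follows. This is a minimal toy, not the representation used in the paper: the `GRAPH` dictionary, the node names, and the `configurations` function are all hypothetical, and the horizontal (contextual) links mentioned in the abstract are omitted.

```python
from itertools import product

# Hypothetical toy And-Or graph: node -> (kind, children).
# Nodes absent from the dictionary are terminals.
GRAPH = {
    "clock": ("and", ["frame", "hands"]),             # And-node: all components
    "frame": ("or", ["round", "square"]),             # Or-node: pick one alternative
    "hands": ("or", ["two_hands", "three_hands"]),
}


def configurations(node):
    """Return every terminal configuration derivable from `node`."""
    if node not in GRAPH:
        return [[node]]
    kind, children = GRAPH[node]
    if kind == "or":
        # Union of the alternatives' configurations.
        return [cfg for child in children for cfg in configurations(child)]
    # And-node: one configuration per combination of the children's configurations.
    parts = [configurations(child) for child in children]
    return [sum(combo, []) for combo in product(*parts)]


all_clocks = configurations("clock")
```

Here `all_clocks` holds four configurations, one per choice of frame and hands. A stochastic grammar would additionally attach probabilities to the Or-node branches.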
Fast and Robust Segmentation of Natural Color Scenes
In Proceedings of the 3rd Asian Conference on Computer Vision, 1997
Cited by 27 (5 self)
This paper describes our entire color segmentation system, called CSC (Color Structure Code), in detail. In Section 2 we introduce the hexagonal, hierarchical island structure on which our method is based. Section 3 describes the actual segmentation method. In Section 4 the new color similarity measure is presented. Section 5 discusses the complexity of our approach. The system is very fast and thus applicable to real-world problems. Finally, we present some results and conclusions in Section 6.
Bottom-Up/Top-Down Image Parsing with Attribute Grammar
Cited by 22 (5 self)
This paper presents a simple attribute graph grammar as a generative representation for man-made scenes such as buildings, hallways, kitchens, and living rooms and studies an effective top-down/bottom-up inference algorithm for parsing images in the process of maximizing a Bayesian posterior probability or equivalently minimizing a description length (MDL). This simple grammar has one class of primitives as its terminal nodes, i.e., the projection of planar rectangles in 3-space into the image plane, and six production rules for the spatial layout of the rectangular surfaces. All of the terminal and nonterminal nodes in the grammar are described by attributes for their geometric properties and image appearance. Each production rule is associated with some equations that constrain the attributes of a parent node and those of its children. Given an input image, the inference algorithm computes (or constructs) a parse graph, which includes a parse tree for the hierarchical decomposition and a number of spatial constraints. In the inference algorithm, the bottom-up step detects an excessive number of rectangles as weighted candidates, which are sorted in a certain order and activate top-down predictions of occluded or missing components through the grammar rules. The whole procedure is, in spirit, similar to the data-driven Markov chain Monte Carlo paradigm [39], [33], except that a greedy algorithm is adopted for simplicity. In the experiment, we show that the grammar and top-down inference can largely improve the performance of bottom-up detection. Index Terms: Attribute graph grammar, bottom-up/top-down, image parsing, primal sketch, generative model.
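The equivalence between maximizing a posterior and minimizing a description length, mentioned in the abstract above, follows from Bayes' rule and the monotonicity of the logarithm. The symbols G (a parse graph) and I (the input image) are notation assumed here for illustration.

```latex
\begin{aligned}
G^{*} &= \arg\max_{G}\; p(G \mid I)
       = \arg\max_{G}\; p(I \mid G)\, p(G) \\
      &= \arg\min_{G}\; \bigl[\, -\log p(I \mid G) \;-\; \log p(G) \,\bigr]
\end{aligned}
```

The two negative log terms can be read as the code length of the image given the parse graph and the code length of the parse graph itself, so their sum is a description length.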
ELVIS: Eigenvectors for Land Vehicle Image System
Proceedings of the International Conference on Intelligent Robots and Systems. 'Human Robot Interaction and Cooperative Robots' (IROS '95), 1995
Image Interpretation Using Bayesian Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
Cited by 21 (1 self)
The problem of image interpretation is one of inference with the help of domain knowledge. In this correspondence, we formulate the problem as the maximum a posteriori (MAP) estimate of a properly defined probability distribution function. We show that a Bayesian network can be used to represent this p.d.f. as well as the domain knowledge needed for interpretation. The Bayesian network may be relaxed to obtain the set of optimum interpretations.
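To make the MAP formulation concrete, here is a minimal sketch that finds the most probable joint labeling of two image regions in a tiny Bayesian network. The network structure, the probability tables, and all names are invented for illustration. Exhaustive enumeration is used in place of the relaxation procedure the abstract refers to, which is not described in enough detail here to reproduce.

```python
from itertools import product

LABELS = ("sky", "road")

# Hypothetical tables: region 1 lies above region 2 in the image.
P_L1 = {"sky": 0.5, "road": 0.5}                       # P(L1)
P_L2_GIVEN_L1 = {                                      # P(L2 | L1)
    "sky": {"sky": 0.4, "road": 0.6},
    "road": {"sky": 0.05, "road": 0.95},
}
P_COLOR = {                                            # P(color | label)
    "sky": {"blue": 0.9, "gray": 0.1},
    "road": {"blue": 0.2, "gray": 0.8},
}


def joint(l1, l2, c1, c2):
    """Joint probability of both labels and both observed colors."""
    return P_L1[l1] * P_L2_GIVEN_L1[l1][l2] * P_COLOR[l1][c1] * P_COLOR[l2][c2]


def map_interpretation(c1, c2):
    """Labels maximizing the posterior; the evidence term is constant, so
    maximizing the joint probability gives the same answer."""
    return max(product(LABELS, repeat=2),
               key=lambda ls: joint(ls[0], ls[1], c1, c2))


best = map_interpretation("blue", "gray")
```

Enumeration is exponential in the number of regions, so it only serves to show what is being optimized; practical systems rely on approximate or structured inference.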
3D vision techniques for autonomous vehicles
1988
Cited by 16 (0 self)
those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the funding agencies.
FPGA Implementation of an Adaptable-Size Neural Network
In Proceedings of the International Conference on Artificial Neural Networks ICANN96, 1996
Cited by 15 (7 self)
Artificial neural networks achieve fast parallel processing via massively parallel nonlinear computational elements. Most neural network models base their ability to adapt to problems on changing the strength of the interconnections between computational elements according to a given learning algorithm. However, constrained interconnection structures may limit such ability. Field programmable hardware devices allow the implementation of neural networks with in-circuit structure adaptation. This paper describes an FPGA implementation of the FAST (Flexible Adaptable-Size Topology) architecture, a neural network that dynamically changes its size. Since initial experiments indicated a good performance on pattern clustering tasks, we have applied our dynamic-structure FAST neural network to an image segmentation and recognition problem.
1 Introduction: Artificial neural network models offer an attractive paradigm: learning to solve problems from examples. Most neural network models base ...
Feature Representation and Signal Classification in Fluorescence In-Situ Hybridization Image Analysis
IEEE Trans. Syst. Man Cybernet. A, 2001
Cited by 14 (9 self)
Fast and accurate analysis of fluorescence in-situ hybridization (FISH) images for signal counting will depend mainly upon two components: a classifier to discriminate between artifacts and valid signals of several fluorophores (colors), and well-discriminating features to represent the signals. Our previous work has focused on the first component. To investigate the second component, we evaluate candidate feature sets by illustrating the probability density functions (pdfs) and scatter plots for the features. The analysis provides first insight into dependencies between features, indicates the relative importance of members of a feature set, and helps in identifying sources of potential classification errors. Class separability yielded by different feature subsets is evaluated using the accuracy of several neural network (NN)-based classification strategies, some of them hierarchical, as well as using a feature selection technique making use of a scatter criterion. The complete analysis recommends several intensity and hue features for representing FISH signals. Represented by these features, around 90% of valid signals and artifacts of two fluorophores are correctly classified using the NN. Although applied to cytogenetics, the paper presents a comprehensive, unifying methodology of qualitative and quantitative evaluation of pattern feature representation essential for accurate image classification. This methodology is applicable to many other real-world pattern recognition problems.