Actions as space-time shapes
- In ICCV
, 2005
Cited by 192 (3 self)
Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach [14] for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure and orientation. We show that these features are useful for action recognition, detection and clustering. The method is fast, does not require video alignment and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action, and low quality video. Index Terms: action representation, action recognition, space-time analysis, shape analysis, Poisson equation.
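The abstract above does not state the exact equation or boundary conditions, so the following is only a minimal sketch of the general idea. It assumes the common formulation in which a function U satisfies laplacian(U) = -1 inside the shape and U = 0 outside it, and it solves that with plain Jacobi iterations on a unit grid. The function name `poisson_interior` and the iteration count are illustrative choices, not taken from the paper.

```python
import numpy as np

def poisson_interior(mask, iterations=500):
    """Approximate U with laplacian(U) = -1 inside `mask` and U = 0 elsewhere.

    `mask` is a boolean array of any dimension: 2D for a single silhouette,
    3D for a stack of silhouettes forming a space-time volume.
    """
    mask = np.asarray(mask, dtype=bool)
    u = np.zeros(mask.shape, dtype=float)
    n = mask.ndim
    for _ in range(iterations):
        padded = np.pad(u, 1)  # zero padding plays the role of U = 0 outside the grid
        total = np.zeros_like(u)
        for axis in range(n):
            for shift in (0, 2):  # previous and next neighbour along this axis
                index = [slice(1, -1)] * n
                index[axis] = slice(shift, shift + u.shape[axis])
                total += padded[tuple(index)]
        # Discrete Poisson update on a unit grid: 2n * U = sum(neighbours) + 1
        u = np.where(mask, (total + 1.0) / (2 * n), 0.0)
    return u
```

Under these assumptions U grows with distance from the shape boundary, so points deep inside a torso take larger values than points in thin limbs. Features such as saliency or orientation would be derived from U and its derivatives, which this sketch does not attempt.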
A survey on visual surveillance of object motion and behaviors
- IEEE Transactions on Systems, Man and Cybernetics
, 2004
Cited by 123 (2 self)
Visual surveillance in dynamic scenes, especially for humans and vehicles, is currently one of the most active research topics in computer vision. It has a wide spectrum of promising applications, including access control in special areas, human identification at a distance, crowd flux statistics and congestion analysis, detection of anomalous behaviors, and interactive surveillance using multiple cameras. In general, the processing framework of visual surveillance in dynamic scenes includes the following stages: modeling of environments, detection of motion, classification of moving objects, tracking, understanding and description of behaviors, human identification, and fusion of data from multiple cameras. We review recent developments and general strategies of all these stages. Finally, we analyze possible research directions, e.g., occlusion handling, a combination of two- and three-dimensional tracking, a combination of motion analysis and biometrics, anomaly detection and behavior prediction, content-based retrieval of surveillance videos, behavior understanding and natural language description, fusion of information from multiple sensors, and remote surveillance. Index Terms: behavior understanding and description, fusion of data from multiple cameras, motion detection, personal identification, tracking, visual surveillance.
An Algorithmic Overview of Surface Registration . . .
- Medical Image Analysis
, 2000
Cited by 39 (1 self)
This paper presents a literature survey of automatic 3D surface registration techniques emphasizing the mathematical and algorithmic underpinnings of the subject. The relevance of surface registration to medical imaging is that there is much useful anatomical information in the form of collected surface points which originate from complementary modalities and which must be reconciled. Surface registration
Parts-based 3D object classification
, 2004
Cited by 21 (0 self)
This paper presents a parts-based method for classifying scenes of 3D objects into a set of pre-determined object classes. Working at the part level, as opposed to the whole object level, enables a more flexible class representation and allows scenes in which the query object is significantly occluded to be classified. In our approach, parts are extracted from training objects and grouped into part classes using a hierarchical clustering algorithm. Each part class is represented as a collection of semi-local shape features and can be used to perform part class recognition. A mapping from part classes to object classes is derived from the learned part classes and known object classes. At run-time, a 3D query scene is sampled, local shape features are computed, and the object class is determined using the learned part classes and the part-to-object mapping. The approach is demonstrated by classifying novel 3D scenes of vehicles into eight classes.
3-D Computer Vision Using Structured Light: Design, Calibration and Implementation Issues
- Advances in Computers (43)
, 1996
Cited by 20 (1 self)
Structured Light (SL) sensing is a well-established method of range acquisition for Computer Vision. This chapter provides thorough discussions of design issues, calibration methodologies and implementation schemes for SL sensors. The challenges for SL sensor development are described and a range of approaches is surveyed. A novel SL sensor, PRIME (the PRofile Imaging ModulE), has recently been developed and is used as a design example in the detailed discussions. KEYWORDS: Computer Vision, Range Image Acquisition, Structured Light Ranging, Real-Time Machine Vision, Sensor Calibration. This research is sponsored in part by grants awarded by the Japan Railways and the Office of Technology Development, U.S. Department of Energy. 1 Introduction. Machine vision as a discipline and technology owes its creation, development and growth to digital computers. Without computers machine vision is not possible. The main objective of machine vision is to extract information useful for performin...
Part Decomposition and Description of 3D Shapes
- In Proceedings of the 12th International Conference on Pattern Recognition, volume I
, 1994
Cited by 17 (0 self)
We address the problem of obtaining natural (intuitive) descriptions of 3D shapes. We present one of the first attempts to address the description of 3D compound objects, where the parts are connected smoothly. The input we consider is either complete 3D data or range data from a single view. We suggest a volumetric graph representation of the object, where the nodes represent individual parts and the edges represent connectivity information. We suggest the use of properties of the parabolic curves for performing the part decomposition. We currently consider parts with tubular structure with a straight or curved axis. The graph description presents a structural description of the shape in terms of parts and their arrangement. We are also interested in the internal description of the parts. We study two well-defined classes of shapes, namely Straight Homogeneous Generalized Cylinders, and Planar Right Constant GCs. We suggest the use of properties of the parabolic curves for recovering natural descriptions of these classes in terms of their cross sections and axes.
A similarity-based aspect-graph approach to 3d object recognition
- International Journal of Computer Vision
, 2004
Cited by 17 (0 self)
This paper describes a view-based method for recognizing 3D objects from 2D images. We employ an aspect-graph structure, where the aspects are not based on the singularities of visual mapping but are instead formed using a notion of similarity between views. Specifically, the viewing sphere is endowed with a metric of dissimilarity for each pair of views and the problem of aspect generation is viewed as a "segmentation" of the viewing sphere into homogeneous regions. The viewing sphere is sampled at regular (5 degree) intervals and an iterative procedure is used to combine views using the metric into aspects with a prototype representing each aspect, in a "region-growing" regime which stands in contrast to the usual "edge detection" style of computing the aspect graph. The aspect growth is constrained such that two aspects of an object remain distinct under the given similarity metric. Once the database of 3D objects is organized as a set of aspects and prototypes for these aspects for each object, unknown views of database objects are compared with the prototypes and the results are ordered by similarity. We use two similarity metrics for shape, one based on curve matching and the other based on matching shock graphs, which for a database of 64 objects and unknown views of objects for the database give (90.3%, 74.2%, 59.7%) and (95.2%, 69.0%, 57.5%), respectively, for the top three matches; identification based on the top three matches is 98% and 100%, respectively. Indexing unknown views of objects not in the database also produces intuitive matches. We also develop a hierarchical indexing scheme the goal of which is to prune unlikely objects at an early stage to improve the efficiency of indexing, resulting in savings of 35% at the top level and of 55% at the next level, cumulatively.
Toward 3D Vision from Range Images: An Optimization Framework and Parallel Networks
Cited by 15 (10 self)
We propose a unified approach to solve low, intermediate and high level computer vision problems for 3D object recognition from range images. All three levels of computation are cast in an optimization framework and can be implemented on neural network style architecture. In the low level computation, the tasks are to estimate curvature images from the input range data. Subsequent processing at the intermediate level is concerned with segmenting these curvature images into coherent curvature sign maps. In the high level, image features are matched against model features based on an object description called attributed relational graph (ARG). We show that the above computational tasks at each of the three different levels can all be formulated as optimizing a two-term energy function. The first term encodes unary constraints while the second term binary ones. These energy functions are minimized using parallel and distributed relaxation-based algorithms which are well suited for neural...
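The two-term energy described above can be illustrated with a small labeling problem. This is a hedged sketch and not the paper's formulation: it assumes a Potts-style binary term (a fixed `penalty` whenever two linked sites take different labels) and swaps in iterated conditional modes, a simple sequential relaxation, in place of the parallel network algorithms the abstract mentions. The names `energy` and `relax` are hypothetical.

```python
import numpy as np

def energy(labels, unary, edges, penalty):
    """Two-term energy: per-site unary costs plus a pairwise disagreement cost."""
    unary_term = sum(unary[i, labels[i]] for i in range(len(labels)))
    binary_term = sum(penalty for i, j in edges if labels[i] != labels[j])
    return unary_term + binary_term

def relax(unary, edges, penalty, sweeps=10):
    """Iterated conditional modes: update one site at a time to its cheapest label."""
    unary = np.asarray(unary, dtype=float)
    labels = np.argmin(unary, axis=1)  # start from the best unary label per site
    neighbours = {i: [] for i in range(len(labels))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    candidates = np.arange(unary.shape[1])
    for _ in range(sweeps):
        changed = False
        for i in range(len(labels)):
            costs = unary[i].copy()
            for j in neighbours[i]:
                # Assumed binary term: pay `penalty` for disagreeing with neighbour j
                costs += penalty * (candidates != labels[j])
            best = int(np.argmin(costs))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

Each single-site update can only lower or preserve the total energy, so the procedure settles in a local minimum. A parallel scheme of the kind the abstract describes would update many sites at once, which this sketch does not model.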
Fusion Through Interpretation
, 1992
Cited by 11 (5 self)
We investigate the use of interpretation trees to solve the correspondence problem for a mobile robot fusing data from a range image into a world model consisting of planar surface patches. Uncertainty is handled by stochastic techniques where errors are represented by normal joint probability distributions. We show that for problems of a typical size the search time is too long unless the world model can be structured into parts only one of which can be occupied by the robot at any given moment. 1 Introduction This paper is concerned with the correspondence problem within the context of data fusion for a mobile robot. We restrict our attention to the case of a single range imaging sensor delivering planar surface patch features which are to be fused into a world model which also consists of planar patches. A separate problem, which we do not cover here, is how to update the world model once the correspondences have been established, a problem which is not easy because of partial occl...
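An interpretation tree of the kind mentioned above can be sketched as a depth-first search in which each level assigns one sensed patch to a model patch and a branch is pruned as soon as a pairwise constraint fails. The sketch below is illustrative only. It assumes each planar patch is summarised by its surface normal and uses a single hypothetical constraint (the angle between two sensed normals must match the angle between their assigned model normals within `tol` radians). It does not include the stochastic error model the abstract describes.

```python
import numpy as np

def angle(a, b):
    """Angle in radians between two vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def interpretations(data_normals, model_normals, tol=0.1):
    """Return every consistent assignment of data patches to distinct model patches."""
    results = []

    def extend(assignment):
        k = len(assignment)  # index of the next data patch to assign
        if k == len(data_normals):
            results.append(tuple(assignment))
            return
        for m in range(len(model_normals)):
            if m in assignment:
                continue  # each model patch is used at most once
            consistent = all(
                abs(angle(data_normals[k], data_normals[d])
                    - angle(model_normals[m], model_normals[assignment[d]])) <= tol
                for d in range(k)
            )
            if consistent:
                extend(assignment + [m])  # descend one level in the tree

    extend([])
    return results
```

With only a relative-angle constraint several assignments usually survive, and the number of branches grows quickly with the size of the model. That is consistent with the abstract's observation that search time becomes too long unless the world model is partitioned.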
The Space Envelope: A Representation for 3D Scenes
Cited by 9 (1 self)
This work introduces the space envelope, a shape model based upon a boundary description. Instead of modeling a single object (or solid), a space envelope encloses a volume of empty space. The advantage is that in any given view, there may be any number of objects, this number being difficult to determine from pixel-data alone. However, there is always one, and only one, volume of visible empty space. Once a model has been constructed defining the space envelope, higher-order operations may be applied to reason about the scene's content. For instance, surface geometries and topology could yield insight into the number of visible objects. The enclosed empty volume may also be used for vision-based navigation (known free-space), while the surfaces are used for view correspondences. Algorithms to automatically construct a planar boundary representation (b-rep) space envelope from a range image are presented. Results for testing the algorithms on over 400 images from four different range c...

