Results 1 - 10 of 37
Articulated Mesh Animation from Multi-view Silhouettes
- ACM Transactions on Graphics, 2008
"... Details in mesh animations are difficult to generate but they have great impact on visual quality. In this work, we demonstrate a practical software system for capturing such details from multi-view video recordings. Given a stream of synchronized video images that record a human performance from mu ..."
Cited by 168 (6 self)
Abstract:
Details in mesh animations are difficult to generate, yet they have a great impact on visual quality. In this work, we demonstrate a practical software system for capturing such details from multi-view video recordings. Given a stream of synchronized video images that record a human performance from multiple viewpoints and an articulated template of the performer, our system captures the motion of both the skeleton and the shape. The output mesh animation is enhanced with the details observed in the image silhouettes. For example, a performance in casual loose-fitting clothes will generate mesh animations with flowing garment motions. We accomplish this with a fast pose tracking method followed by nonrigid deformation of the template to fit the silhouettes. The entire process takes less than sixteen seconds per frame and requires no markers or texture cues. Captured meshes are in full correspondence, making them readily usable for editing operations including texturing, deformation transfer, and deformation model learning.
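
A minimal Python sketch of the silhouette-fitting idea described above, under loose assumptions: the template is a vertex array plus an edge list, and the deformation preserves uniform Laplacian coordinates while softly pulling a few rim vertices toward silhouette-derived 3D targets. The Laplacian choice, constraint weight and toy mesh are illustrative stand-ins, not the authors' actual formulation.

    import numpy as np

    def laplacian_matrix(n_vertices, edges):
        """Uniform (umbrella) graph Laplacian of the template mesh."""
        L = np.zeros((n_vertices, n_vertices))
        for i, j in edges:
            L[i, i] += 1.0
            L[j, j] += 1.0
            L[i, j] -= 1.0
            L[j, i] -= 1.0
        return L

    def deform_to_silhouette(verts, edges, constrained_ids, targets, w=10.0):
        """Least-squares deformation: keep the template's differential
        coordinates while softly snapping constrained vertices to targets."""
        n = len(verts)
        L = laplacian_matrix(n, edges)
        delta = L @ verts                        # differential coordinates to preserve
        C = np.zeros((len(constrained_ids), n))
        for row, vid in enumerate(constrained_ids):
            C[row, vid] = w                      # soft positional constraint
        A = np.vstack([L, C])
        b = np.vstack([delta, w * np.asarray(targets)])
        new_verts, *_ = np.linalg.lstsq(A, b, rcond=None)
        return new_verts

    # Toy usage: a tetrahedron with one vertex pulled outward.
    verts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(deform_to_silhouette(verts, edges, [3], [[0.0, 0, 1.5]]))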
Estimating Human Shape and Pose from a Single Image
"... We describe a solution to the challenging problem of estimating human body shape from a single photograph or painting. Our approach computes shape and pose parameters of a 3D human body model directly from monocular image cues and advances the state of the art in several directions. First, given a u ..."
Cited by 52 (6 self)
Abstract:
We describe a solution to the challenging problem of estimating human body shape from a single photograph or painting. Our approach computes shape and pose parameters of a 3D human body model directly from monocular image cues and advances the state of the art in several directions. First, given a user-supplied estimate of the subject’s height and a few clicked points on the body, we estimate an initial 3D articulated body pose and shape. Second, using this initial guess we generate a tri-map of regions inside, outside and on the boundary of the human, which is used to segment the image using graph cuts. Third, we learn a low-dimensional linear model of human shape in which variations due to height are concentrated along a single dimension, enabling height-constrained estimation of body shape. Fourth, we formulate the problem of parametric human shape from shading. We estimate the body pose, shape and reflectance as well as the scene lighting that produces a synthesized body that robustly matches the image evidence. Quantitative experiments demonstrate how smooth shading provides powerful constraints on human shape. We further demonstrate a novel application in which we extract 3D human models from archival photographs and paintings.
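
The height-constrained use of a linear shape model can be illustrated with the sketch below: a hypothetical basis B is assumed to have its first column aligned with height variation, the user-supplied height fixes that coefficient, and the remaining coefficients are fit to sparse observations by least squares. The basis, dimensions and observations are synthetic stand-ins, not the paper's learned model.

    import numpy as np

    rng = np.random.default_rng(0)
    n_verts, n_basis = 300, 10
    mean_shape = rng.normal(size=3 * n_verts)
    B = rng.normal(size=(3 * n_verts, n_basis))     # stand-in for a learned PCA basis

    def fit_shape(height_coeff, observed_ids, observed_vals):
        """Solve for shape coefficients with beta[0] fixed by the user's height."""
        resid = observed_vals - mean_shape[observed_ids] - B[observed_ids, 0] * height_coeff
        B_free = B[observed_ids, 1:]                # columns left free to vary
        beta_free, *_ = np.linalg.lstsq(B_free, resid, rcond=None)
        return np.concatenate([[height_coeff], beta_free])

    # Toy usage: recover coefficients from 40 noisy observed coordinates.
    true_beta = rng.normal(size=n_basis)
    shape = mean_shape + B @ true_beta
    obs_ids = rng.choice(3 * n_verts, size=40, replace=False)
    beta = fit_shape(true_beta[0], obs_ids, shape[obs_ids] + 0.01 * rng.normal(size=40))
    print(np.round(beta - true_beta, 2))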
Dynamic Shape Capture using Multi-View Photometric Stereo
- ACM Transactions on Graphics
"... Figure 1: Our system rapidly acquires images under varying illumination in order to compute photometric normals from multiple viewpoints. The normals are then used to reconstruct detailed mesh sequences of dynamic shapes such as human performers. We describe a system for high-resolution capture of m ..."
Cited by 50 (4 self)
Abstract:
We describe a system for high-resolution capture of moving 3D geometry, beginning with dynamic normal maps from multiple views. The normal maps are captured using active shape-from-shading (photometric stereo), with a large lighting dome providing a series of novel spherical lighting configurations. To compensate for low-frequency deformation, we perform multi-view matching and thin-plate spline deformation on the initial surfaces obtained by integrating the normal maps. Next, the corrected meshes are merged into a single mesh using a volumetric method. The final output is a set of meshes that were impossible to produce with previous methods: they exhibit details on the order of a few millimeters and represent the performance over human-size working volumes at a temporal resolution of 60 Hz.
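
The photometric-stereo computation the system builds on can be sketched as follows: per-pixel intensities observed under known light directions form a small linear system whose least-squares solution gives the albedo-scaled normal. The light directions and the single Lambertian test pixel below are synthetic; the real system uses spherical lighting patterns from a dome.

    import numpy as np

    def photometric_normals(intensities, light_dirs):
        """intensities: (n_lights, n_pixels); light_dirs: (n_lights, 3).
        Returns unit normals (n_pixels, 3) and albedo (n_pixels,)."""
        g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)   # (3, n_pixels)
        albedo = np.linalg.norm(g, axis=0)
        normals = (g / np.maximum(albedo, 1e-8)).T
        return normals, albedo

    # Toy usage: one Lambertian pixel with normal (0, 0, 1) and albedo 0.8.
    L = np.array([[0.0, 0, 1], [0.7, 0, 0.7], [0, 0.7, 0.7], [-0.7, 0, 0.7]])
    true_n, true_rho = np.array([0.0, 0, 1]), 0.8
    I = true_rho * np.maximum(L @ true_n, 0)[:, None]
    print(photometric_normals(I, L))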
Robust Fusion of Dynamic Shape and Normal Capture for High-Quality Reconstruction of Time-Varying Geometry
- In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2008
"... This paper describes a new passive approach to capture time-varying scene geometry in large acquisition volumes from multi-view video. It can be applied to reconstruct complete moving models of human actors that feature even slightest dynamic geometry detail, such as wrinkles and folds in clothing, ..."
Cited by 23 (1 self)
Abstract:
This paper describes a new passive approach to capture time-varying scene geometry in large acquisition volumes from multi-view video. It can be applied to reconstruct complete moving models of human actors that feature even the slightest dynamic geometry detail, such as wrinkles and folds in clothing, and that can be viewed from 360°. Starting from multi-view video streams recorded under calibrated lighting, we first perform marker-less human motion capture based on a smooth template with no high-frequency surface detail. Subsequently, surface reflectance and time-varying normal fields are estimated based on the coarse template shape. The main contribution of this paper is a new statistical approach to solve the non-trivial problem of transforming the captured normal field, which is defined over the smooth non-planar 3D template, into true 3D displacements. Our spatio-temporal reconstruction method outputs displaced geometry that is accurate at each time step of video and temporally smooth, even if the input data are affected by noise.
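
A stripped-down version of the normals-to-displacements step is sketched below: on a flat height-field patch, normals are converted to surface gradients and integrated by a least-squares (Poisson-style) solve. The paper's statistical spatio-temporal method operates on a curved template; this toy patch only illustrates the basic integration idea.

    import numpy as np

    def integrate_normals(normals):
        """normals: (H, W, 3) unit normals.  Returns zero-mean heights (H, W)."""
        H, W = normals.shape[:2]
        p = -normals[..., 0] / normals[..., 2]       # dz/dx from the normal
        q = -normals[..., 1] / normals[..., 2]       # dz/dy from the normal
        rows, rhs = [], []
        def idx(y, x):
            return y * W + x
        for y in range(H):
            for x in range(W):
                if x + 1 < W:                        # forward difference in x
                    r = np.zeros(H * W)
                    r[idx(y, x + 1)], r[idx(y, x)] = 1.0, -1.0
                    rows.append(r)
                    rhs.append(p[y, x])
                if y + 1 < H:                        # forward difference in y
                    r = np.zeros(H * W)
                    r[idx(y + 1, x)], r[idx(y, x)] = 1.0, -1.0
                    rows.append(r)
                    rhs.append(q[y, x])
        z, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        return (z - z.mean()).reshape(H, W)

    # Toy usage: normals of a tilted plane z = 0.2 * x integrate back to a ramp.
    H, W = 4, 4
    n = np.zeros((H, W, 3))
    n[...] = [-0.2, 0.0, 1.0]
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    print(np.round(integrate_normals(n), 2))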
Shading-based Dynamic Shape Refinement from Multi-view Video under General Illumination
"... We present an approach to add true fine-scale spatiotemporal shape detail to dynamic scene geometry captured from multi-view video footage. Our approach exploits shading information to recover the millimeter-scale surface structure, but in contrast to related approaches succeeds under general uncons ..."
Cited by 22 (8 self)
Abstract:
We present an approach to add true fine-scale spatio-temporal shape detail to dynamic scene geometry captured from multi-view video footage. Our approach exploits shading information to recover the millimeter-scale surface structure, but in contrast to related approaches succeeds under general unconstrained lighting conditions. Our method starts from a set of multi-view video frames and an initial series of reconstructed coarse 3D meshes that lack any surface detail. In a spatio-temporal maximum a posteriori probability (MAP) inference framework, our approach first estimates the incident illumination and the spatially varying albedo map on the mesh surface for every time instant. Thereafter, albedo and illumination are used to estimate the true geometric detail visible in the images and add it to the coarse reconstructions. The MAP framework uses weak temporal priors on lighting, albedo and geometry which improve reconstruction quality yet allow for temporal variations in the data.
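
The shading model behind such refinement can be sketched with low-order spherical harmonics: image intensity is modeled as albedo times the dot product of a 9-dimensional lighting vector with an SH basis evaluated at the surface normal, and the lighting is recovered by linear least squares. Constant factors of the SH basis are folded into the coefficients and all data below are synthetic; the paper additionally estimates albedo and geometry within its MAP framework.

    import numpy as np

    def sh_basis(normals):
        """Order-2 SH basis (up to constant factors) evaluated at unit normals."""
        x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
        return np.stack([np.ones_like(x), y, z, x,
                         x * y, y * z, 3 * z**2 - 1, x * z, x**2 - y**2], axis=1)

    def estimate_lighting(intensities, albedo, normals):
        """Solve intensities ≈ albedo * (SH(normals) @ light) for light."""
        A = albedo[:, None] * sh_basis(normals)
        light, *_ = np.linalg.lstsq(A, intensities, rcond=None)
        return light

    # Toy usage: random unit normals shaded by a known lighting vector.
    rng = np.random.default_rng(1)
    n = rng.normal(size=(500, 3))
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    albedo = rng.uniform(0.2, 1.0, size=500)
    true_light = rng.normal(size=9)
    I = albedo * (sh_basis(n) @ true_light)
    print(np.round(estimate_lighting(I, albedo, n) - true_light, 3))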
Temporally Coherent Completion of Dynamic Shapes
"... We present a novel shape completion technique for creating temporally coherent watertight surfaces from real-time captured dynamic performances. Because of occlusions and low surface albedo, scanned mesh sequences typically exhibit large holes that persist over extended periods of time. Most convent ..."
Cited by 20 (4 self)
Abstract:
We present a novel shape completion technique for creating temporally coherent watertight surfaces from real-time captured dynamic performances. Because of occlusions and low surface albedo, scanned mesh sequences typically exhibit large holes that persist over extended periods of time. Most conventional dynamic shape reconstruction techniques rely on template models or assume slow deformations in the input data. Our framework sidesteps these requirements and directly initializes shape completion with topology derived from the visual hull. To seal the holes with patches that are consistent with the subject’s motion, we first minimize surface bending energies in each frame to ensure smooth transitions across hole boundaries. Temporally coherent dynamics of surface patches are obtained by unwarping all frames within a time window using accurate inter-frame correspondences. Aggregated surface samples are then filtered with a temporal visibility kernel that maximizes the use of non-occluded surfaces. A key benefit of our shape completion strategy is that it does not rely on long-range correspondences or a template model. Consequently, our method does not suffer from error accumulation typically introduced by noise, large deformations, and drastic topological changes. We illustrate the effectiveness of our method on several high-resolution scans of human performances captured with a state-of-the-art multi-view 3D acquisition system.
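
A smooth hole fill of the kind described above can be illustrated as follows: positions of hole vertices are chosen to minimize a discrete bending energy ||Lx||^2 (a bi-Laplacian proxy) with the observed boundary held fixed. The graph and the known/unknown split are toy assumptions standing in for a scanned mesh, and the bi-Laplacian is one common choice of bending energy rather than the paper's exact functional.

    import numpy as np

    def fill_hole(verts, edges, unknown_ids):
        """Replace the positions of unknown_ids by the minimizer of the
        bending energy, keeping all other vertices fixed."""
        n = len(verts)
        L = np.zeros((n, n))
        for i, j in edges:
            L[i, i] += 1.0
            L[j, j] += 1.0
            L[i, j] -= 1.0
            L[j, i] -= 1.0
        known = [i for i in range(n) if i not in set(unknown_ids)]
        Q = L @ L                                    # bi-Laplacian (bending) operator
        A = Q[np.ix_(unknown_ids, unknown_ids)]
        b = -Q[np.ix_(unknown_ids, known)] @ verts[known]
        filled = verts.copy()
        filled[unknown_ids] = np.linalg.solve(A, b)
        return filled

    # Toy usage: a strip of 6 vertices with the two middle ones missing;
    # the fill interpolates them smoothly between the fixed ends.
    verts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0], [4, 0, 0], [5, 0, 0]])
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(fill_hole(verts, edges, [2, 3]))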
Transparent and Specular Object Reconstruction
- 2010
"... This state of the art report covers reconstruction methods for transparent and specular objects or phenomena. While the 3D acquisition of opaque surfaces with lambertian reflectance is a well-studied problem, transparent, refractive, specular and potentially dynamic scenes pose challenging problems ..."
Cited by 20 (3 self)
Abstract:
This state-of-the-art report covers reconstruction methods for transparent and specular objects or phenomena. While the 3D acquisition of opaque surfaces with Lambertian reflectance is a well-studied problem, transparent, refractive, specular and potentially dynamic scenes pose challenging problems for acquisition systems. This report reviews and categorizes the literature in this field. Despite tremendous interest in object digitization, the acquisition of digital models of transparent or specular objects is far from being a solved problem. On the other hand, real-world data is in high demand for applications such as object modeling, preservation of historic artifacts and as input to data-driven modeling techniques. With this report we aim at providing a reference for and an introduction to the field of transparent and specular object reconstruction. We describe acquisition approaches for different classes of objects. Transparent objects and phenomena that do not change the straight ray geometry are found foremost among natural phenomena. Refraction effects are usually small and can be considered negligible for these objects. Phenomena as diverse as fire, smoke, and interstellar nebulae can be modeled using a straight ray model of image formation. Refractive and specular surfaces, on the other hand, bend the straight rays into usually piecewise linear ray paths, adding additional complexity to the reconstruction problem. Translucent objects exhibit significant sub-surface scattering effects, rendering traditional acquisition approaches unstable. Different classes of techniques have been developed to deal with these problems, and good reconstruction results can be achieved with current state-of-the-art techniques. However, the approaches are still specialized and targeted at very specific object classes. We classify the existing literature and hope to provide an entry point to this exciting field.
Human Motion Synthesis from 3D Video
- In Proc. CVPR, 2009
"... Multiple view 3D video reconstruction of actor performance captures a level-of-detail for body and clothing movement which is time-consuming to produce using existing animation tools. In this paper we present a framework for concatenative synthesis from multiple 3D video sequences according to user ..."
Cited by 18 (8 self)
Abstract:
Multiple view 3D video reconstruction of actor performance captures a level of detail for body and clothing movement which is time-consuming to produce using existing animation tools. In this paper we present a framework for concatenative synthesis from multiple 3D video sequences according to user constraints on movement, position and timing. Multiple 3D video sequences of an actor performing different movements are automatically constructed into a surface motion graph which represents the possible transitions with similar shape and motion between sequences without unnatural movement artefacts. Shape similarity over an adaptive temporal window is used to identify transitions between 3D video sequences. Novel 3D video sequences are synthesized by finding the optimal path in the surface motion graph between user-specified key-frames for control of movement, location and timing. The optimal path which satisfies the user constraints whilst minimizing the total transition cost between 3D video sequences is found using integer linear programming. Results demonstrate that this framework allows flexible production of novel 3D video sequences which preserve the detailed dynamics of the captured movement for an actress with loose clothing and long hair, without visible artefacts.
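
Path search on a surface motion graph can be sketched as a cheapest-path query between user-chosen segments. The paper solves this with integer linear programming under timing constraints; the plain Dijkstra search below, over a made-up graph with invented transition costs, only illustrates the graph-traversal idea.

    import heapq

    def cheapest_path(graph, start, goal):
        """graph: {node: [(neighbor, transition_cost), ...]}."""
        queue, best = [(0.0, start, [start])], {}
        while queue:
            cost, node, path = heapq.heappop(queue)
            if node == goal:
                return cost, path
            if node in best and best[node] <= cost:
                continue
            best[node] = cost
            for nxt, c in graph.get(node, []):
                heapq.heappush(queue, (cost + c, nxt, path + [nxt]))
        return float("inf"), []

    # Toy motion graph: walk, jog and jump segments with transition costs.
    graph = {"walk": [("jog", 1.0), ("jump", 4.0)],
             "jog": [("jump", 1.5), ("walk", 1.0)],
             "jump": [("walk", 2.0)]}
    print(cheapest_path(graph, "walk", "jump"))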
Superresolution Texture Maps for Multiview Reconstruction
- In Proc. ICCV, 2009
"... We study the scenario of a multiview setting, where several calibrated views of a textured object with known surface geometry are available. The objective is to estimate a diffuse texture map as precisely as possible. A superresolution image formation model based on the camera properties leads to a ..."
Cited by 15 (2 self)
Abstract:
We study the scenario of a multiview setting, where several calibrated views of a textured object with known surface geometry are available. The objective is to estimate a diffuse texture map as precisely as possible. A superresolution image formation model based on the camera properties leads to a total variation energy for the desired texture map, which can be recovered as the minimizer of the functional by solving the Euler-Lagrange equation on the surface. The PDE is transformed to planar texture space via an automatically created conformal atlas, where it can be solved using total variation deblurring. The proposed approach allows us to recover a high-resolution, high-quality texture map even from lower-resolution photographs, which is of interest for a variety of image-based modeling applications.
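
The superresolution formulation can be illustrated in one dimension: several low-resolution observations are modeled as blur plus downsampling of one high-resolution signal, which is recovered by gradient descent on a data term plus a smoothed total-variation regularizer. This toy stands in for the paper's texture-space PDE on a conformal atlas; the operators and parameters below are illustrative assumptions.

    import numpy as np

    def downsample(x, factor=2):
        return x.reshape(-1, factor).mean(axis=1)       # box blur + decimation

    def upsample(y, factor=2):
        return np.repeat(y, factor) / factor            # adjoint of downsample

    def tv_superresolve(observations, factor=2, lam=0.05, steps=500, lr=0.2):
        n = len(observations[0]) * factor
        x = np.zeros(n)
        for _ in range(steps):
            grad = np.zeros(n)
            for y in observations:                      # data term: sum_k ||D x - y_k||^2
                grad += 2 * upsample(downsample(x, factor) - y, factor)
            d = np.diff(x)                              # smoothed TV: sum sqrt(d^2 + eps)
            w = d / np.sqrt(d**2 + 1e-2)
            tv_grad = np.zeros(n)
            tv_grad[:-1] -= w
            tv_grad[1:] += w
            x -= lr * (grad + lam * tv_grad)
        return x

    # Toy usage: two noisy low-resolution views of a step edge.
    truth = np.concatenate([np.zeros(8), np.ones(8)])
    obs = [downsample(truth) + 0.02 * np.random.default_rng(k).normal(size=8) for k in (0, 1)]
    print(np.round(tv_superresolve(obs), 2))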
3-D Time-Varying Scene Capture Technologies -- A Survey
- 2007
"... Advances in image sensors and evolution of digital computation is a strong stimulus for development and implementation of sophisticated methods for capturing, processing and analysis of 3-D data from dynamic scenes. Research on perspective time-varying 3-D scene capture technologies is important for ..."
Cited by 9 (0 self)
Abstract:
Advances in image sensors and the evolution of digital computation are a strong stimulus for the development and implementation of sophisticated methods for capturing, processing and analyzing 3-D data from dynamic scenes. Research on promising time-varying 3-D scene capture technologies is important for the upcoming 3DTV displays. Methods such as shape-from-texture, shape-from-shading, shape-from-focus, and shape-from-motion extraction can restore 3-D shape information from single-camera data. The existing techniques for 3-D extraction from single-camera video sequences are especially useful for conversion of the already available vast mono-view content to 3DTV systems. Scene-oriented single-camera methods such as human face reconstruction and facial motion analysis, body modeling and body motion tracking, and motion recognition efficiently solve a variety of tasks. 3-D multi-camera dynamic acquisition and reconstruction, the hardware specifics of such systems, including calibration and synchronization, and their software demands form another area of intensive research. Different classes of multiview stereo algorithms, such as those based on cost-function computation and optimization, fusing of multiple views, and feature-point reconstruction, are possible candidates for dynamic 3-D reconstruction. High-resolution digital holography and pattern-projection techniques, such as coded light or fringe projection for real-time extraction of 3-D object positions and color information, could manifest themselves as an alternative to traditional camera-based methods. Apart from all of these approaches, there are also active imaging devices capable of 3-D extraction, such as the 3-D time-of-flight camera, which provides 3-D image data of its environment by means of a modulated infrared light source.