Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding
, 2000
Cited by 16 (3 self)
We show that traditional waveform-coding and 3-D model-based coding are not competing alternatives but should be combined to support and complement each other. Both approaches are combined such that the generality of waveform coding and the efficiency of 3-D model-based coding are available where needed. The combination is achieved by providing the block-based video coder with a second reference frame for prediction which is synthesized by the model-based coder. The model-based coder uses a parameterized 3-D head model specifying shape and color of a person. We therefore restrict our investigations to typical videotelephony scenarios that show head-and-shoulder scenes. Motion and deformation of the 3-D head model constitute facial expressions which are represented by facial animation parameters (FAPs) based on the MPEG-4 standard. An intensity gradient-based approach that exploits the 3-D model information is used to estimate the FAPs as well as illumination parameters that describe ch...
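The second-reference-frame idea in this abstract can be illustrated with a minimal sketch. It assumes blocks are flat lists of pixel intensities and that the coder picks a reference by sum of absolute differences (SAD). The SAD criterion and all names (`sad`, `choose_reference`) are illustrative assumptions, not the selection rule or API of the paper.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def choose_reference(current_block, previous_block, model_block):
    """Pick the better predictor for one block (hypothetical helper).

    previous_block: co-located block from the previously decoded frame.
    model_block: co-located block from the frame synthesized by the 3-D model.
    Returns (label, residual), where residual is what a coder would transmit.
    """
    candidates = [("previous", previous_block), ("model", model_block)]
    # min() keeps the first candidate on ties, so "previous" wins a draw.
    label, best = min(candidates, key=lambda c: sad(current_block, c[1]))
    residual = [c - p for c, p in zip(current_block, best)]
    return label, residual
```

For example, `choose_reference([10, 12, 14, 16], [9, 12, 15, 20], [10, 12, 13, 16])` selects the model-synthesized block because its SAD (1) is lower than that of the previous-frame block (6).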
Audiovisual Speech Synthesis
- International Journal of Speech Technology
, 2001
Cited by 15 (3 self)
This paper presents the main approaches used to synthesize talking faces, and provides greater detail on a handful of these approaches. No system is described exhaustively, however, and, for purposes of conciseness, not all existing systems are reviewed. An attempt is made to distinguish between facial synthesis itself (i.e., the manner in which facial movements are rendered on a computer screen) and the way these movements may be controlled and predicted using phonetic input.
3D Head Tracking Based on Recognition and Interpolation Using a Time-of-Flight Depth Sensor
Cited by 14 (0 self)
This paper describes a head-tracking algorithm that is based on recognition and correlation-based weighted interpolation. The input is a sequence of 3D depth images generated by a novel time-of-flight depth sensor. These are processed to segment the background and foreground, and the latter is used as the input to the head tracking algorithm, which is composed of three major modules: First, a depth signature is created out of the depth images. Next, the signature is compared against signatures that are collected in a training set of depth images. Finally, a correlation metric is calculated between most possible signature hits. The head location is calculated by interpolating among stored depth values, using the correlation metrics as the weights. This combination of depth sensing and recognition-based head tracking provides more than 90 percent success. Even if the track is temporarily lost, it is easily recovered when a good match is obtained from the training set. The use of depth images and recognition-based head tracking achieves robust real-time tracking results under extreme conditions such as 180-degree rotation, temporary occlusions, and complex backgrounds.
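The final interpolation step described in this abstract can be sketched as a correlation-weighted average. The sketch assumes each candidate match is a pair of a non-negative correlation score and a stored 3-D head position. The function name and the data layout are illustrative assumptions, and the signature-matching stages that produce the scores are not modeled.

```python
def interpolate_head_location(matches):
    """Weighted average of stored head positions (hypothetical helper).

    matches: list of (correlation, (x, y, z)) pairs for the best
    training-set hits; correlations act as interpolation weights.
    """
    total = sum(weight for weight, _ in matches)
    if total <= 0:
        raise ValueError("correlation weights must sum to a positive value")
    return tuple(
        sum(weight * position[axis] for weight, position in matches) / total
        for axis in range(3)
    )
```

A hit with a higher correlation pulls the estimate toward its stored position, so a single strong match dominates while several moderate matches are blended.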
Animated Deformations with Radial Basis Functions
- In ACM Virtual Reality and Software Technology (VRST)
, 2000
Cited by 12 (0 self)
We present a novel approach to creating deformations of polygonal models using Radial Basis Functions (RBFs) to produce localized real-time deformations. Radial Basis Functions assume surface smoothness as a minimal constraint and animations produce smooth displacements of affected vertices in a model. Animations are produced by controlling an arbitrary sparse set of control points defined on or near the surface of the model. The ability to directly manipulate a facial surface with a small number of point motions facilitates an intuitive method for creating facial expressions for virtual environment applications such as an immersive teleconferencing system or entertainment. Smooth deformations of the human face or other models are possible and illustrated with examples of a variety of expressions and mouth shapes.
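The control-point deformation described in this abstract can be sketched with a generic RBF interpolant. The sketch assumes a Gaussian kernel, a user-chosen width `sigma`, and a plain Gauss-Jordan solver. These choices and all function names are assumptions for illustration, since the abstract does not state which kernel or solver the authors use.

```python
import math


def gaussian(r, sigma):
    """Gaussian radial basis function of distance r."""
    return math.exp(-(r / sigma) ** 2)


def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: control points may coincide")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [[m[i][n + k] / m[i][i] for k in range(len(b[0]))] for i in range(n)]


def rbf_deform(vertices, controls, displacements, sigma=1.0):
    """Displace vertices so that each control point moves by its displacement."""
    phi = [[gaussian(math.dist(ci, cj), sigma) for cj in controls] for ci in controls]
    weights = solve(phi, [list(d) for d in displacements])
    deformed = []
    for v in vertices:
        basis = [gaussian(math.dist(v, c), sigma) for c in controls]
        deformed.append(tuple(
            v[k] + sum(w[k] * b for w, b in zip(weights, basis))
            for k in range(3)
        ))
    return deformed
```

With a Gaussian kernel the influence of each control point decays with distance, which gives the localized behaviour the abstract mentions: vertices far from every control point stay essentially where they were.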
Analysis and Synthesis of Facial Expressions with Hand-Generated Muscle Actuation Basis
- In IEEE Computer Animation Conference
, 2001
Cited by 12 (1 self)
We present a performance-driven facial animation system that analyzes captured expressions to find muscle actuation and synthesizes expressions from the actuation values. Our approach differs significantly in that we let artists sculpt the initial draft of the actuation basis (the basic facial shapes corresponding to the isolated actuation of individual muscles) instead of calculating skin surface deformation entirely from mathematical models such as finite element methods. We synthesize expressions by linear combinations of the basis elements, and analyze expressions by finding the weights for the combinations. Even though the hand-generated actuation basis represents the essence of the subject's characteristic expressions, it is not accurate enough to be used in the subsequent computational procedures, so we also describe an iterative algorithm to increase its accuracy. The experimental results suggest that our artist-in-the-loop method produces more predictable and controllable outcomes than purely mathematical models and can thus be a useful tool in animation production.
Visual coding and tracking of speech related facial motion
- IEEE CVPR International Workshop on Cues in Communication, Hawaii, USA, December 9
, 2001
Cited by 11 (0 self)
This article presents a visual characterization of the facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lip rounding, lip closure, and lip raising, to represent the primary visual gestures of speech articulation in a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. First, the parametric model is adapted and aligned to a subject’s face for a single shape. Then the face motion is tracked by optimally aligning the incoming video frames with the face model, textured with the first image, and deformed by varying the FSP, head rotations, and translations. We show results of the tracking for different subjects using our method. Finally, we demonstrate the encoding of facial activity into the four FSP values to represent speaker-independent phonetic information.
Digital Watermarking of Low Bit-Rate Advanced Simple Profile MPEG-4 Compressed Video
- IEEE Trans. on Circuits and Systems for Video Technology
, 2003
Cited by 10 (2 self)
A novel MPEG-4 compressed domain video watermarking method is proposed and its performance is studied at video bit rates ranging from 128 to 768 kb/s. The spatial spread-spectrum watermark is embedded directly to compressed MPEG-4 bitstreams by modifying DCT coefficients. A synchronization template combats geometric attacks, such as cropping, scaling, and rotation. The method also features a gain control algorithm that adjusts the embedding strength of the watermark depending on local image characteristics, increasing watermark robustness or, equivalently, reducing the watermark's impact on visual quality. A drift compensator prevents the accumulation of watermark distortion and reduces watermark self-interference due to temporal prediction in inter-coded frames and AC/DC prediction in intra-coded frames. A bit-rate controller maintains the bit rate of the watermarked video within an acceptable limit. The watermark was evaluated and found to be robust against a variety of attacks, including transcoding, scaling, rotation, and noise reduction.
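The core spread-spectrum step mentioned in this abstract can be sketched in a few lines. The sketch assumes the DCT coefficients are already available as a plain list of numbers, uses a keyed pseudo-random pattern of +1 and -1 values, and detects by correlation. All names and the fixed embedding strength are illustrative assumptions. The paper's gain control, drift compensation, synchronization template, and bit-rate control are not modeled.

```python
import random


def make_pattern(n, key):
    """Keyed pseudo-random +/-1 spreading sequence (hypothetical helper)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(coeffs, pattern, strength):
    """Add the scaled pattern to the coefficients."""
    return [c + strength * p for c, p in zip(coeffs, pattern)]


def detect(coeffs, pattern):
    """Mean correlation between the coefficients and the pattern."""
    return sum(c * p for c, p in zip(coeffs, pattern)) / len(pattern)
```

Because every pattern entry squares to 1, embedding with strength `s` raises the detector response by exactly `s` relative to the unmarked coefficients. A detector that knows the key can therefore compare the response against a threshold, which is an assumption of this sketch and not a rule from the paper.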
Computer animation: from avatars to unrestricted autonomous actors (A survey on replication and modelling mechanisms)
, 2000
Cited by 8 (0 self)
Dealing with synthetic actors who move and behave realistically in virtual environments is a task which involves different disciplines like Mechanics, Physics, Robotics, Artificial Intelligence, Artificial Life, Biology, Cognitive Sciences and so on. In this paper we use the nature of the information required for controlling actors' motion and behaviour to propose a new classification of synthetic actors. A description of the different motion and behaviour techniques is presented. A set of Internet addresses of the most relevant research groups, commercial companies and other related sites in this
Shape and Appearance Models of Talking Faces for Model-Based Tracking
- IN PROC. OF AVSP, ST JORIOZ
, 2003
Cited by 7 (2 self)
This paper presents a system that can recover and track the 3D speech movements of a speaker's face for each image of a monocular sequence. A speaker-specific face model is used for tracking: model parameters are extracted from each image by an analysis-by-synthesis loop. To handle both the individual specificities of the speaker's articulation and the complexity of the facial deformations during speech, speaker-specific models of the face 3D geometry and appearance are built from real data. The geometric model is linearly controlled by only six articulatory parameters. Appearance is seen either as a classical texture map or through local appearance of a relevant subset of 3D points. We compare several appearance models: they are either constant or depend linearly on the articulatory parameters. We evaluate these different appearance models with ground truth data.
3d image models and compression - synthetic hybrid or natural fit
- Proc. International Conference on Image Processing ICIP-99, Kobe
, 1999
Cited by 7 (2 self)
This paper highlights recent advances in image compression aided by 3-D geometry information. As two examples, we present a model-aided video coder for efficient compression of head-and-shoulder scenes and a geometry-aided coder for 4-D light fields for image-based rendering. Both examples illustrate that an explicit representation of 3-D geometry is advantageous if many views of the same 3-D object or scene have to be encoded. Waveform-coding and 3-D model-based coding can be combined in a rate-distortion framework, such that the generality of waveform coding and the efficiency of 3-D models are available where needed.

