Classifying Facial Actions
- IEEE Trans. Pattern Anal and Machine Intell
, 1999
Cited by 201 (18 self)
The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.
Rotation invariant neural network-based face detection
, 1998
Cited by 150 (3 self)
In this paper, we present a neural network-based face detection system. Unlike similar systems which are limited to detecting upright, frontal faces, this system detects faces at any degree of rotation in the image plane. The system employs multiple networks; a “router” network first processes each input window to determine its orientation and then uses this information to prepare the window for one or more “detector” networks. We present the training methods for both types of networks. We also perform sensitivity analysis on the networks, and present empirical results on a large test set. Finally, we present preliminary results for detecting faces rotated out of the image plane, such as profiles and semi-profiles.
Image Mosaicing for Tele-Reality Applications
, 1994
Cited by 145 (11 self)
While a large number of virtual reality applications, such as fluid flow analysis and molecular modeling, deal with simulated data, many newer applications attempt to recreate true reality as convincingly as possible. Building detailed models for such applications, which we call tele-reality, is a major bottleneck holding back their deployment. In this paper, we present techniques for automatically deriving realistic 2-D scenes and 3-D texture-mapped models from video sequences, which can help overcome this bottleneck. The fundamental technique we use is image mosaicing, i.e., the automatic alignment of multiple images into larger aggregates which are then used to represent portions of a 3-D scene. We begin with the easiest problems, those of flat scene and panoramic scene mosaicing, and progress to more complicated scenes, culminating in full 3-D models. We also present a number of novel applications based on tele-reality technology.
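As a rough illustration of the flat-scene case mentioned above: two views of a planar scene are related by an 8-parameter planar perspective transform (a homography). The sketch below estimates that transform from point correspondences with the textbook direct linear transform. This is a swapped-in technique chosen for brevity, not the registration procedure of the paper, and the function names `estimate_homography` and `warp_points` are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H such that dst ~ H @ src.

    src, dst: sequences of (x, y) points, at least 4 pairs,
    with no three points collinear. Illustrative DLT only.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    # Fix the overall scale (assumes the bottom-right entry is nonzero).
    return h / h[2, 2]

def warp_points(h, pts):
    """Apply homography h to an array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Once a homography between two frames is known, one frame can be resampled into the coordinate system of the other and the pair composited into a larger mosaic; in practice the point coordinates are usually normalized before the SVD for better conditioning.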
An Image-Based Approach To Three-Dimensional Computer Graphics
, 1997
Cited by 144 (4 self)
Leonard McMillan Jr. (under the direction of Gary Bishop). The conventional approach to three-dimensional computer graphics produces images from geometric scene descriptions by simulating the interaction of light with matter. My research explores an alternative approach that replaces the geometric scene description with perspective images and replaces the simulation process with data interpolation. I derive an image-warping equation that maps the visible points in a reference image to their correct positions in any desired view. This mapping from reference image to desired image is determined by the center-of-projection and pinhole-camera model of the two images and by a generalized disparity value associated with each point in the reference image. This generalized disparity value, which represents the structure of the scene, can be determined from point correspondences between multiple reference images. The image-warpi...
Recognizing Facial Expressions in Image Sequences Using Local Parameterized Models of Image Motion
- International Journal of Computer Vision
, 1997
Cited by 133 (11 self)
This paper explores the use of local parametrized models of image motion for recovering and recognizing the non-rigid and articulated motion of human faces. Parametric flow models (for example affine) are popular for estimating motion in rigid scenes. We observe that within local regions in space and time, such models not only accurately model non-rigid facial motions but also provide a concise description of the motion in terms of a small number of parameters. These parameters are intuitively related to the motion of facial features during facial expressions and we show how expressions such as anger, happiness, surprise, fear, disgust, and sadness can be recognized from the local parametric motions in the presence of significant head motion. The motion tracking and expression recognition approach performed with high accuracy in extensive laboratory experiments involving 40 subjects as well as in television and movie sequences.
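To make the "small number of parameters" concrete, an affine flow model writes the horizontal and vertical image velocities as u = a0 + a1*x + a2*y and v = a3 + a4*x + a5*y. The sketch below fits those six parameters by ordinary least squares to a flow field that is assumed to be already available; the paper itself estimates them with a robust procedure, so this is a simplification, and the function names are hypothetical.

```python
import numpy as np

def fit_affine_flow(x, y, u, v):
    """Least-squares fit of u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y.

    x, y: pixel coordinates (relative to the region centre);
    u, v: flow components at those pixels. Returns [a0, ..., a5].
    """
    x, y, u, v = (np.asarray(a, dtype=float).ravel() for a in (x, y, u, v))
    design = np.column_stack([np.ones_like(x), x, y])
    (a0, a1, a2), *_ = np.linalg.lstsq(design, u, rcond=None)
    (a3, a4, a5), *_ = np.linalg.lstsq(design, v, rcond=None)
    return np.array([a0, a1, a2, a3, a4, a5])

def motion_descriptors(a):
    """Combine affine parameters into intuitive motion quantities."""
    return {
        "translation": (a[0], a[3]),
        "divergence": a[1] + a[5],   # isotropic expansion or contraction
        "curl": a[4] - a[2],         # rotation about the viewing direction
        "deformation": a[1] - a[5],  # stretching along one image axis
    }
```

Tracking how quantities such as divergence or curl of a mouth or eyebrow region change over time is one way such parameters can be related to expressions.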
Trainable Videorealistic Speech Animation
- Proceedings of SIGGRAPH 2002, San Antonio, Texas
, 2002
Cited by 110 (5 self)
We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as they have been phonetically aligned. The two key
Face Recognition From One Example View
, 1995
Cited by 110 (5 self)
To create a pose-invariant face recognizer, one strategy is the view-based approach, which uses a set of example views at different poses. But what if we only have one example view available, such as a scanned passport photo -- can we still recognize faces under different poses? Given one example view at a known pose, it is still possible to use the view-based approach by exploiting prior knowledge of faces to generate virtual views, or views of the face as seen from different poses. To represent prior knowledge, we use 2D example views of prototype faces under different rotations. We will develop example-based techniques for applying the rotation seen in the prototypes to essentially "rotate" the single real view which is available. Next, the combined set of one real and multiple virtual views is used as example views in a view-based, pose-invariant face recognizer. Our experiments suggest that for expressing prior knowledge of faces, 2D example-based approaches should be considered ...
Learning Spatially Localized, Parts-Based Representation
, 2001
Cited by 93 (2 self)
In this paper, we propose a novel method, called local nonnegative matrix factorization (LNMF), for learning spatially localized, parts-based subspace representation of visual patterns. An objective function is defined to impose a localization constraint, in addition to the non-negativity constraint in the standard NMF [1]. This gives a set of bases which not only allows a non-subtractive (part-based) representation of images but also manifests localized features. An algorithm is presented for learning such basis components. Experimental results compare LNMF with the NMF and PCA methods for face representation and recognition and demonstrate the advantages of LNMF.
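For orientation, the sketch below shows the standard NMF baseline that the abstract compares against, using the well-known multiplicative updates for the squared-error objective. It is not LNMF: the localization terms of the paper's objective are not given in the abstract and are not reproduced here. The function name `nmf` and its parameters are hypothetical.

```python
import numpy as np

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix v (n x m) as w @ h with w, h >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    Baseline NMF only; no localization constraint is applied.
    """
    rng = np.random.default_rng(seed)
    n, m = v.shape
    w = rng.random((n, rank)) + eps
    h = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Elementwise updates keep every entry nonnegative.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h
```

With face images stored as the columns of `v`, the columns of `w` play the role of basis images and each column of `h` holds the nonnegative coefficients of one face.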
Novel View Synthesis in Tensor Space
- In Proc. of IEEE Conference on Computer Vision and Pattern Recognition
, 1997
Cited by 83 (8 self)
We present a new method for synthesizing novel views of a 3D scene from few model images in full correspondence. The core of this work is the derivation of a tensorial operator that describes the transformation from a given tensor of three views to a novel tensor of a new configuration of three views. By repeated application of the operator on a seed tensor with a sequence of desired virtual camera positions we obtain a chain of warping functions (tensors) from the set of model images to create the desired virtual views.
1. Introduction. This paper addresses the problem of synthesizing a novel image, from an arbitrary viewing position, given a small number of model images (registered by means of an optic-flow engine) of the 3D scene. The most significant aspect of our approach is the ability to synthesize images that are far away from the viewing positions of the sample model images without ever computing explicitly any 3D information about the scene. This property provides a multi-imag...
Discovering Structure in Multiple Learning Tasks: The TC Algorithm
- In International Conference on Machine Learning
, 1996
Cited by 69 (3 self)
Recently, there has been an increased interest in "lifelong" machine learning methods that transfer knowledge across multiple learning tasks. Such methods have repeatedly been found to outperform conventional, single-task learning algorithms when the learning tasks are appropriately related. To increase robustness of such approaches, methods are desirable that can reason about the relatedness of individual learning tasks, in order to avoid the danger arising from tasks that are unrelated and thus potentially misleading. This paper describes the task-clustering (TC) algorithm. TC clusters learning tasks into classes of mutually related tasks. When facing a new learning task, TC first determines the most related task cluster, then exploits information selectively from this task cluster only. An empirical study carried out in a mobile robot domain shows that TC outperforms its non-selective counterpart in situations where only a small number of tasks is relevant.

