Voice puppetry, 1999
Cited by 190 (0 self)
Frames from a voice-driven animation, computed from a single baby picture and an adult model of facial control. Note the changes in upper facial expression. See figures 5, 6 and 7 for more examples of predicted mouth shapes. We introduce a method for predicting a control signal from another related signal, and apply it to voice puppetry: generating full facial animation from expressive information in an audio track. The voice puppet learns a facial control model from computer vision of real facial behavior, automatically incorporating vocal and facial dynamics such as co-articulation. Animation is produced by using audio to drive the model, which induces a probability distribution over the manifold of possible facial motions. We present a linear-time closed-form solution for the most probable trajectory over this manifold. The output is a series of facial control parameters, suitable for driving many different kinds of animation ranging from video-realistic image warps to 3D cartoon characters.
Music Summarization Using Key Phrases - In Proc. IEEE ICASSP, 2000
Cited by 66 (1 self)
Systems to automatically provide a representative summary or 'Key Phrase' of a piece of music are described. For a 'rock' song with 'verse' and 'chorus' sections, we aim to return the chorus or in any case the most repeated and hence most memorable section. The techniques are less applicable to music with more complicated structure, although possibly our general framework could still be used with different heuristics.
Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction, 1998
Cited by 59 (0 self)
We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum...
Impact of Dynamic Model Learning on Classification of Human Motion - In Proc. International Conference on Computer Vision and Pattern Recognition, 2000
Cited by 32 (0 self)
The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. However, most work on tracking and analysis of figure motion has employed either generic or highly specific hand-tailored dynamic models superficially coupled with hidden Markov models (HMMs) of motion regimes. Recently, an alternative class of learned dynamic models known as switching linear dynamic systems (SLDSs) has been cast in the framework of dynamic Bayesian networks (DBNs) and applied to analysis and tracking of the human figure. In this paper we further study the impact of learned SLDS models on analysis and tracking of human motion and contrast them to the more common HMM models. We develop a novel approximate structured variational inference algorithm for SLDS, a globally convergent DBN inference scheme, and compare it with standard SLDS inference techniques. Experimental results on learning and analysis of figure dynamics from video data indicate the significant potential of...
An Entropic Estimator for Structure Discovery, 1999
Cited by 28 (0 self)
We introduce a novel framework for simultaneous structure and parameter learning in hidden-variable conditional probability models, based on an entropic prior and a solution for its maximum a posteriori (MAP) estimator. The MAP estimate minimizes uncertainty in all respects: cross-entropy between model and data; entropy of the model; entropy of the data's descriptive statistics. Iterative estimation extinguishes weakly supported parameters, compressing and sparsifying the model. Trimming operators accelerate this process by removing excess parameters and, unlike most pruning schemes, guarantee an increase in posterior probability. Entropic estimation takes an overcomplete random model and simplifies it, inducing the structure of relations between hidden and observed variables. Applied to hidden Markov models (HMMs), it finds a concise finite-state machine representing the hidden structure of a signal. We entropically model music, handwriting, and video time-series, and show that the res...
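As a rough illustration of why an entropic prior favors sparse, low-entropy multinomials, the sketch below scores parameter vectors under an unnormalized prior of the form exp(-H(theta)), where H is the Shannon entropy. That functional form, the function names, and the example counts are assumptions made for illustration; the abstract itself does not spell them out, and the closed-form MAP solution it mentions is not reproduced here.

```python
import math

def entropy(theta):
    """Shannon entropy in nats, using the convention 0 * log(0) = 0."""
    return -sum(p * math.log(p) for p in theta if p > 0)

def log_entropic_prior(theta):
    """Unnormalized log-prior, ASSUMED to be -H(theta),
    i.e. the prior is proportional to prod_i theta_i ** theta_i."""
    return -entropy(theta)

def log_posterior(theta, counts):
    """Unnormalized log-posterior: multinomial log-likelihood of the
    observed counts plus the assumed entropic log-prior."""
    loglik = 0.0
    for c, p in zip(counts, theta):
        if c > 0:
            if p == 0:
                return float("-inf")  # observed event given zero probability
            loglik += c * math.log(p)
    return loglik + log_entropic_prior(theta)

# Hypothetical evidence: three outcomes observed, a fourth never seen.
counts = [8, 1, 1, 0]
peaked = [0.8, 0.1, 0.1, 0.0]
smoothed = [0.7, 0.1, 0.1, 0.1]
print(log_posterior(peaked, counts), log_posterior(smoothed, counts))
```

Under this assumed prior, a vector that assigns zero mass to the unobserved outcome scores higher than a smoothed one, which is the sense in which weakly supported parameters are driven toward extinction.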
Representation and Recognition of Complex Human Motion
Cited by 26 (4 self)
The quest for a vision system capable of representing and recognizing arbitrary motions benefits from a low-dimensional, non-specific representation of flow fields, to be used in high-level classification tasks. We present Zernike polynomials as an ideal candidate for such a representation. The basis of Zernike polynomials is complete and orthogonal, and can be used for describing many types of motion at many scales. Starting from image sequences, locally smooth image velocities are derived using a robust estimation procedure, from which compact representations of the flow are computed using the Zernike basis. Continuous-density hidden Markov models are trained using the temporal sequences of vectors thus obtained, and are used for subsequent classification. We present results of our method applied to image sequences of facial expressions, both with and without significant rigid head motion, and to sequences of lip motion from a known database. We demonstrate that the Zernike representation yields results competitive with those obtained using principal components, while not committing to specific types of motion. It is therefore ideal as a fundamental building block for a vision system capable of classifying arbitrary motion types.
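For readers unfamiliar with the Zernike basis mentioned above, the following is a minimal sketch of the standard Zernike polynomials on the unit disk, using the textbook radial formula. The function names and the unnormalized convention are choices made here for illustration; the paper's own normalization and its projection of flow fields onto the basis are not shown.

```python
import math

def zernike_radial(n, m, rho):
    """Radial Zernike polynomial R_n^m(rho) for 0 <= m <= n, 0 <= rho <= 1.
    Returns 0.0 when n - m is odd, following the usual convention."""
    if m < 0 or m > n:
        raise ValueError("require 0 <= m <= n")
    if (n - m) % 2 != 0:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k) * math.factorial(n - k) / (
            math.factorial(k)
            * math.factorial((n + m) // 2 - k)
            * math.factorial((n - m) // 2 - k)
        )
        total += coeff * rho ** (n - 2 * k)
    return total

def zernike(n, m, rho, phi):
    """Unnormalized Zernike polynomial Z_n^m at polar coordinates (rho, phi).
    Positive m gives the cosine (even) term, negative m the sine (odd) term."""
    radial = zernike_radial(n, abs(m), rho)
    if m >= 0:
        return radial * math.cos(m * phi)
    return radial * math.sin(-m * phi)

# R_2^0(rho) = 2*rho**2 - 1, so at rho = 0.5 the value is -0.5.
print(zernike_radial(2, 0, 0.5))
```

A flow field sampled on the unit disk could then be summarized by its inner products with a handful of low-order basis functions, which is the kind of compact representation the abstract describes.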
Discriminative, Generative and Imitative Learning, 2002
Cited by 21 (1 self)
I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain-specific knowledge in terms of structure and parameter priors over the joint space of variables. Bayesian networks and Bayesian statistics provide a rich and flexible language for specifying this knowledge and subsequently refining it with data and observations. The final result is a distribution that is a good generator of novel exemplars.
Differentiable Sparse Coding
Cited by 16 (0 self)
Prior work has shown that features which appear to be biologically plausible as well as empirically useful can be found by sparse coding with a prior such as a Laplacian (L1) that promotes sparsity. We show how smoother priors can preserve the benefits of these sparse priors while adding stability to the Maximum A-Posteriori (MAP) estimate that makes it more useful for prediction problems. Additionally, we show how to calculate the derivative of the MAP estimate efficiently with implicit differentiation. One prior that can be differentiated this way is KL-regularization. We demonstrate its effectiveness on a wide variety of applications, and find that online optimization of the parameters of the KL-regularized model can significantly improve prediction performance.
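The idea of differentiating a MAP estimate by implicit differentiation can be illustrated on a deliberately simple one-dimensional problem. The sketch below swaps in a quadratic (L2) penalty rather than the KL-regularization named in the abstract, purely because its minimizer has a closed form that makes the check easy; the objective, function names, and parameter values are all assumptions for illustration. At the minimizer w* of f(w, lam), the stationarity condition df/dw = 0 gives dw*/dlam = -(d2f/dw dlam) / (d2f/dw2).

```python
def map_estimate(x, lam):
    """Minimizer of the ASSUMED objective
    f(w, lam) = 0.5 * (w - x)**2 + 0.5 * lam * w**2."""
    return x / (1.0 + lam)

def implicit_grad(x, lam):
    """dw*/dlam via the implicit function theorem applied to df/dw = 0."""
    w_star = map_estimate(x, lam)
    f_ww = 1.0 + lam   # second derivative of f with respect to w
    f_wlam = w_star    # mixed derivative d/dlam of (df/dw), evaluated at w*
    return -f_wlam / f_ww

def finite_diff_grad(x, lam, eps=1e-6):
    """Central finite difference of the minimizer, used only as a check."""
    return (map_estimate(x, lam + eps) - map_estimate(x, lam - eps)) / (2 * eps)

print(implicit_grad(2.0, 0.5), finite_diff_grad(2.0, 0.5))
```

The implicit gradient needs only second derivatives at the solution, not a differentiated solver loop, which is what makes the approach attractive when the MAP estimate is found iteratively.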

