Recovering 3D Human Pose from Monocular Images

by Ankur Agarwal, Bill Triggs
Citations: 260 (0 self)

BibTeX

@MISC{Agarwal_recovering3d,
    author = {Ankur Agarwal and Bill Triggs},
    title = {Recovering 3D Human Pose from Monocular Images},
    year = {}
}

Abstract

We describe a learning based method for recovering 3D human body pose from single images and monocular image sequences. Our approach requires neither an explicit body model nor prior labelling of body parts in the image. Instead, it recovers pose by direct nonlinear regression against shape descriptor vectors extracted automatically from image silhouettes. For robustness against local silhouette segmentation errors, silhouette shape is encoded by histogram-of-shape-contexts descriptors. We evaluate several different regression methods: ridge regression, Relevance Vector Machine (RVM) regression and Support Vector Machine (SVM) regression over both linear and kernel bases. The RVMs provide much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. Loss of depth and limb labelling information often makes the recovery of 3D pose from single silhouettes ambiguous. We propose two solutions to this: the first embeds the method in a tracking framework, using dynamics from the previous state estimate to disambiguate the pose; the second uses a mixture of regressors framework to return multiple solutions for each silhouette. We show that the resulting system tracks long sequences stably, and is also capable of accurately reconstructing 3D human pose from single images, giving multiple possible solutions in ambiguous cases. For realism and good generalization over a wide range of viewpoints, we train the regressors on images resynthesized from real human motion capture data. The method is demonstrated on a 54-parameter full body pose model, both quantitatively on independent but similar test data, and qualitatively on real image sequences. Mean angular errors of 4–5 degrees are obtained — a factor of 3 better than the current state of the art for the much simpler upper body problem.
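
The regression step described in the abstract, mapping a shape descriptor vector to a pose vector over a kernel basis, can be illustrated with a minimal kernel ridge regression sketch. This is a generic textbook formulation, not the authors' implementation: the Gaussian kernel, the width `beta`, the regularizer `lam`, the helper names, and the toy one-dimensional "descriptors" and two-dimensional "poses" are all assumptions chosen for illustration. The paper's descriptors are histograms of shape contexts and its pose vector has 54 parameters.

```python
import math

def gaussian_kernel(x, z, beta=1.0):
    # Gaussian (RBF) kernel between two descriptor vectors; beta is an assumed width.
    return math.exp(-beta * sum((a - b) ** 2 for a, b in zip(x, z)))

def solve(A, B):
    # Solve A X = B by Gauss-Jordan elimination with partial pivoting.
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def fit_kernel_ridge(X, Y, lam=1e-6, beta=1.0):
    # Weights A solve (K + lam * I) A = Y, where K[i][j] = k(x_i, x_j).
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], beta) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, Y)

def predict(x, X, A, beta=1.0):
    # Pose estimate: kernel-weighted sum of the learned weight rows.
    k = [gaussian_kernel(x, xi, beta) for xi in X]
    return [sum(ki * row[d] for ki, row in zip(k, A)) for d in range(len(A[0]))]

# Toy stand-in data (hypothetical): 1-D "descriptors" and 2-D "poses".
X = [[0.5 * i] for i in range(5)]
Y = [[math.sin(x[0]), math.cos(x[0])] for x in X]

A = fit_kernel_ridge(X, Y)
estimate = predict(X[2], X, A)
```

Here `estimate` is the regressed pose for the third training descriptor. Replacing the kernel with the raw descriptor components would give the linear-basis variant, and a sparse method such as the RVM would keep only a subset of the rows of `A`.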

Keyphrases

human pose, monocular image, single image, kernel base, local silhouette segmentation error, silhouette shape, explicit body model, worthwhile improvement, ambiguous case, single silhouette, wide range, previous state estimate, histogram-of-shape-contexts descriptor, multiple solution, shape descriptor vector, body problem, 54-parameter full body pose model, human body, real human motion capture data, real image sequence, similar test data, prior labelling, mean angular error, multiple possible solution, several different regression method, direct nonlinear regression, relevance vector machine, support vector machine, monocular image sequence, tracking framework, image silhouette, much simpler, current state, body part, system track long sequence, ridge regression, good generalization
