Lund University Publications

Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation

Ionescu, Catalin; Carreira, Joao and Sminchisescu, Cristian LU (2014) 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, p. 1661-1668
Abstract
Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery. The availability of depth information plays a critical role, so an important question is whether a similar representation can be developed with sufficient robustness to estimate 3D pose from RGB images. This paper provides evidence for a positive answer, by leveraging (a) 2D human body part labeling in images, (b) second-order label-sensitive pooling over dynamically computed regions resulting from a hierarchical decomposition of the body, and (c) iterative structured-output modeling to contextualize the process based on 3D pose estimates. For robustness and generalization, we take advantage of a recent large-scale 3D human motion capture dataset, Human3.6M [18], which also provides human body part labeling annotations for its images. We provide extensive experimental studies in which alternative intermediate representations are compared, and report a substantial 33% error reduction over competitive discriminative baselines that regress 3D human pose against global HOG features.
author
Ionescu, Catalin; Carreira, Joao and Sminchisescu, Cristian
organization
publishing date
2014
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
host publication
2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
pages
1661 - 1668
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
conference name
27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
conference location
Columbus, OH, United States
conference dates
2014-06-23 - 2014-06-28
external identifiers
  • wos:000361555601090
  • scopus:84911420074
ISSN
1063-6919
DOI
10.1109/CVPR.2014.215
language
English
LU publication?
yes
id
49f77724-9704-4715-8aba-5bc13ac83206 (old id 8227403)
date added to LUP
2016-04-01 13:09:56
date last changed
2022-04-21 20:07:38
@inproceedings{49f77724-9704-4715-8aba-5bc13ac83206,
  abstract     = {{Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery. The availability of depth information plays a critical role, so an important question is whether a similar representation can be developed with sufficient robustness in order to estimate 3D pose from RGB images. This paper provides evidence for a positive answer, by leveraging (a) 2D human body part labeling in images, (b) second-order label-sensitive pooling over dynamically computed regions resulting from a hierarchical decomposition of the body, and (c) iterative structured-output modeling to contextualize the process based on 3D pose estimates. For robustness and generalization, we take advantage of a recent large-scale 3D human motion capture dataset, Human3.6M[18] that also has human body part labeling annotations available with images. We provide extensive experimental studies where alternative intermediate representations are compared and report a substantial 33% error reduction over competitive discriminative baselines that regress 3D human pose against global HOG features.}},
  author       = {{Ionescu, Catalin and Carreira, Joao and Sminchisescu, Cristian}},
  booktitle    = {{2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
  issn         = {{1063-6919}},
  language     = {{eng}},
  pages        = {{1661--1668}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation}},
  url          = {{http://dx.doi.org/10.1109/CVPR.2014.215}},
  doi          = {{10.1109/CVPR.2014.215}},
  year         = {{2014}},
}