Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation

Ionescu, Catalin; Carreira, Joao; Sminchisescu, Cristian

Ionescu, Catalin; Carreira, Joao and Sminchisescu, Cristian [LU] (2014) 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, p. 1661-1668

Abstract: Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery. The availability of depth information plays a critical role, so an important question is whether a similar representation can be developed with sufficient robustness in order to estimate 3D pose from RGB images. This paper provides evidence for a positive answer, by leveraging (a) 2D human body part labeling in images, (b) second-order label-sensitive pooling over dynamically computed regions resulting from a hierarchical decomposition of the body, and (c) iterative structured-output modeling to contextualize the process based on 3D pose estimates. For robustness and generalization, we take advantage of a recent large-scale 3D human motion capture dataset, Human3.6M[18] that also has human body part labeling annotations available with images. We provide extensive experimental studies where alternative intermediate representations are compared and report a substantial 33% error reduction over competitive discriminative baselines that regress 3D human pose against global HOG features.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8227403

author

Ionescu, Catalin; Carreira, Joao and Sminchisescu, Cristian [LU]

organization

publishing date

2014

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer graphics and computer vision

host publication

2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

pages

1661 - 1668

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014

conference location

Columbus, OH, United States

conference dates

2014-06-23 - 2014-06-28

external identifiers

wos:000361555601090
scopus:84911420074

ISSN

1063-6919

DOI

10.1109/CVPR.2014.215

language

English

LU publication?

yes

id

49f77724-9704-4715-8aba-5bc13ac83206 (old id 8227403)

date added to LUP

2016-04-01 13:09:56

date last changed

2025-10-14 10:10:51

@inproceedings{49f77724-9704-4715-8aba-5bc13ac83206,
  abstract     = {{Recently, the emergence of Kinect systems has demonstrated the benefits of predicting an intermediate body part labeling for 3D human pose estimation, in conjunction with RGB-D imagery. The availability of depth information plays a critical role, so an important question is whether a similar representation can be developed with sufficient robustness in order to estimate 3D pose from RGB images. This paper provides evidence for a positive answer, by leveraging (a) 2D human body part labeling in images, (b) second-order label-sensitive pooling over dynamically computed regions resulting from a hierarchical decomposition of the body, and (c) iterative structured-output modeling to contextualize the process based on 3D pose estimates. For robustness and generalization, we take advantage of a recent large-scale 3D human motion capture dataset, Human3.6M[18] that also has human body part labeling annotations available with images. We provide extensive experimental studies where alternative intermediate representations are compared and report a substantial 33% error reduction over competitive discriminative baselines that regress 3D human pose against global HOG features.}},
  author       = {{Ionescu, Catalin and Carreira, Joao and Sminchisescu, Cristian}},
  booktitle    = {{2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
  issn         = {{1063-6919}},
  language     = {{eng}},
  pages        = {{1661--1668}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Iterated Second-Order Label Sensitive Pooling for 3D Human Pose Estimation}},
  url          = {{http://dx.doi.org/10.1109/CVPR.2014.215}},
  doi          = {{10.1109/CVPR.2014.215}},
  year         = {{2014}},
}
