Lund University Publications

Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video

Gärtner, Erik LU; Andriluka, Mykhaylo; Xu, Hongyi and Sminchisescu, Cristian LU (2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Abstract
We focus on the task of estimating a physically plausible articulated human motion from monocular video. Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts, while state-of-the-art physics-based approaches have either been shown to work only in controlled laboratory conditions or consider simplified body-ground contact limited to feet. This paper explores how these shortcomings can be addressed by directly incorporating a fully-featured physics engine into the pose estimation process. Given an uncontrolled, real-world scene as input, our approach estimates the ground-plane location and the dimensions of the physical body model. It then recovers the physical motion by performing trajectory optimization. The advantage of our formulation is that it readily generalizes to a variety of scenes that might have diverse ground properties and supports any form of self-contact and contact between the articulated body and scene geometry. We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark [13], while being directly applicable without re-training to more complex dynamic motions from the AIST benchmark [36] and to uncontrolled internet videos.
author
Gärtner, Erik; Andriluka, Mykhaylo; Xu, Hongyi and Sminchisescu, Cristian
organization
publishing date
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
host publication
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
conference name
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
conference location
New Orleans, United States
conference dates
2022-06-19 - 2022-06-24
external identifiers
  • scopus:85132477539
ISBN
978-1-6654-6946-3
978-1-6654-6947-0
DOI
10.1109/CVPR52688.2022.01276
language
English
LU publication?
yes
id
f64363d1-5923-47d7-a032-bf55daab9391
date added to LUP
2022-05-06 10:48:26
date last changed
2024-04-18 12:10:09
@inproceedings{f64363d1-5923-47d7-a032-bf55daab9391,
  abstract     = {{We focus on the task of estimating a physically plausible articulated human motion from monocular video. Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts, while state-of-the-art physics-based approaches have either been shown to work only in controlled laboratory conditions or consider simplified body-ground contact limited to feet. This paper explores how these shortcomings can be addressed by directly incorporating a fully-featured physics engine into the pose estimation process. Given an uncontrolled, real-world scene as input, our approach estimates the ground-plane location and the dimensions of the physical body model. It then recovers the physical motion by performing trajectory optimization. The advantage of our formulation is that it readily generalizes to a variety of scenes that might have diverse ground properties and supports any form of self-contact and contact between the articulated body and scene geometry. We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark [13], while being directly applicable without re-training to more complex dynamic motions from the AIST benchmark [36] and to uncontrolled internet videos.}},
  author       = {{Gärtner, Erik and Andriluka, Mykhaylo and Xu, Hongyi and Sminchisescu, Cristian}},
  booktitle    = {{Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}},
  isbn         = {{978-1-6654-6946-3}},
  language     = {{eng}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video}},
  url          = {{http://dx.doi.org/10.1109/CVPR52688.2022.01276}},
  doi          = {{10.1109/CVPR52688.2022.01276}},
  year         = {{2022}},
}