Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
- Abstract
- We focus on the task of estimating a physically plausible articulated human motion from monocular video. Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts, while state-of-the-art physics-based approaches have either been shown to work only in controlled laboratory conditions or consider simplified body-ground contact limited to feet. This paper explores how these shortcomings can be addressed by directly incorporating a fully-featured physics engine into the pose estimation process. Given an uncontrolled, real-world scene as input, our approach estimates the ground-plane location and the dimensions of the physical body model. It then recovers the physical motion by performing trajectory optimization. The advantage of our formulation is that it readily generalizes to a variety of scenes that might have diverse ground properties and supports any form of self-contact and contact between the articulated body and scene geometry. We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark [13], while being directly applicable without re-training to more complex dynamic motions from the AIST benchmark [36] and to uncontrolled internet videos.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/f64363d1-5923-47d7-a032-bf55daab9391
- author
- Gärtner, Erik LU ; Andriluka, Mykhaylo ; Xu, Hongyi and Sminchisescu, Cristian LU
- organization
- publishing date
- 2022
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
- conference location
- New Orleans, United States
- conference dates
- 2022-06-19 - 2022-06-24
- external identifiers
- scopus:85132477539
- ISBN
- 978-1-6654-6947-0
- 978-1-6654-6946-3
- DOI
- 10.1109/CVPR52688.2022.01276
- language
- English
- LU publication?
- yes
- id
- f64363d1-5923-47d7-a032-bf55daab9391
- date added to LUP
- 2022-05-06 10:48:26
- date last changed
- 2024-10-04 04:58:48
@inproceedings{f64363d1-5923-47d7-a032-bf55daab9391,
  abstract     = {{We focus on the task of estimating a physically plausible articulated human motion from monocular video. Existing approaches that do not consider physics often produce temporally inconsistent output with motion artifacts, while state-of-the-art physics-based approaches have either been shown to work only in controlled laboratory conditions or consider simplified body-ground contact limited to feet. This paper explores how these shortcomings can be addressed by directly incorporating a fully-featured physics engine into the pose estimation process. Given an uncontrolled, real-world scene as input, our approach estimates the ground-plane location and the dimensions of the physical body model. It then recovers the physical motion by performing trajectory optimization. The advantage of our formulation is that it readily generalizes to a variety of scenes that might have diverse ground properties and supports any form of self-contact and contact between the articulated body and scene geometry. We show that our approach achieves competitive results with respect to existing physics-based methods on the Human3.6M benchmark [13], while being directly applicable without re-training to more complex dynamic motions from the AIST benchmark [36] and to uncontrolled internet videos.}},
  author       = {{Gärtner, Erik and Andriluka, Mykhaylo and Xu, Hongyi and Sminchisescu, Cristian}},
  booktitle    = {{Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}},
  isbn         = {{978-1-6654-6947-0}},
  language     = {{eng}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video}},
  url          = {{http://dx.doi.org/10.1109/CVPR52688.2022.01276}},
  doi          = {{10.1109/CVPR52688.2022.01276}},
  year         = {{2022}},
}