Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
- Abstract
- We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video. Applications of physics-based reasoning in human motion analysis have so far been limited, both by the complexity of constructing adequate physical models of articulated human motion, and by the formidable challenges of performing stable and efficient inference with physics in the loop. We jointly address such modeling and inference challenges by proposing an approach that combines a physically plausible body representation with anatomical joint limits, a differentiable physics simulator, and optimization techniques that ensure good performance and robustness to suboptimal local optima. In contrast to several recent methods [39], [42], [55], our approach readily supports full-body contact including interactions with objects in the scene. Most importantly, our model connects end-to-end with images, thus supporting direct gradient-based physics optimization by means of image-based loss functions. We validate the model by demonstrating that it can accurately reconstruct physically plausible 3d human motion from monocular video, both on public benchmarks with available 3d ground-truth, and on videos from the internet.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/0b0162d1-762b-4d78-bc10-fa8373b7af80
- author
- Gärtner, Erik (LU); Andriluka, Mykhaylo; Coumans, Erwin and Sminchisescu, Cristian (LU)
- publishing date
- 2022
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- host publication
- Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
- conference location
- New Orleans, United States
- conference dates
- 2022-06-19 - 2022-06-24
- external identifiers
- scopus:85132486279
- ISBN
- 978-1-6654-6946-3
- 978-1-6654-6947-0
- DOI
- 10.1109/CVPR52688.2022.01284
- language
- English
- LU publication?
- yes
- id
- 0b0162d1-762b-4d78-bc10-fa8373b7af80
- date added to LUP
- 2022-05-06 10:49:05
- date last changed
- 2025-04-18 23:06:32
@inproceedings{0b0162d1-762b-4d78-bc10-fa8373b7af80,
  abstract = {{We introduce DiffPhy, a differentiable physics-based model for articulated 3d human motion reconstruction from video. Applications of physics-based reasoning in human motion analysis have so far been limited, both by the complexity of constructing adequate physical models of articulated human motion, and by the formidable challenges of performing stable and efficient inference with physics in the loop. We jointly address such modeling and inference challenges by proposing an approach that combines a physically plausible body representation with anatomical joint limits, a differentiable physics simulator, and optimization techniques that ensure good performance and robustness to suboptimal local optima. In contrast to several recent methods [39], [42], [55], our approach readily supports full-body contact including interactions with objects in the scene. Most importantly, our model connects end-to-end with images, thus supporting direct gradient-based physics optimization by means of image-based loss functions. We validate the model by demonstrating that it can accurately reconstruct physically plausible 3d human motion from monocular video, both on public benchmarks with available 3d ground-truth, and on videos from the internet.}},
  author = {{Gärtner, Erik and Andriluka, Mykhaylo and Coumans, Erwin and Sminchisescu, Cristian}},
  booktitle = {{Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}},
  isbn = {{978-1-6654-6946-3}},
  language = {{eng}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title = {{Differentiable Dynamics for Articulated 3d Human Motion Reconstruction}},
  url = {{http://dx.doi.org/10.1109/CVPR52688.2022.01284}},
  doi = {{10.1109/CVPR52688.2022.01284}},
  year = {{2022}},
}