3D Human Pose and Shape Estimation Through Collaborative Learning and Multi-View Model-Fitting
(2021) 2021 IEEE Winter Conference on Applications of Computer Vision (WACV) In IEEE Winter Conference on Applications of Computer Vision (WACV) p.1887-1896
- Abstract
- 3D human pose and shape estimation plays a vital role in many computer vision applications. There are many deep learning based methods attempting to solve the problem only relying on single-view RGB images for training the network. However, since some public datasets are captured from multi-view cameras system, we propose a novel method to tackle the problem by putting optimization-based multi-view model-fitting into a regression-based learning loop from multi-view images. Firstly, a convolutional neural network (CNN) regresses the pose and shape of a parametric human body model (SMPL) from multi-view images. Then, utilizing the regressed pose and shape as initialization, we propose an improved multi-view optimization method based on the SMPLify method (MV-SMPLify) to fit the SMPL model to the multi-view images simultaneously. Subsequently, the optimized parameters can be adopted to supervise the training of the CNN model. This whole process forms a self-supervising framework which can combine the advantages of the CNN approach and the optimization-based approach through a collaborative process. In addition, the multi-view images can provide more comprehensive supervision for the training. Experiments on public datasets qualitatively and quantitatively demonstrate that our method outperforms previous approaches in a number of ways.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/b3bd685a-6af5-4bf7-8d68-a8e211dddb8d
- author
- Li, Zhongguo LU ; Oskarsson, Magnus LU and Heyden, Anders LU
- organization
- publishing date
- 2021
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- WACV - IEEE Winter Conference on Applications of Computer Vision
- series title
- IEEE Winter Conference on Applications of Computer Vision (WACV)
- pages
- 10 pages
- publisher
- IEEE Computer Society
- conference name
- 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
- conference location
- Waikoloa, United States
- conference dates
- 2021-01-03 - 2021-01-08
- external identifiers
- scopus:85116152775
- ISSN
- 2642-9381
- ISBN
- 978-1-6654-4640-2
- 978-1-6654-0477-8
- DOI
- 10.1109/WACV48630.2021.00193
- language
- English
- LU publication?
- yes
- id
- b3bd685a-6af5-4bf7-8d68-a8e211dddb8d
- date added to LUP
- 2021-04-26 04:07:45
- date last changed
- 2024-09-22 18:21:14
@inproceedings{b3bd685a-6af5-4bf7-8d68-a8e211dddb8d,
  abstract = {{3D human pose and shape estimation plays a vital role in many computer vision applications. There are many deep learning based methods attempting to solve the problem only relying on single-view RGB images for training the network. However, since some public datasets are captured from multi-view cameras system, we propose a novel method to tackle the problem by putting optimization-based multi-view model-fitting into a regression-based learning loop from multi-view images. Firstly, a convolutional neural network (CNN) regresses the pose and shape of a parametric human body model (SMPL) from multi-view images. Then, utilizing the regressed pose and shape as initialization, we propose an improved multi-view optimization method based on the SMPLify method (MV-SMPLify) to fit the SMPL model to the multi-view images simultaneously. Subsequently, the optimized parameters can be adopted to supervise the training of the CNN model. This whole process forms a self-supervising framework which can combine the advantages of the CNN approach and the optimization-based approach through a collaborative process. In addition, the multi-view images can provide more comprehensive supervision for the training. Experiments on public datasets qualitatively and quantitatively demonstrate that our method outperforms previous approaches in a number of ways.}},
  author = {{Li, Zhongguo and Oskarsson, Magnus and Heyden, Anders}},
  booktitle = {{WACV - IEEE Winter Conference on Applications of Computer Vision}},
  isbn = {{978-1-6654-4640-2}},
  issn = {{2642-9381}},
  language = {{eng}},
  pages = {{1887--1896}},
  publisher = {{IEEE Computer Society}},
  series = {{IEEE Winter Conference on Applications of Computer Vision (WACV)}},
  title = {{3D Human Pose and Shape Estimation Through Collaborative Learning and Multi-View Model-Fitting}},
  url = {{http://dx.doi.org/10.1109/WACV48630.2021.00193}},
  doi = {{10.1109/WACV48630.2021.00193}},
  year = {{2021}},
}