Three-dimensional reconstruction of human interactions

Fieraru, Mihai; Zanfir, Mihai; Oneata, Elisabeta; Popa, Alin Ionut; Olaru, Vlad and Sminchisescu, Cristian (2020). 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 7212-7221
Abstract

Understanding 3d human interactions is fundamental for fine-grained scene analysis and behavioural modeling. However, most existing models focus on analyzing a single person in isolation, and those that process several people focus largely on resolving multi-person data association rather than inferring interactions. This may lead to incorrect, lifeless 3d estimates that miss the subtle human contact aspects–the essence of the event–and are of little use for detailed behavioral understanding. This paper addresses such issues and makes several contributions: (1) we introduce models for interaction signature estimation (ISP) encompassing contact detection, segmentation, and 3d contact signature prediction; (2) we show how such components can be leveraged to produce augmented losses that ensure contact consistency during 3d reconstruction; (3) we construct several large datasets for learning and evaluating 3d contact prediction and reconstruction methods; specifically, we introduce CHI3D, a lab-based accurate 3d motion capture dataset with 631 sequences containing 2,525 contact events, 728,664 ground truth 3d poses, as well as FlickrCI3D, a dataset of 11,216 images, with 14,081 processed pairs of people, and 81,233 facet-level surface correspondences within 138,213 selected contact regions. Finally, (4) we present models and baselines to illustrate how contact estimation supports meaningful 3d reconstruction where essential interactions are captured. Models and data are made available for research purposes at http://vision.imar.ro/ci3d.

author
Fieraru, Mihai; Zanfir, Mihai; Oneata, Elisabeta; Popa, Alin Ionut; Olaru, Vlad and Sminchisescu, Cristian
publishing date
2020
type
Chapter in Book/Report/Conference proceeding
publication status
published
host publication
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
series title
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
pages
10 pages
publisher
Institute of Electrical and Electronics Engineers Inc.
conference name
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
conference location
Virtual, Online, United States
conference dates
2020-06-14 - 2020-06-19
external identifiers
  • scopus:85094601522
ISSN
1063-6919
ISBN
978-1-7281-7168-5
DOI
10.1109/CVPR42600.2020.00724
language
English
LU publication?
yes
id
4256fa9d-acf2-41da-9622-571faedda33a
date added to LUP
2020-11-23 08:46:48
date last changed
2020-12-29 04:19:18
@inproceedings{4256fa9d-acf2-41da-9622-571faedda33a,
  abstract     = {Understanding 3d human interactions is fundamental for fine-grained scene analysis and behavioural modeling. However, most existing models focus on analyzing a single person in isolation, and those that process several people focus largely on resolving multi-person data association rather than inferring interactions. This may lead to incorrect, lifeless 3d estimates that miss the subtle human contact aspects–the essence of the event–and are of little use for detailed behavioral understanding. This paper addresses such issues and makes several contributions: (1) we introduce models for interaction signature estimation (ISP) encompassing contact detection, segmentation, and 3d contact signature prediction; (2) we show how such components can be leveraged to produce augmented losses that ensure contact consistency during 3d reconstruction; (3) we construct several large datasets for learning and evaluating 3d contact prediction and reconstruction methods; specifically, we introduce CHI3D, a lab-based accurate 3d motion capture dataset with 631 sequences containing 2,525 contact events, 728,664 ground truth 3d poses, as well as FlickrCI3D, a dataset of 11,216 images, with 14,081 processed pairs of people, and 81,233 facet-level surface correspondences within 138,213 selected contact regions. Finally, (4) we present models and baselines to illustrate how contact estimation supports meaningful 3d reconstruction where essential interactions are captured. Models and data are made available for research purposes at http://vision.imar.ro/ci3d.},
  author       = {Fieraru, Mihai and Zanfir, Mihai and Oneata, Elisabeta and Popa, Alin Ionut and Olaru, Vlad and Sminchisescu, Cristian},
  booktitle    = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  isbn         = {978-1-7281-7168-5},
  issn         = {1063-6919},
  language     = {eng},
  pages        = {7212--7221},
  publisher    = {Institute of Electrical and Electronics Engineers Inc.},
  series       = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
  title        = {Three-dimensional reconstruction of human interactions},
  url          = {http://dx.doi.org/10.1109/CVPR42600.2020.00724},
  doi          = {10.1109/CVPR42600.2020.00724},
  year         = {2020},
}