Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Moliner, Olivier; Huang, Sangxia; Astrom, Kalle

(2022) 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2022-June, p. 4153-4163

Abstract: In this work, we study self-supervised representation learning for 3D skeleton-based action recognition. We extend Bootstrap Your Own Latent (BYOL) for representation learning on skeleton sequence data and propose a new data augmentation strategy including two asymmetric transformation pipelines. We also introduce a multi-viewpoint sampling method that leverages multiple viewing angles of the same action captured by different cameras. In the semi-supervised setting, we show that the performance can be further improved by knowledge distillation from wider networks, leveraging once more the unlabeled samples. We conduct extensive experiments on the NTU-60, NTU-120 and PKU-MMD datasets to demonstrate the performance of our proposed method. Our method consistently outperforms the current state of the art on linear evaluation, semi-supervised and transfer learning benchmarks.
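The abstract names BYOL as the starting point. As a rough illustration only, the generic BYOL objective (a normalized regression loss between an online prediction and a target projection, with the target network tracking the online network by exponential moving average) can be sketched as below. This is not the authors' implementation: the function names `byol_loss` and `ema_update`, the use of NumPy, and the default `tau` value are assumptions made for the example, and the skeleton encoders, the augmentation pipelines and the multi-viewpoint sampling described in the abstract are not shown.

```python
import numpy as np


def byol_loss(online_pred, target_proj):
    """Generic BYOL regression loss (hypothetical helper, not the paper's code).

    Both inputs have shape (batch, dim). Each row is L2-normalized, so the
    squared error between the two rows equals 2 - 2 * cosine similarity.
    """
    p = online_pred / np.linalg.norm(online_pred, axis=-1, keepdims=True)
    z = target_proj / np.linalg.norm(target_proj, axis=-1, keepdims=True)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1)))


def ema_update(target_params, online_params, tau=0.99):
    """Move each target parameter toward its online counterpart.

    tau is the decay rate; the default here is an illustrative assumption.
    """
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.normal(size=(4, 8))   # stand-in for online predictions
    proj = rng.normal(size=(4, 8))   # stand-in for target projections
    print("loss:", byol_loss(pred, proj))
    print("updated:", ema_update([np.zeros(2)], [np.ones(2)], tau=0.9))
```

In BYOL the loss is usually symmetrized by swapping the two augmented views and summing both terms; gradients flow only through the online branch, while the target branch is updated solely by the moving average.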

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8f121d1e-204c-4910-a1b0-75f138add4d4

author

Moliner, Olivier (LU); Huang, Sangxia and Astrom, Kalle (LU)

organization

publishing date

2022

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer graphics and computer vision

host publication

Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022

series title

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

volume

2022-June

pages

11 pages

publisher

IEEE Computer Society

conference name

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022

conference location

New Orleans, United States

conference dates

2022-06-19 - 2022-06-20

external identifiers

scopus:85137757589

ISSN

2160-7508

2160-7516

ISBN

9781665487399

DOI

10.1109/CVPRW56347.2022.00460

language

English

LU publication?

yes

id

8f121d1e-204c-4910-a1b0-75f138add4d4

date added to LUP

2022-11-30 11:21:47

date last changed

2025-07-24 09:57:54

@inproceedings{8f121d1e-204c-4910-a1b0-75f138add4d4,
  abstract     = {{In this work, we study self-supervised representation learning for 3D skeleton-based action recognition. We extend Bootstrap Your Own Latent (BYOL) for representation learning on skeleton sequence data and propose a new data augmentation strategy including two asymmetric transformation pipelines. We also introduce a multi-viewpoint sampling method that leverages multiple viewing angles of the same action captured by different cameras. In the semi-supervised setting, we show that the performance can be further improved by knowledge distillation from wider networks, leveraging once more the unlabeled samples. We conduct extensive experiments on the NTU-60, NTU-120 and PKU-MMD datasets to demonstrate the performance of our proposed method. Our method consistently outperforms the current state of the art on linear evaluation, semi-supervised and transfer learning benchmarks.}},
  author       = {{Moliner, Olivier and Huang, Sangxia and Astrom, Kalle}},
  booktitle    = {{Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022}},
  isbn         = {{9781665487399}},
  issn         = {{2160-7508}},
  language     = {{eng}},
  pages        = {{4153--4163}},
  publisher    = {{IEEE Computer Society}},
  series       = {{IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops}},
  title        = {{Bootstrapped Representation Learning for Skeleton-Based Action Recognition}},
  url          = {{http://dx.doi.org/10.1109/CVPRW56347.2022.00460}},
  doi          = {{10.1109/CVPRW56347.2022.00460}},
  volume       = {{2022-June}},
  year         = {{2022}},
}
