Interpretability in Contact-Rich Manipulation via Kinodynamic Images

Mitsioni, Ioanna; Mänttäri, Joonatan; Karayiannidis, Yiannis; Folkesson, John; Kragic, Danica

Mitsioni, Ioanna; Mänttäri, Joonatan; Karayiannidis, Yiannis (LU); Folkesson, John and Kragic, Danica (2021) p. 10175-10181

Abstract: Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing the kinodynamic images. We propose a methodology that creates images from kinematic and dynamic data of contact-rich manipulation tasks. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train a Convolutional Neural Network (CNN) and we extract interpretations with Grad-CAM to produce visual explanations. Our method is versatile and can be applied to any classification problem in manipulation tasks to visually interpret which parts of the input drive the model’s decisions and distinguish its failure modes, regardless of the features used. Our experiments demonstrate that our method enables detailed visual inspections of sequences in a task, and high-level evaluations of a model’s behavior. Code for this work is available at [1].

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/7419e5d7-76e2-4c7a-bdba-22dbc6a0ad36

author

Mitsioni, Ioanna; Mänttäri, Joonatan; Karayiannidis, Yiannis (LU); Folkesson, John and Kragic, Danica

publishing date

2021

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Control Engineering

host publication

2021 IEEE International Conference on Robotics and Automation (ICRA)

pages

7 pages

external identifiers

scopus:85104066830

DOI

10.1109/ICRA48506.2021.9560920

language

English

LU publication?

no

id

7419e5d7-76e2-4c7a-bdba-22dbc6a0ad36

date added to LUP

2022-12-14 15:09:20

date last changed

2025-10-14 11:11:43

@inproceedings{7419e5d7-76e2-4c7a-bdba-22dbc6a0ad36,
  abstract     = {{Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing the kinodynamic images. We propose a methodology that creates images from kinematic and dynamic data of contact-rich manipulation tasks. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train a Convolutional Neural Network (CNN) and we extract interpretations with Grad-CAM to produce visual explanations. Our method is versatile and can be applied to any classification problem in manipulation tasks to visually interpret which parts of the input drive the model’s decisions and distinguish its failure modes, regardless of the features used. Our experiments demonstrate that our method enables detailed visual inspections of sequences in a task, and high-level evaluations of a model’s behavior. Code for this work is available at [1].}},
  author       = {{Mitsioni, Ioanna and Mänttäri, Joonatan and Karayiannidis, Yiannis and Folkesson, John and Kragic, Danica}},
  booktitle    = {{2021 IEEE International Conference on Robotics and Automation (ICRA)}},
  language     = {{eng}},
  pages        = {{10175--10181}},
  title        = {{Interpretability in Contact-Rich Manipulation via Kinodynamic Images}},
  url          = {{http://dx.doi.org/10.1109/ICRA48506.2021.9560920}},
  doi          = {{10.1109/ICRA48506.2021.9560920}},
  year         = {{2021}},
}

Lund University Publications
