Lund University Publications

LUND UNIVERSITY LIBRARIES

Handling long-term safety and uncertainty in safe reinforcement learning

Günster, Jonas ; Liu, Puze ; Peters, Jan and Tateo, Davide (2024) 8th Conference on Robot Learning, CoRL 2024. In Proceedings of Machine Learning Research 270. p.4670-4697
Abstract

Safety is one of the key issues preventing the deployment of reinforcement learning techniques in real-world robots. While most approaches in the Safe Reinforcement Learning area do not require prior knowledge of constraints and robot kinematics and rely solely on data, it is often difficult to deploy them in complex real-world settings. Instead, model-based approaches that incorporate prior knowledge of the constraints and dynamics into the learning framework have proven capable of deploying the learning algorithm directly on the real robot. Unfortunately, while an approximated model of the robot dynamics is often available, the safety constraints are task-specific and hard to obtain: they may be too complicated to encode analytically, too expensive to compute, or it may be difficult to envision a priori the long-term safety requirements. In this paper, we bridge this gap by extending the safe exploration method, ATACOM, with learnable constraints, with a particular focus on ensuring long-term safety and handling of uncertainty. Our approach is competitive or superior to state-of-the-art methods in final performance while maintaining safer behavior during training.
author
Günster, Jonas ; Liu, Puze ; Peters, Jan and Tateo, Davide
publishing date
2024
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
keywords
Chance constraints, Distributional RL, Safe reinforcement learning
host publication
8th Conference on Robot Learning, CoRL 2024
series title
Proceedings of Machine Learning Research
volume
270
pages
28 pages
conference name
8th Conference on Robot Learning, CoRL 2024
conference location
Munich, Germany
conference dates
2024-11-06 - 2024-11-09
external identifiers
  • scopus:86000761633
ISSN
2640-3498
language
English
LU publication?
no
id
3b06c0c6-7807-49fe-a713-49592852c88a
date added to LUP
2025-10-16 14:07:43
date last changed
2025-10-17 12:18:23
@inproceedings{3b06c0c6-7807-49fe-a713-49592852c88a,
  abstract     = {{Safety is one of the key issues preventing the deployment of reinforcement learning techniques in real-world robots. While most approaches in the Safe Reinforcement Learning area do not require prior knowledge of constraints and robot kinematics and rely solely on data, it is often difficult to deploy them in complex real-world settings. Instead, model-based approaches that incorporate prior knowledge of the constraints and dynamics into the learning framework have proven capable of deploying the learning algorithm directly on the real robot. Unfortunately, while an approximated model of the robot dynamics is often available, the safety constraints are task-specific and hard to obtain: they may be too complicated to encode analytically, too expensive to compute, or it may be difficult to envision a priori the long-term safety requirements. In this paper, we bridge this gap by extending the safe exploration method, ATACOM, with learnable constraints, with a particular focus on ensuring long-term safety and handling of uncertainty. Our approach is competitive or superior to state-of-the-art methods in final performance while maintaining safer behavior during training.}},
  author       = {{Günster, Jonas and Liu, Puze and Peters, Jan and Tateo, Davide}},
  booktitle    = {{8th Conference on Robot Learning, CoRL 2024}},
  issn         = {{2640-3498}},
  keywords     = {{Chance constraints; Distributional RL; Safe reinforcement learning}},
  language     = {{eng}},
  pages        = {{4670--4697}},
  series       = {{Proceedings of Machine Learning Research}},
  title        = {{Handling long-term safety and uncertainty in safe reinforcement learning}},
  volume       = {{270}},
  year         = {{2024}},
}