Lund University Publications

Safe reinforcement learning on the constraint manifold: theory and applications

Liu, Puze; Bou-Ammar, Haitham; Peters, Jan and Tateo, Davide (2025) In IEEE Transactions on Robotics 41. p. 3442-3461
Abstract

Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.
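The core idea described in the abstract — sampling arbitrary actions in the tangent space of the constraint manifold so that, to first order, the constraint c(q) = 0 stays satisfied — can be sketched with a null-space projection. The toy constraint below (staying on the unit circle), the function names, and the projection formula v_safe = (I − J⁺J)v are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def constraint(q):
    # Toy constraint manifold: c(q) = ||q||^2 - 1 = 0, i.e. the unit circle.
    return float(q @ q) - 1.0

def constraint_jacobian(q):
    # J(q) = dc/dq = 2q, as a 1 x n row vector.
    return 2.0 * q.reshape(1, -1)

def safe_velocity(q, v_desired):
    # Project an arbitrary sampled action (velocity) onto the tangent space
    # of the manifold: v_safe = (I - J^+ J) v_desired, with J^+ the
    # Moore-Penrose pseudoinverse. Velocities in the null space of J leave
    # c(q) unchanged to first order, so exploration stays on the manifold.
    J = constraint_jacobian(q)
    N = np.eye(q.size) - np.linalg.pinv(J) @ J
    return N @ v_desired

q = np.array([1.0, 0.0])   # a configuration on the manifold, c(q) = 0
v = np.array([0.3, 0.7])   # arbitrary action sampled by the learning agent
v_safe = safe_velocity(q, v)
drift = constraint_jacobian(q) @ v_safe  # first-order constraint drift, ~0
```

At q = (1, 0) the tangent space is the vertical direction, so the projection discards the radial component of the action while preserving the tangential one; real robot constraints (joint limits, obstacle distances) would replace the toy c(q).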

author
Liu, Puze; Bou-Ammar, Haitham; Peters, Jan and Tateo, Davide
publishing date
2025
type
Contribution to journal
publication status
published
subject
keywords
Constraint manifold, Robot air hockey, Safe exploration, SafeExp, Safe reinforcement learning, SafeRL
in
IEEE Transactions on Robotics
volume
41
pages
20 pages
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • scopus:105004591027
ISSN
1552-3098
DOI
10.1109/TRO.2025.3567477
language
English
LU publication?
no
id
1bfe70f6-37e7-4657-9484-b9bb38f61d32
date added to LUP
2025-10-16 14:10:10
date last changed
2025-11-03 16:08:23
@article{1bfe70f6-37e7-4657-9484-b9bb38f61d32,
  abstract     = {{<p>Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.</p>}},
  author       = {{Liu, Puze and Bou-Ammar, Haitham and Peters, Jan and Tateo, Davide}},
  issn         = {{1552-3098}},
  keywords     = {{Constraint manifold; Robot air hockey; Safe exploration; SafeExp; Safe reinforcement learning; SafeRL}},
  language     = {{eng}},
  pages        = {{3442--3461}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series       = {{IEEE Transactions on Robotics}},
  title        = {{Safe reinforcement learning on the constraint manifold : theory and applications}},
  url          = {{http://dx.doi.org/10.1109/TRO.2025.3567477}},
  doi          = {{10.1109/TRO.2025.3567477}},
  volume       = {{41}},
  year         = {{2025}},
}