Safe reinforcement learning on the constraint manifold: theory and applications
(2025) In IEEE Transactions on Robotics 41. p. 3442-3461
- Abstract
Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.
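The tangent-space construction mentioned in the abstract can be sketched briefly. The idea is that a constraint c(q) = 0 defines the safe manifold, and projecting an arbitrary sampled action onto the null space of the constraint Jacobian J(q) keeps the constraint satisfied to first order. The constraint below (unit-norm configurations) and all names are illustrative stand-ins, not the paper's actual implementation:

```python
import numpy as np

# Hypothetical constraint: unit-norm configurations, c(q) = q.q - 1 = 0
# (a stand-in for the paper's robot safety constraints).
def c(q):
    return np.array([q @ q - 1.0])

def jacobian(q):
    # Jacobian of c with respect to q: dc/dq = 2 q^T
    return 2.0 * q.reshape(1, -1)

def project_to_tangent(q, action):
    """Project an arbitrary action onto the tangent space of {c(q) = 0},
    i.e. the null space of the constraint Jacobian J(q)."""
    J = jacobian(q)
    # a_safe = (I - J^+ J) a leaves c(q) unchanged to first order
    P = np.eye(q.size) - np.linalg.pinv(J) @ J
    return P @ action

q = np.array([1.0, 0.0, 0.0])      # on the manifold: c(q) = 0
a = np.array([0.5, 0.3, -0.2])     # arbitrary sampled action
a_safe = project_to_tangent(q, a)  # → [0.0, 0.3, -0.2]
# First-order safety check: J(q) @ a_safe ≈ 0
print(np.allclose(jacobian(q) @ a_safe, 0.0))  # True
```

Any action the learning agent samples can be passed through such a projection, which is what makes the safe action space admit arbitrary exploration.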
- author
- Liu, Puze; Bou-Ammar, Haitham; Peters, Jan and Tateo, Davide
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Constraint manifold, Robot air hockey, Safe exploration, SafeExp, Safe reinforcement learning, SafeRL
- in
- IEEE Transactions on Robotics
- volume
- 41
- pages
- 3442-3461 (20 pages)
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
-
- scopus:105004591027
- ISSN
- 1552-3098
- DOI
- 10.1109/TRO.2025.3567477
- language
- English
- LU publication?
- no
- id
- 1bfe70f6-37e7-4657-9484-b9bb38f61d32
- date added to LUP
- 2025-10-16 14:10:10
- date last changed
- 2025-11-03 16:08:23
@article{1bfe70f6-37e7-4657-9484-b9bb38f61d32,
abstract = {{<p>Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.</p>}},
author = {{Liu, Puze and Bou-Ammar, Haitham and Peters, Jan and Tateo, Davide}},
issn = {{1552-3098}},
keywords = {{Constraint manifold; Robot air hockey; Safe exploration; SafeExp; Safe reinforcement learning; SafeRL}},
language = {{eng}},
pages = {{3442--3461}},
publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
series = {{IEEE Transactions on Robotics}},
title = {{Safe reinforcement learning on the constraint manifold: theory and applications}},
url = {{http://dx.doi.org/10.1109/TRO.2025.3567477}},
doi = {{10.1109/TRO.2025.3567477}},
volume = {{41}},
year = {{2025}},
}