Safe reinforcement learning on the constraint manifold: theory and applications
(2025) In IEEE Transactions on Robotics 41. p. 3442-3461
- Abstract
Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.
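The tangent-space construction mentioned in the abstract can be sketched briefly. The idea is that a constraint c(q) = 0 defines the safe manifold, and projecting an arbitrary sampled action onto the null space of the constraint Jacobian J(q) keeps the constraint satisfied to first order. The constraint below (unit-norm configurations) and all names are illustrative stand-ins, not the paper's actual implementation:

```python
import numpy as np

# Hypothetical constraint: unit-norm configurations, c(q) = q.q - 1 = 0
# (a stand-in for the paper's robot safety constraints).
def c(q):
    return np.array([q @ q - 1.0])

def jacobian(q):
    # Jacobian of c with respect to q: dc/dq = 2 q^T
    return 2.0 * q.reshape(1, -1)

def project_to_tangent(q, action):
    """Project an arbitrary action onto the tangent space of {c(q) = 0},
    i.e. the null space of the constraint Jacobian J(q)."""
    J = jacobian(q)
    # a_safe = (I - J^+ J) a leaves c(q) unchanged to first order
    P = np.eye(q.size) - np.linalg.pinv(J) @ J
    return P @ action

q = np.array([1.0, 0.0, 0.0])      # on the manifold: c(q) = 0
a = np.array([0.5, 0.3, -0.2])     # arbitrary sampled action
a_safe = project_to_tangent(q, a)  # → [0.0, 0.3, -0.2]
# First-order safety check: J(q) @ a_safe ≈ 0
print(np.allclose(jacobian(q) @ a_safe, 0.0))  # True
```

Any action the learning agent samples can be passed through such a projection, which is what makes the safe action space admit arbitrary exploration.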
- author
- Liu, Puze; Bou-Ammar, Haitham; Peters, Jan and Tateo, Davide
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Constraint manifold, Robot air hockey, Safe exploration, SafeExp, Safe reinforcement learning, SafeRL
- in
- IEEE Transactions on Robotics
- volume
- 41
- pages
- 3442-3461 (20 pages)
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
-
- scopus:105004591027
- ISSN
- 1552-3098
- DOI
- 10.1109/TRO.2025.3567477
- language
- English
- LU publication?
- no
- id
- 1bfe70f6-37e7-4657-9484-b9bb38f61d32
- date added to LUP
- 2025-10-16 14:10:10
- date last changed
- 2025-11-03 16:08:23
@article{1bfe70f6-37e7-4657-9484-b9bb38f61d32,
abstract = {{<p>Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. Most of the existing approaches rely on training in carefully calibrated simulators before being deployed on real robots, often without real-world fine-tuning. While effective in controlled settings, this framework falls short in applications where precise simulation is unavailable or the environment is too complex to model. Instead, on-robot learning, which learns by interacting directly with the real world, offers a promising alternative. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and nonlinear, making safety challenging to guarantee in learning systems. In this article, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the constraint manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world robot air hockey task, showing that our method can handle high-dimensional tasks with complex constraints.</p>}},
author = {{Liu, Puze and Bou-Ammar, Haitham and Peters, Jan and Tateo, Davide}},
issn = {{1552-3098}},
keywords = {{Constraint manifold; Robot air hockey; Safe exploration; SafeExp; Safe reinforcement learning; SafeRL}},
language = {{eng}},
pages = {{3442--3461}},
publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
series = {{IEEE Transactions on Robotics}},
title = {{Safe reinforcement learning on the constraint manifold: theory and applications}},
url = {{http://dx.doi.org/10.1109/TRO.2025.3567477}},
doi = {{10.1109/TRO.2025.3567477}},
volume = {{41}},
year = {{2025}},
}