Learning of Parameters in Behavior Trees for Movement Skills
(2021) IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021, p. 7572-7572
- Abstract
- Reinforcement Learning (RL) is a powerful mathematical framework that allows robots to learn complex skills by trial-and-error. Despite numerous successes in many applications, RL algorithms still require thousands of trials to converge to high-performing policies, can produce dangerous behaviors while learning, and the optimized policies (usually modeled as neural networks) give almost zero explanation when they fail to perform the task. For these reasons, the adoption of RL in industrial settings is not common. Behavior Trees (BTs), on the other hand, can provide a policy representation that a) supports modular and composable skills, b) allows for easy interpretation of the robot actions, and c) provides an advantageous low-dimensional parameter space. In this paper, we present a novel algorithm that can learn the parameters of a BT policy in simulation and then generalize to the physical robot without any additional training. We leverage a physical simulator with a digital twin of our workstation, and optimize the relevant parameters with a black-box optimizer. We showcase the efficacy of our method with a 7-DOF KUKA iiwa manipulator in a task that includes obstacle avoidance and a contact-rich insertion (peg-in-hole), in which our method outperforms the baselines.
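The abstract describes optimizing the numeric parameters of a fixed BT structure by repeatedly evaluating them in a digital-twin simulation with a black-box optimizer. The sketch below illustrates that loop only in outline; it is not the authors' implementation. All names (`PARAM_BOUNDS`, `rollout_in_simulation`, the specific parameters and bounds) are hypothetical, the reward is a toy stand-in so the snippet runs, and plain random search is used as a placeholder where the paper's keywords suggest Bayesian Optimization would be substituted.

```python
# Minimal sketch: black-box optimization of Behavior Tree parameters in simulation.
# The BT structure is fixed; only numeric parameters are optimized.
# Names and bounds below are illustrative, not taken from the paper.

import random

# Hypothetical parameter space for an avoid-obstacle/insert BT: (low, high) per parameter.
PARAM_BOUNDS = {
    "approach_height": (0.05, 0.30),   # m above the hole before descending
    "insertion_force": (1.0, 15.0),    # N, force threshold for peg-in-hole
    "obstacle_margin": (0.01, 0.10),   # m, clearance kept around obstacles
}

def rollout_in_simulation(params: dict) -> float:
    """Stand-in for one episode in the digital-twin simulator.

    A real implementation would configure the BT with `params`, run the
    simulated robot, and return the task reward. Here a toy score is returned
    so the sketch is runnable.
    """
    target = {"approach_height": 0.12, "insertion_force": 6.0, "obstacle_margin": 0.03}
    return -sum((params[k] - target[k]) ** 2 for k in params)

def sample_params() -> dict:
    """Draw one candidate parameter set uniformly from the bounds."""
    return {k: random.uniform(lo, hi) for k, (lo, hi) in PARAM_BOUNDS.items()}

def optimize(budget: int = 200) -> dict:
    """Plain random search as the black-box optimizer placeholder.

    Any sample-efficient black-box method (e.g. Bayesian Optimization) could be
    swapped in, since only evaluations of rollout_in_simulation are required.
    """
    best_params, best_reward = None, float("-inf")
    for _ in range(budget):
        candidate = sample_params()
        reward = rollout_in_simulation(candidate)
        if reward > best_reward:
            best_params, best_reward = candidate, reward
    return best_params

if __name__ == "__main__":
    print(optimize())
```

Because the BT exposes only a handful of interpretable parameters, the search space stays low-dimensional, which is what makes sample-efficient black-box optimization in simulation practical before transferring the parameters to the physical robot.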
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/091ae119-c2a1-4e1c-9d7c-e20c1ed351f5
- author
- Mayr, Matthias LU; Ahmad, Faseeh LU; Chatzilygeroudis, Konstantinos; Nardi, Luigi LU and Krueger, Volker LU
- organization
- publishing date
- 2021-12-16
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- Reinforcement Learning, Robotics, Skills, Manipulators, Bayesian Optimization, Learning
- host publication
- 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- pages
- 7 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021
- conference location
- Prague, Czech Republic
- conference dates
- 2021-09-27 - 2021-10-01
- external identifiers
- scopus:85124333353
- ISBN
- 978-1-6654-1714-3
- 978-1-6654-1715-0
- DOI
- 10.1109/IROS51168.2021.9636292
- project
- RobotLab LTH
- Efficient Learning of Robot Skills
- Robotics and Semantic Systems
- WASP Package: Bayesian optimization methods and their applications to real-world problems
- WASP Professor Package: Cognitive Robots for Manufacturing
- language
- English
- LU publication?
- yes
- id
- 091ae119-c2a1-4e1c-9d7c-e20c1ed351f5
- date added to LUP
- 2022-01-14 13:58:50
- date last changed
- 2024-04-18 05:14:13
@inproceedings{091ae119-c2a1-4e1c-9d7c-e20c1ed351f5,
  abstract   = {{Reinforcement Learning (RL) is a powerful mathematical framework that allows robots to learn complex skills by trial-and-error. Despite numerous successes in many applications, RL algorithms still require thousands of trials to converge to high-performing policies, can produce dangerous behaviors while learning, and the optimized policies (usually modeled as neural networks) give almost zero explanation when they fail to perform the task. For these reasons, the adoption of RL in industrial settings is not common. Behavior Trees (BTs), on the other hand, can provide a policy representation that a) supports modular and composable skills, b) allows for easy interpretation of the robot actions, and c) provides an advantageous low-dimensional parameter space. In this paper, we present a novel algorithm that can learn the parameters of a BT policy in simulation and then generalize to the physical robot without any additional training. We leverage a physical simulator with a digital twin of our workstation, and optimize the relevant parameters with a black-box optimizer. We showcase the efficacy of our method with a 7-DOF KUKA iiwa manipulator in a task that includes obstacle avoidance and a contact-rich insertion (peg-in-hole), in which our method outperforms the baselines.}},
  author     = {{Mayr, Matthias and Ahmad, Faseeh and Chatzilygeroudis, Konstantinos and Nardi, Luigi and Krueger, Volker}},
  booktitle  = {{2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}},
  isbn       = {{978-1-6654-1714-3}},
  keywords   = {{Reinforcement Learning; Robotics; Skills; Manipulators; Bayesian Optimization; Learning}},
  language   = {{eng}},
  month      = {{12}},
  pages      = {{7572--7572}},
  publisher  = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title      = {{Learning of Parameters in Behavior Trees for Movement Skills}},
  url        = {{http://dx.doi.org/10.1109/IROS51168.2021.9636292}},
  doi        = {{10.1109/IROS51168.2021.9636292}},
  year       = {{2021}},
}