Robot Reinforcement Learning for Object Isolation

Åstrand, Teodor

Åstrand, Teodor (2024)
Department of Automatic Control

Abstract: This thesis employs deep reinforcement learning, a branch of machine learning, to carry out robotic tasks. The objective is to teach an agent controlling a seven-axis robot arm with a gripper tool to complete an object isolation task, in which the robot manipulates a cluttered environment so that a predetermined target object becomes isolated. Simpler sub-tasks were developed first and then evolved and combined into more complex tasks, with object isolation as the final goal. Agents were trained in a simulated robot learning environment, primarily using a coordinate-based, low-dimensional state space, with reward shaping as the primary tool for teaching a given task.
The reinforcement learning algorithm Proximal Policy Optimization (PPO), implemented with a neural network architecture, was used to train agents for the robotics tasks, with the robot arm’s joint velocities as the action space. Multiple experiments were conducted with agents practicing different tasks, and their performance was evaluated by, among other things, measuring task completion rate and rendering their behavior. Agents developed policies capable of different forms of cube manipulation and of performing cube extraction tasks. Multiple policies for completing robot tasks were learned, and their strategies were evaluated and discussed.
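The abstract names reward shaping and task completion rate but does not give their definitions. As a rough illustration only, a shaped reward for object isolation could grow with the clearance between the target and its nearest neighbor. Everything in the sketch below (the function names, the `isolation_radius` threshold, the `bonus` value, and the use of planar coordinates) is a hypothetical assumption, not the thesis's actual reward or evaluation code.

```python
import math


def min_clearance(target, others):
    """Smallest Euclidean distance from the target to any other object."""
    return min(math.dist(target, other) for other in others)


def shaped_reward(target, others, isolation_radius=0.10, bonus=1.0):
    """Hypothetical dense reward for object isolation.

    The reward rises linearly from 0 to 1 as the nearest object moves
    away from the target, and a bonus is added once the clearance
    reaches isolation_radius (an assumed threshold, in meters).
    """
    clearance = min_clearance(target, others)
    reward = min(clearance, isolation_radius) / isolation_radius
    if clearance >= isolation_radius:
        reward += bonus
    return reward


def completion_rate(outcomes):
    """Fraction of evaluation episodes in which the task was completed."""
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Target at the origin, nearest cube 5 cm away: partial reward only.
    print(shaped_reward((0.0, 0.0), [(0.05, 0.0), (0.30, 0.10)]))
    # Three of four evaluation episodes succeeded.
    print(completion_rate([True, False, True, True]))
```

A dense term of this kind gives the agent a learning signal before it ever fully isolates the target, which is the usual motivation for shaping a sparse success reward.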

- Open Access

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9174579

author: Åstrand, Teodor
supervisor: Yiannis Karayiannidis (LU); Björn Olofsson (LU)
organization: Department of Automatic Control
year: 2024
type: H3 - Professional qualifications (4 Years - )
subject: Technology and Engineering
report number: TFRT-6255
other publication id: 0280-5316
language: English
id: 9174579
date added to LUP: 2024-09-16 08:50:14
date last changed: 2024-09-16 08:50:14

@misc{9174579,
  abstract     = {{This thesis employs deep reinforcement learning, a branch of machine learning, to carry out robotic tasks. The objective is to teach an agent controlling a seven-axis robot arm with a gripper tool to complete an object isolation task, in which the robot manipulates a cluttered environment so that a predetermined target object becomes isolated. Simpler sub-tasks were developed first and then evolved and combined into more complex tasks, with object isolation as the final goal. Agents were trained in a simulated robot learning environment, primarily using a coordinate-based, low-dimensional state space, with reward shaping as the primary tool for teaching a given task.
 The reinforcement learning algorithm Proximal Policy Optimization (PPO), implemented with a neural network architecture, was used to train agents for the robotics tasks, with the robot arm’s joint velocities as the action space. Multiple experiments were conducted with agents practicing different tasks, and their performance was evaluated by, among other things, measuring task completion rate and rendering their behavior. Agents developed policies capable of different forms of cube manipulation and of performing cube extraction tasks. Multiple policies for completing robot tasks were learned, and their strategies were evaluated and discussed.}},
  author       = {{Åstrand, Teodor}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Robot Reinforcement Learning for Object Isolation}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
