Scalable Reinforcement Learning for Linear-Quadratic Control of Networks
(2023) Department of Automatic Control
- Abstract
- Distributed optimal control is known to be challenging and can become intractable even for linear-quadratic regulator problems. In this work, we study a special class of such problems where distributed state feedback controllers can give near-optimal performance. More specifically, we consider networked linear-quadratic controllers with decoupled costs and spatially exponentially decaying dynamics. We aim to exploit the structure in the problem to design a scalable reinforcement learning algorithm for learning a distributed controller. Recent work has shown that the optimal controller can be well approximated using only information from a κ-neighbourhood of each agent. Motivated by these results, we show that similar results hold for the agents’ individual value and action-value functions. We continue by designing an algorithm, based on the actor-critic framework, to learn distributed controllers using only local information. Specifically, the action-value function is estimated by modifying the Least Squares Temporal Difference for Q-functions (LSTD-Q) method to use only local information. The algorithm then updates the policy using gradient descent. Finally, the algorithm is evaluated through simulations, which suggest near-optimal performance.
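To make the critic step described in the abstract concrete, below is a minimal, self-contained sketch of a localized LSTD-Q estimate: each agent fits a quadratic Q-function using only states and actions from its κ-neighbourhood, on a toy chain network with decoupled quadratic costs. The chain dynamics, the κ = 1 radius, the gains, and all function names are illustrative assumptions, not the thesis implementation.

```python
# A minimal sketch (assumptions labelled) of the localized LSTD-Q idea:
# each agent estimates a quadratic Q-function from its kappa-hop
# neighbourhood only. The chain network, gains, and noise levels below
# are illustrative; they are not taken from the thesis.
import numpy as np

rng = np.random.default_rng(0)
N, kappa, gamma = 6, 1, 0.95           # agents, neighbourhood radius, discount

# Toy chain network with spatially decaying couplings (assumption).
A = 0.9 * np.eye(N) + 0.1 * (np.eye(N, k=1) + np.eye(N, k=-1))
B = np.eye(N)

K = 0.3 * np.eye(N)                    # initial stabilising, diagonal gain

def neighbourhood(i):
    """Indices within kappa hops of agent i on the chain graph."""
    return list(range(max(0, i - kappa), min(N, i + kappa + 1)))

def phi(z):
    """Quadratic features: upper triangle of z z^T, so Q is quadratic in z."""
    return np.outer(z, z)[np.triu_indices(len(z))]

def rollout(K, T=2000, sigma=0.1):
    """Collect one trajectory under u = -Kx plus exploration noise."""
    xs, us = [], []
    x = rng.normal(size=N)
    for _ in range(T):
        u = -K @ x + sigma * rng.normal(size=N)
        xs.append(x); us.append(u)
        x = A @ x + B @ u + 0.01 * rng.normal(size=N)
    return np.array(xs), np.array(us)

def local_lstdq(i, xs, us, K):
    """LSTD-Q for agent i, restricted to its kappa-neighbourhood.

    Solves Phi theta ~= c_i + gamma * Phi_next theta in least squares,
    where c_i = x_i^2 + u_i^2 is agent i's decoupled stage cost.
    """
    idx = neighbourhood(i)
    c_i = xs[:-1, i] ** 2 + us[:-1, i] ** 2
    Phi, Phi_next = [], []
    for t in range(len(xs) - 1):
        z = np.concatenate([xs[t, idx], us[t, idx]])
        u_next = -K @ xs[t + 1]                    # on-policy next action
        z_next = np.concatenate([xs[t + 1, idx], u_next[idx]])
        Phi.append(phi(z)); Phi_next.append(phi(z_next))
    Phi, Phi_next = np.array(Phi), np.array(Phi_next)
    A_lstd = Phi.T @ (Phi - gamma * Phi_next)
    b = Phi.T @ c_i
    return np.linalg.lstsq(A_lstd, b, rcond=None)[0]

xs, us = rollout(K)
theta_0 = local_lstdq(0, xs, us, K)
print("agent 0 local Q-function weights:", theta_0.round(3))
```

From such a local Q-estimate, the actor step described in the abstract would adjust each agent's local feedback gain by gradient descent; the sketch omits that step and makes no claim about convergence, which the thesis assesses by simulation.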
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9137268
- author
- Olsson, Johan
- supervisor
- Na Li
- Emma Tegling LU
- Anders Rantzer LU
- organization
- year
- 2023
- type
- H3 - Professional qualifications (4 Years - )
- subject
- report number
- TFRT-6207
- other publication id
- 0280-5316
- language
- English
- id
- 9137268
- date added to LUP
- 2023-09-12 14:39:33
- date last changed
- 2023-09-12 14:39:33
@misc{9137268,
  abstract  = {{Distributed optimal control is known to be challenging and can become intractable even for linear-quadratic regulator problems. In this work, we study a special class of such problems where distributed state feedback controllers can give near-optimal performance. More specifically, we consider networked linear-quadratic controllers with decoupled costs and spatially exponentially decaying dynamics. We aim to exploit the structure in the problem to design a scalable reinforcement learning algorithm for learning a distributed controller. Recent work has shown that the optimal controller can be well approximated using only information from a κ-neighbourhood of each agent. Motivated by these results, we show that similar results hold for the agents’ individual value and action-value functions. We continue by designing an algorithm, based on the actor-critic framework, to learn distributed controllers using only local information. Specifically, the action-value function is estimated by modifying the Least Squares Temporal Difference for Q-functions (LSTD-Q) method to use only local information. The algorithm then updates the policy using gradient descent. Finally, the algorithm is evaluated through simulations, which suggest near-optimal performance.}},
  author    = {{Olsson, Johan}},
  language  = {{eng}},
  note      = {{Student Paper}},
  title     = {{Scalable Reinforcement Learning for Linear-Quadratic Control of Networks}},
  year      = {{2023}},
}