Scalable Reinforcement Learning for Linear-Quadratic Control of Networks
(2024) 2024 American Control Conference, ACC 2024. In: Proceedings of the American Control Conference, pp. 1813-1818
- Abstract
Distributed optimal control is known to be challenging and can become intractable even for linear-quadratic regulator problems. In this work, we study a special class of such problems where distributed state feedback controllers can give near-optimal performance. More specifically, we consider networked linear-quadratic controllers with decoupled costs and spatially exponentially decaying dynamics. We aim to exploit the structure in the problem to design a scalable reinforcement learning algorithm for learning a distributed controller. Recent work has shown that the optimal controller can be well approximated using only information from a κ-neighborhood of each agent. Motivated by these results, we show that similar results hold for the agents' individual value and Q-functions. We continue by designing an algorithm, based on the actor-critic framework, to learn distributed controllers using only local information. Specifically, the Q-function is estimated by modifying the Least Squares Temporal Difference for Q-functions (LSTD-Q) method to use only local information. The algorithm then updates the policy using gradient descent. Finally, we evaluate the algorithm through simulations that indeed suggest near-optimal performance.
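To make the abstract's description concrete, the following is a minimal, centralized Python sketch of the two building blocks it mentions: LSTD-Q policy evaluation of a quadratic Q-function for a linear policy u = -Kx, followed by one policy-gradient update of the gain. This is an illustration only, not the paper's algorithm: the paper restricts each agent's features and updates to its κ-neighborhood, which is not done here, and all system matrices, step sizes, and helper names (quad_features, lstdq, policy_gradient_step) are hypothetical.

```python
import numpy as np

def quad_features(z):
    """Symmetric quadratic features: phi(z) parameterizes Q(z) = z^T H z."""
    n = z.size
    outer = np.outer(z, z)
    idx = np.triu_indices(n)
    scale = np.where(idx[0] == idx[1], 1.0, 2.0)  # count each off-diagonal pair once
    return scale * outer[idx]

def lstdq(states, inputs, costs, next_states, K, gamma=0.95):
    """Estimate the quadratic Q-function of the policy u = -K x via LSTD-Q."""
    n, m = states.shape[1], inputs.shape[1]
    d = (n + m) * (n + m + 1) // 2
    A_mat, b_vec = np.zeros((d, d)), np.zeros(d)
    for x, u, c, xp in zip(states, inputs, costs, next_states):
        phi = quad_features(np.concatenate([x, u]))
        phi_next = quad_features(np.concatenate([xp, -K @ xp]))
        A_mat += np.outer(phi, phi - gamma * phi_next)   # TD fixed-point equations
        b_vec += phi * c
    theta = np.linalg.lstsq(A_mat, b_vec, rcond=None)[0]
    # Rebuild the symmetric matrix H with Q(x, u) = [x; u]^T H [x; u]
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    return H + H.T - np.diag(np.diag(H))

def policy_gradient_step(H, K, eta=0.1):
    """One gradient step on K using the estimated Q-function blocks."""
    n = K.shape[1]
    H_uu, H_ux = H[n:, n:], H[n:, :n]
    # Gradient of E[Q(x, -Kx)] w.r.t. K, up to the state covariance factor
    grad = 2.0 * (H_uu @ K - H_ux)
    return K - eta * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy 3-node chain system (illustrative values, not from the paper)
    A = np.array([[0.6, 0.2, 0.0],
                  [0.2, 0.6, 0.2],
                  [0.0, 0.2, 0.6]])
    B = np.eye(3)
    Qc, Rc = np.eye(3), np.eye(3)
    K = np.zeros((3, 3))                                # initial policy u = -K x
    x = rng.standard_normal(3)
    data = {"x": [], "u": [], "c": [], "xp": []}
    for _ in range(2000):
        u = -K @ x + 0.5 * rng.standard_normal(3)       # exploration noise
        c = x @ Qc @ x + u @ Rc @ u
        xp = A @ x + B @ u
        for key, val in zip(data, (x, u, c, xp)):
            data[key].append(val)
        x = xp
    H = lstdq(*(np.array(data[k]) for k in ("x", "u", "c", "xp")), K)
    K = policy_gradient_step(H, K)
    print("Updated gain K:\n", K)
```

In the localized variant sketched in the abstract, each agent would build its features only from the states and inputs of its κ-neighborhood and update only its own block of the gain, so the per-agent computation does not grow with the network size.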
- author
- Olsson, Johan; Zhang, Runyu (Cathy); Tegling, Emma (LU) and Li, Na (LU)
- organization
- publishing date
- 2024
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- Proceedings of the American Control Conference
- series title
- Proceedings of the American Control Conference
- pages
- 6 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 2024 American Control Conference, ACC 2024
- conference location
- Toronto, Canada
- conference dates
- 2024-07-10 - 2024-07-12
- external identifiers
- scopus:85204427546
- ISSN
- 0743-1619
- ISBN
- 9798350382655
- DOI
- 10.23919/ACC60939.2024.10644413
- language
- English
- LU publication?
- yes
- id
- bdedde28-6d49-4dae-afaa-11dfdbdffe8b
- date added to LUP
- 2024-11-27 13:46:54
- date last changed
- 2025-04-04 15:38:01
@inproceedings{bdedde28-6d49-4dae-afaa-11dfdbdffe8b,
  abstract  = {{Distributed optimal control is known to be challenging and can become intractable even for linear-quadratic regulator problems. In this work, we study a special class of such problems where distributed state feedback controllers can give near-optimal performance. More specifically, we consider networked linear-quadratic controllers with decoupled costs and spatially exponentially decaying dynamics. We aim to exploit the structure in the problem to design a scalable reinforcement learning algorithm for learning a distributed controller. Recent work has shown that the optimal controller can be well approximated using only information from a κ-neighborhood of each agent. Motivated by these results, we show that similar results hold for the agents' individual value and Q-functions. We continue by designing an algorithm, based on the actor-critic framework, to learn distributed controllers using only local information. Specifically, the Q-function is estimated by modifying the Least Squares Temporal Difference for Q-functions method to use only local information. The algorithm then updates the policy using gradient descent. Finally, we evaluate the algorithm through simulations that indeed suggest near-optimal performance.}},
  author    = {{Olsson, Johan and Zhang, Runyu(Cathy) and Tegling, Emma and Li, Na}},
  booktitle = {{Proceedings of the American Control Conference}},
  isbn      = {{9798350382655}},
  issn      = {{0743-1619}},
  language  = {{eng}},
  pages     = {{1813--1818}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series    = {{Proceedings of the American Control Conference}},
  title     = {{Scalable Reinforcement Learning for Linear-Quadratic Control of Networks}},
  url       = {{http://dx.doi.org/10.23919/ACC60939.2024.10644413}},
  doi       = {{10.23919/ACC60939.2024.10644413}},
  year      = {{2024}},
}