Learning Optimal Team-Decisions

Kjellqvist, Olle; Gattami, Ather

Kjellqvist, Olle and Gattami, Ather (2022) 61st IEEE Conference on Decision and Control, CDC 2022. In Proceedings of the IEEE Conference on Decision and Control 2022-December, p. 1441-1446

Abstract: In this paper, we study linear quadratic team decision problems, where a team of agents minimizes a convex quadratic cost function over T time steps subject to possibly distinct linear measurements of the state of nature. We assume that the state of nature is a Gaussian random variable and that the agents know neither the cost function nor the linear functions mapping the state of nature to their measurements. We present a gradient-descent-based algorithm with an expected regret of O(log(T)) for full-information gradient feedback and O(√(T)) for bandit feedback. In the case of bandit feedback, the expected regret has an additional multiplicative term O(d), where d reflects the number of learned parameters.
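
The gradient-feedback setting described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the two-agent cost, the measurement gains `c`, the step-size schedule, and the function name `run_team_sgd` are all assumptions chosen for illustration. Each agent applies a linear policy to its own measurement and adjusts its gain using only the gradient of the realized cost with respect to its own decision.

```python
import random

def run_team_sgd(steps=5000, seed=0):
    """Toy two-agent team problem with gradient feedback (illustrative only)."""
    rng = random.Random(seed)
    c = [1.0, 0.5]   # hypothetical measurement gains, never read by the update rule
    k = [0.0, 0.0]   # learned policy gains: agent i plays u_i = k_i * y_i
    costs = []
    for t in range(1, steps + 1):
        x = rng.gauss(0.0, 1.0)               # Gaussian state of nature
        y = [c[i] * x for i in range(2)]      # distinct linear measurements
        u = [k[i] * y[i] for i in range(2)]   # decentralized decisions
        err = u[0] + u[1] - x
        # Assumed convex quadratic cost: tracking error plus a small effort penalty.
        costs.append(err ** 2 + 0.1 * (u[0] ** 2 + u[1] ** 2))
        eta = 1.0 / (20.0 + t)                # decaying step size (an assumption)
        for i in range(2):
            grad_u = 2.0 * err + 0.2 * u[i]   # gradient feedback dJ/du_i
            k[i] -= eta * grad_u * y[i]       # chain rule: du_i/dk_i = y_i
    return k, costs

gains, costs = run_team_sgd()
print("learned gains:", gains)
print("mean cost, last 1000 steps:", sum(costs[-1000:]) / 1000)
```

With both gains at zero the expected cost in this toy problem equals the variance of the state, which is 1, so a late-run average well below 1 indicates that the agents have learned useful policies without knowing `c` or the cost coefficients.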

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/a224e208-58a3-4e04-b1af-100b1557c40f

author

Kjellqvist, Olle (LU)

and Gattami, Ather (LU)

organization

publishing date

2022

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Control Engineering

host publication

2022 IEEE 61st Conference on Decision and Control, CDC 2022

series title

Proceedings of the IEEE Conference on Decision and Control

volume

2022-December

pages

6 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

61st IEEE Conference on Decision and Control, CDC 2022

conference location

Cancun, Mexico

conference dates

2022-12-06 - 2022-12-09

external identifiers

scopus:85147018169

ISSN

2576-2370

0743-1546

ISBN

9781665467612

DOI

10.1109/CDC51059.2022.9992786

language

English

LU publication?

yes

id

a224e208-58a3-4e04-b1af-100b1557c40f

date added to LUP

2023-02-14 11:38:18

date last changed

2025-10-14 09:47:38

@inproceedings{a224e208-58a3-4e04-b1af-100b1557c40f,
  abstract     = {{In this paper, we study linear quadratic team decision problems, where a team of agents minimizes a convex quadratic cost function over T time steps subject to possibly distinct linear measurements of the state of nature. We assume that the state of nature is a Gaussian random variable and that the agents know neither the cost function nor the linear functions mapping the state of nature to their measurements. We present a gradient-descent-based algorithm with an expected regret of O(log(T)) for full-information gradient feedback and O(√(T)) for bandit feedback. In the case of bandit feedback, the expected regret has an additional multiplicative term O(d), where d reflects the number of learned parameters.}},
  author       = {{Kjellqvist, Olle and Gattami, Ather}},
  booktitle    = {{2022 IEEE 61st Conference on Decision and Control, CDC 2022}},
  isbn         = {{9781665467612}},
  issn         = {{2576-2370}},
  language     = {{eng}},
  pages        = {{1441--1446}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series       = {{Proceedings of the IEEE Conference on Decision and Control}},
  title        = {{Learning Optimal Team-Decisions}},
  url          = {{http://dx.doi.org/10.1109/CDC51059.2022.9992786}},
  doi          = {{10.1109/CDC51059.2022.9992786}},
  volume       = {{2022-December}},
  year         = {{2022}},
}
