
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Distributed Reinforcement Learning for Building Energy Optimization

Javeed, Arshad (2024)
Department of Automatic Control
Abstract
Heating, ventilation, and air-conditioning (HVAC) systems are ubiquitous and are among the major contributors to energy consumption in a typical building. In light of the imminent energy crisis, there is an increasing demand to revisit building systems and save as much energy as possible. In this regard, the scope of the thesis is to explore opportunities for data-driven optimization of HVAC systems. Traditionally, optimization of HVAC systems has relied on offline scheduling of a set of optimal controls, an approach that requires extensive domain expertise. Another drawback is that as the system changes over time, these controls go out of tune and need to be re-adapted. Data-driven optimization approaches such as reinforcement learning (RL) are therefore more appealing, since they can adapt online and can model and solve complex problems.
A primary objective of the thesis is to carry out exploratory data analysis experiments to quantify the savings potential and expose the optimization space for RL. The main objective is to explore distributed reinforcement learning via multi-agent RL (MARL) strategies and to compare the pros and cons of MARL with single-agent RL. The work benchmarks two popular contemporary MARL strategies, centralized training with decentralized execution and value mixing, and proposes two novel MARL enhancements for HVAC systems: a linear value-mixing strategy (inspired by Q-function mixing, QMIX) and turn-based games, which attempt to alleviate some of the problems of multi-agent credit assignment and non-stationarity surrounding MARL.
The experimental results cover the learning performance of the various RL strategies and benchmarks against a closed-loop controller under realistic conditions. They show that the RL strategies perform significantly better than the closed-loop controller (with a few exceptions), achieving power savings of up to 15% in year-long simulations with live weather profiles. The results also highlight the trade-off between optimality and sample efficiency, corroborating the common expectation about MARL: single-agent RL attains better optimality, while the MARL approaches learn faster.
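To make the linear value-mixing idea mentioned in the abstract concrete, the following is a minimal sketch, not the thesis implementation; the function name linear_mix, the example agent Q-values, and the weights are hypothetical. It assumes each agent's utility is combined into a joint value through non-negative weights, in the spirit of QMIX's monotonic mixing, but with a purely linear mixer.

# Minimal sketch of linear value mixing (hypothetical illustration, not the thesis code).
# Each agent i reports a utility Q_i for its chosen action; a linear mixer with
# non-negative weights combines them into a joint value Q_tot. Non-negative weights
# keep Q_tot monotonic in every Q_i, so per-agent greedy actions stay consistent
# with the joint greedy action, which is the core idea behind QMIX-style mixing.

import numpy as np

def linear_mix(agent_qs: np.ndarray, weights: np.ndarray, bias: float = 0.0) -> float:
    """Combine per-agent Q-values into a joint value using non-negative weights."""
    w = np.abs(weights)  # enforce non-negativity of the mixing weights
    return float(w @ agent_qs + bias)

# Example: three HVAC zone agents, each reporting the Q-value of its selected action.
agent_qs = np.array([1.2, -0.4, 0.7])
weights = np.array([0.5, 0.3, 0.2])
print(linear_mix(agent_qs, weights))  # joint value used as a centralized training target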
author: Javeed, Arshad
supervisor:
organization:
year: 2024
type: H3 - Professional qualifications (4 Years - )
subject:
report number: TFRT-6242
other publication id: 0280-5316
language: English
id: 9174186
date added to LUP: 2024-09-13 13:19:54
date last changed: 2024-09-13 13:19:54
@misc{9174186,
  abstract     = {{Heating, ventilation, and air-conditioning (HVAC) systems are ubiquitous and are among the major contributors to energy consumption in a typical building. In light of the imminent energy crisis, there is an increasing demand to revisit building systems and save as much energy as possible. In this regard, the scope of the thesis is to explore opportunities for data-driven optimization of HVAC systems. Traditionally, optimization of HVAC systems has relied on offline scheduling of a set of optimal controls, an approach that requires extensive domain expertise. Another drawback is that as the system changes over time, these controls go out of tune and need to be re-adapted. Data-driven optimization approaches such as reinforcement learning (RL) are therefore more appealing, since they can adapt online and can model and solve complex problems.
 A primary objective of the thesis is to carry out exploratory data analysis experiments to quantify the savings potential and expose the optimization space for RL. The main objective is to explore distributed reinforcement learning via multi-agent RL (MARL) strategies and to compare the pros and cons of MARL with single-agent RL. The work benchmarks two popular contemporary MARL strategies, centralized training with decentralized execution and value mixing, and proposes two novel MARL enhancements for HVAC systems: a linear value-mixing strategy (inspired by Q-function mixing, QMIX) and turn-based games, which attempt to alleviate some of the problems of multi-agent credit assignment and non-stationarity surrounding MARL.
 The experimental results cover the learning performance of the various RL strategies and benchmarks against a closed-loop controller under realistic conditions. They show that the RL strategies perform significantly better than the closed-loop controller (with a few exceptions), achieving power savings of up to 15% in year-long simulations with live weather profiles. The results also highlight the trade-off between optimality and sample efficiency, corroborating the common expectation about MARL: single-agent RL attains better optimality, while the MARL approaches learn faster.}},
  author       = {{Javeed, Arshad}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Distributed Reinforcement Learning for Building Energy Optimization}},
  year         = {{2024}},
}