Distributed Reinforcement Learning for Building Energy Optimization
(2024) Department of Automatic Control
- Abstract
- Heating, ventilation, and air-conditioning (HVAC) systems are ubiquitous and are among the major contributors to energy consumption in a typical building. In light of the imminent energy crisis, there is an increasing demand to revisit building systems and save as much energy as possible. In this regard, the scope of the thesis is to explore opportunities for data-driven optimization of HVAC systems. Traditionally, optimization in HVAC systems has relied on offline methods, in which a set of optimal controls is scheduled in advance, but such approaches require extensive domain expertise. Another drawback is that as the system changes over time, these controls go out of tune and need to be adapted. Thus, data-driven optimization approaches (such as reinforcement learning) appear more appealing due to their ability to adapt online and to model and solve complex problems.
A primary objective of the thesis is to carry out exploratory data analysis experiments to quantify the savings potential and expose the optimization space for reinforcement learning (RL). The main objective is to explore distributed reinforcement learning via multi-agent RL (MARL) strategies and to compare and contrast the pros and cons of MARL against single-agent RL. The work benchmarks two popular contemporary MARL strategies, centralized training with decentralized execution and value-mixing approaches, and proposes two novel MARL enhancements for HVAC systems: a linear value-mixing strategy (inspired by Q-function mixing, QMIX) and turn-based games, which attempt to alleviate some of the problems of multi-agent credit assignment and non-stationarity surrounding MARL.
The experimental results include the learning performance of the various RL strategies and performance benchmarks against a closed-loop controller under realistic conditions. The results reveal that the RL strategies perform significantly better than the closed-loop controller (with a few exceptions), achieving power savings of up to 15% in year-long simulations with live weather profiles. The results also highlight the trade-offs between optimality and sample efficiency, corroborating the common view of MARL: single-agent RL performs better in terms of optimality, while the MARL approaches learn faster.
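The linear value-mixing idea named in the abstract can be sketched informally. Nothing below reproduces the thesis's implementation; the function, weights, and Q-values are illustrative assumptions. The point of a linear mixer with non-negative weights is that it keeps the monotonicity property QMIX relies on: the greedy joint action decomposes into each agent's local greedy action.

```python
import numpy as np

def linear_value_mix(agent_qs, weights, bias=0.0):
    """Combine per-agent Q-values into a joint value (illustrative sketch).

    With non-negative weights the mixer is monotone in each agent's Q-value,
    so maximizing the joint value reduces to per-agent greedy action selection.
    """
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0), "non-negative weights preserve monotonicity"
    return float(np.dot(weights, agent_qs) + bias)

# Two hypothetical agents, each with Q-values for two local actions
q_agent1 = np.array([1.0, 3.0])
q_agent2 = np.array([2.0, 0.5])

# The greedy joint action decomposes into per-agent greedy actions
a1, a2 = int(q_agent1.argmax()), int(q_agent2.argmax())
q_tot = linear_value_mix([q_agent1[a1], q_agent2[a2]], weights=[0.6, 0.4])
```

In a trained system the weights (and bias) would themselves be learned, e.g. conditioned on a global state, rather than fixed as here.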
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9174186
- author
- Javeed, Arshad
- supervisor
- organization
- year
- 2024
- type
- H3 - Professional qualifications (4 Years - )
- subject
- report number
- TFRT-6242
- other publication id
- 0280-5316
- language
- English
- id
- 9174186
- date added to LUP
- 2024-09-13 13:19:54
- date last changed
- 2024-09-13 13:19:54
@misc{9174186,
  abstract = {{Heating, ventilation, and air-conditioning (HVAC) systems are ubiquitous and are among the major contributors to energy consumption in a typical building. In light of the imminent energy crisis, there is an increasing demand to revisit building systems and save as much energy as possible. In this regard, the scope of the thesis is to explore opportunities for data-driven optimization of HVAC systems. Traditionally, optimization in HVAC systems has relied on offline methods, in which a set of optimal controls is scheduled in advance, but such approaches require extensive domain expertise. Another drawback is that as the system changes over time, these controls go out of tune and need to be adapted. Thus, data-driven optimization approaches (such as reinforcement learning) appear more appealing due to their ability to adapt online and to model and solve complex problems. A primary objective of the thesis is to carry out exploratory data analysis experiments to quantify the savings potential and expose the optimization space for reinforcement learning (RL). The main objective is to explore distributed reinforcement learning via multi-agent RL (MARL) strategies and to compare and contrast the pros and cons of MARL against single-agent RL. The work benchmarks two popular contemporary MARL strategies, centralized training with decentralized execution and value-mixing approaches, and proposes two novel MARL enhancements for HVAC systems: a linear value-mixing strategy (inspired by Q-function mixing, QMIX) and turn-based games, which attempt to alleviate some of the problems of multi-agent credit assignment and non-stationarity surrounding MARL. The experimental results include the learning performance of the various RL strategies and performance benchmarks against a closed-loop controller under realistic conditions. The results reveal that the RL strategies perform significantly better than the closed-loop controller (with a few exceptions), achieving power savings of up to 15% in year-long simulations with live weather profiles. The results also highlight the trade-offs between optimality and sample efficiency, corroborating the common view of MARL: single-agent RL performs better in terms of optimality, while the MARL approaches learn faster.}},
  author   = {{Javeed, Arshad}},
  language = {{eng}},
  note     = {{Student Paper}},
  title    = {{Distributed Reinforcement Learning for Building Energy Optimization}},
  year     = {{2024}},
}