Building Data Classification and Association

Åberg, Anna; Sjölander, Christine

Åberg, Anna and Sjölander, Christine (2018)
Department of Automatic Control

Abstract: Almost half of the energy consumption in the EU originates from heating and cooling of buildings. The European Commission states that smart control of building systems may reduce this energy consumption. Cloud-based smart control and advanced fault-detection systems are becoming more common and should work for any building in the world. These systems need to receive data from the physical buildings, which are commonly managed by a Building Management System (BMS). Today, connecting an advanced control or analysis system to a building's BMS is a largely manual process, which is time consuming and error prone. It would therefore be beneficial if this process could be automated. This thesis aimed to find machine learning methods with the potential to fully or semi-automate the connection process.
By implementing and evaluating models of three machine learning methods (random forest, gradient boosting and neural network), we aimed to find a method capable of labelling time series data into a fixed classification system with a precision of 80% or higher. The solutions were tested on three data sets of different complexity, and we could show that for a set with low complexity it is possible to achieve perfect classification, i.e. an accuracy of 100%. For the more complex sets, accuracy decreased to roughly 60%, so a fully automated solution based on these models would not perform well enough. However, the probability that the correct class was among the top five predictions of the models remained high, and the models could therefore be used in a semi-automated connection process.
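The top-five criterion mentioned above can be illustrated with a short sketch. It is not taken from the thesis: the function name `top_k_accuracy` and the toy scores are assumptions made purely for illustration, and any classifier that outputs per-class scores could supply the input.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose true class index is among the k highest-scoring classes."""
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for sample_scores, label in zip(scores, true_labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical per-class scores for three signals and three classes.
example_scores = [
    [0.1, 0.6, 0.3],
    [0.5, 0.2, 0.3],
    [0.2, 0.3, 0.5],
]
example_labels = [1, 2, 0]

top1 = top_k_accuracy(example_scores, example_labels, k=1)
top2 = top_k_accuracy(example_scores, example_labels, k=2)
```

In a semi-automated workflow of this kind, an operator would pick the correct class from the k suggestions instead of accepting the single top prediction.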
Overfitting was an extensive problem when classifying signals, especially for random forest and gradient boosting models. We believe this is partly due to the data being too homogeneous and the situation could be improved by including data from additional buildings. The problems with overfitting could be seen most clearly when models were trained and tested on data from different buildings. In this case, random forest and gradient boosting models were clearly outperformed by neural network models that still scored about 60% accuracy without any overfitting.
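Training and testing on data from different buildings can be sketched as a leave-one-building-out split. This is only an illustration under assumptions: the function name `leave_one_building_out` and the building identifiers are hypothetical, and the abstract does not state how the thesis partitioned its data.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_building_out(
    building_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_building, train_indices, test_indices) for each building."""
    by_building: Dict[str, List[int]] = defaultdict(list)
    for index, building in enumerate(building_ids):
        by_building[building].append(index)

    for held_out in sorted(by_building):
        test = by_building[held_out]
        # Train on every signal that does not belong to the held-out building.
        train = sorted(
            i
            for building, indices in by_building.items()
            if building != held_out
            for i in indices
        )
        yield held_out, train, test


# Hypothetical building label for each of five signals.
signal_buildings = ["A", "A", "B", "C", "B"]
splits = list(leave_one_building_out(signal_buildings))
```

Because no building contributes signals to both sides of a split, a model that merely memorises building-specific patterns scores poorly on the test side, which makes overfitting visible.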
We also attempted to group signals by equipment type. This was done via support vector machines and a string comparison method. The support vector machine solution could only be deployed on the least complex data set, but performed well there, with an accuracy of over 85%. Implementing this solution on more complex data sets requires more knowledge about the system. The string comparison method showed that much information could be gathered from the correlations in the signal names and paths. Nevertheless, it was hard to draw any general conclusions from this, since data from only one BMS was used. We believe that the string comparison could give good results in combination with other methods.
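The idea of comparing signal names and paths can be sketched with Python's standard `difflib` module. The abstract does not say which string comparison the thesis used, so `SequenceMatcher` is a swapped-in technique, and the signal paths and helper names below are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Sequence


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two signal paths."""
    return SequenceMatcher(None, a, b).ratio()


def most_similar(target: str, candidates: Sequence[str]) -> str:
    """Return the candidate path, other than the target itself, most similar to the target."""
    others = [c for c in candidates if c != target]
    if not others:
        raise ValueError("no other candidates to compare against")
    return max(others, key=lambda c: similarity(target, c))


# Hypothetical BMS signal paths; real naming schemes differ between systems.
signals = ["AHU1/SupplyTemp", "AHU1/ReturnTemp", "Pump3/Speed"]
closest = most_similar("AHU1/SupplyTemp", signals)
```

Signals that share a path prefix, such as the two hypothetical `AHU1` signals, score higher against each other than against a signal from another unit, which is the kind of correlation that could hint at shared equipment.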

- Open Access


Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8954429

author: Åberg, Anna and Sjölander, Christine
organization: Department of Automatic Control
year: 2018
type: H3 - Professional qualifications (4 Years - )
subject: Technology and Engineering
report number: TFRT-6057
ISSN: 0280-5316
language: English
id: 8954429
date added to LUP: 2018-07-11 11:09:07
date last changed: 2018-07-11 11:09:07

@misc{8954429,
author = {{Åberg, Anna and Sjölander, Christine}},
issn = {{0280-5316}},
language = {{eng}},
note = {{Student Paper}},
title = {{Building Data Classification and Association}},
year = {{2018}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
