Aggregating and utilizing sensor data

Vilensten, Maximillian; Hermansson, Oskar

Vilensten, Maximillian (LU) and Hermansson, Oskar (2020) In CODEN:LUTEDX/TEIE EIEM01 20192
Division for Industrial Electrical Engineering and Automation

Abstract: The world of data is changing fast. While data has always been present in computer science, its importance has grown over the last couple of years, and that growth does not seem to be slowing down. Data usage is steadily increasing and the amount of data is growing, as is the ever-expanding number of devices that produce data, especially in fields such as IoT and cloud computing. This has resulted in a need to make sure that one's data storage solution can keep up effectively without being too much of a hassle to maintain and upgrade. Data warehouses have long been a popular choice for data storage, partly due to their rigid structure. But in the world of big data, this structure can make it hard to adapt and add new types of data to the storage. A data lake can be a viable data storage solution if one seeks to store many types of data from multiple sources, since its open structure enables the user to store practically anything. This open structure, however, comes at a cost, since the user now has to manage raw data instead of the traditionally processed data of a data warehouse.
Managing raw and unprocessed data is challenging, as the purpose of the data may not be determined when the data is stored. This often leads to all data stored in a data lake remaining there without any form of structure or purpose, which quickly turns the data lake into a data swamp. This thesis seeks to create a data lake solution that can store data from multiple sources and from different sensors, while making it very easy to add or remove data sources and remaining resilient if the data format changes from its original form, all while trying to keep the infamous data swamp away. This involves sending the data from the sensors to a storage solution and then making sure that the stored data can be utilized by the data lake's owner as well as by a third party.
Popular Abstract: The amount of data is steadily increasing, and with it comes data from new sources and in new formats. The ever-changing state of data makes storing it challenging, as most storage options only accept data in a format specified when it is stored. This thesis suggests a solution to this for a specific use case.

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9023492

author

Vilensten, Maximillian (LU) and Hermansson, Oskar

supervisor

Mats Lilja (LU)
Christian Nyberg (LU)

organization

Division for Industrial Electrical Engineering and Automation

course

EIEM01 20192

year

2020

type

H3 - Professional qualifications (4 Years - )

subject

Technology and Engineering

keywords

IoT, Data lake, Axis, Message broker, Cloud, Microsoft Azure, AWS, IBM cloud, Apache Kafka, Apache Pulsar

publication/series

CODEN:LUTEDX/TEIE

report number

5446

language

English

id

9023492

date added to LUP

2021-04-27 14:13:20

date last changed

2021-04-27 14:13:20

@misc{9023492,
  abstract     = {{The world of data is changing fast. While data has always been present in computer science, its importance has grown over the last couple of years, and that growth does not seem to be slowing down. Data usage is steadily increasing and the amount of data is growing, as is the ever-expanding number of devices that produce data, especially in fields such as IoT and cloud computing. This has resulted in a need to make sure that one's data storage solution can keep up effectively without being too much of a hassle to maintain and upgrade. Data warehouses have long been a popular choice for data storage, partly due to their rigid structure. But in the world of big data, this structure can make it hard to adapt and add new types of data to the storage. A data lake can be a viable data storage solution if one seeks to store many types of data from multiple sources, since its open structure enables the user to store practically anything. This open structure, however, comes at a cost, since the user now has to manage raw data instead of the traditionally processed data of a data warehouse. Managing raw and unprocessed data is challenging, as the purpose of the data may not be determined when the data is stored. This often leads to all data stored in a data lake remaining there without any form of structure or purpose, which quickly turns the data lake into a data swamp. This thesis seeks to create a data lake solution that can store data from multiple sources and from different sensors, while making it very easy to add or remove data sources and remaining resilient if the data format changes from its original form, all while trying to keep the infamous data swamp away. This involves sending the data from the sensors to a storage solution and then making sure that the stored data can be utilized by the data lake's owner as well as by a third party.}},
  author       = {{Vilensten, Maximillian and Hermansson, Oskar}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{CODEN:LUTEDX/TEIE}},
  title        = {{Aggregating and utilizing sensor data}},
  year         = {{2020}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
