
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Aggregating and utilizing sensor data

Vilensten, Maximillian LU and Hermansson, Oskar (2020) In CODEN:LUTEDX/TEIE EIEM01 20192
Industrial Electrical Engineering and Automation
Abstract
The world of data is changing fast. While data has always been present in computer science, its importance has grown over the last couple of years, and this growth does not seem to be slowing down. Data usage is steadily increasing and the amount of data is growing, as is the ever-expanding number of devices that produce data, especially in fields such as IoT and cloud computing. This has resulted in a need to make sure that one's data storage solution can effectively keep up without being too much of a hassle to maintain and upgrade. Data warehouses have long been a popular choice for data storage, partly due to their rigid structure. But in the world of big data, this structure can make it hard to adapt and add new types of data to the storage. A data lake can be a viable storage solution if one seeks to store many types of data from multiple sources, since its open structure enables the user to store practically anything. This open structure, however, comes at a cost, since the user now has to manage raw data instead of the traditionally processed data of a data warehouse. Managing raw and unprocessed data is challenging, as the purpose of the data may not be determined when the data is stored. This often leads to data remaining in the data lake without any form of structure or purpose, which quickly turns the data lake into a data swamp. This thesis seeks to create a data lake solution that can store data from multiple sources and different sensors, makes it very easy to add or remove data sources, and is resilient if the data format changes from its original form, while at the same time keeping the infamous data swamp away. This involves sending the data from the sensors to a storage solution and then making sure that the stored data can be utilized by the data lake's owner as well as by a third party.
Popular Abstract
The amount of data is steadily increasing, and with it comes data from new sources and in new formats. The ever-changing state of data makes storing it challenging, as most storage options only accept data in a format specified when it is stored. This thesis suggests a solution to this for a specific use case.
author: Vilensten, Maximillian LU and Hermansson, Oskar
course: EIEM01 20192
type: H3 - Professional qualifications (4 Years - )
keywords: IoT, Data lake, Axis, Message broker, Cloud, Microsoft Azure, AWS, IBM cloud, Apache Kafka, Apache Pulsar
publication/series: CODEN:LUTEDX/TEIE
report number: 5446
language: English
id: 9023492
date added to LUP: 2021-04-27 14:13:20
date last changed: 2021-04-27 14:13:20
@misc{9023492,
  author       = {{Vilensten, Maximillian and Hermansson, Oskar}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{CODEN:LUTEDX/TEIE}},
  title        = {{Aggregating and utilizing sensor data}},
  year         = {{2020}},
}