Lund University Publications

Enhancing DevOps with Autonomous Monitors: A Proactive Approach to Failure Detection

Hrusto, Adha (2024)
Abstract
Software engineering practices, including continuous integration, continuous testing, and continuous deployment, aim to streamline and automate the software development process. DevOps, a cultural and professional movement that builds upon these continuous practices, seeks to bridge the gap between development and operations. By fostering a collaborative environment, DevOps supports faster, more frequent, and more reliable software releases, inherently promoting agile methodologies throughout the software development lifecycle.

Introducing agility increases the risk of operational failures in cloud-based software systems. Recognizing this challenge, the objective of this thesis is to understand and present approaches for mitigating the cascading effects of operational failures across interconnected system components. In collaboration with two Swedish companies, we investigated how proactive monitoring strategies inspired by state-of-the-art machine learning (ML) solutions can prevent failure propagation and ensure seamless system operations.

The conducted research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. This complies with the main principles of the design science paradigm mainly used to frame problem-driven studies aiming to improve specific areas of practice.

The main contributions of this thesis are threefold. First, an in-depth overview of operational challenges and matching solutions in cloud-based software systems, focusing on alert management and monitoring data through two case studies and extensive literature reviews. Second, a proactive alert strategy called autonomous monitors to enhance early detection and prevention of operational failures. Finally, the practical applicability of these monitors is confirmed via empirical studies, highlighting their effectiveness in various industrial contexts.

We demonstrated the practical effectiveness of the proposed ML-based monitoring solution to pave the way for its widespread adoption for enhancing DevOps.
opponent
  • Prof. Mäntylä, Mika, Helsinki University, Finland.
type
Thesis
publication status
published
pages
194 pages
publisher
Computer Science, Lund University
defense location
Lecture Hall E:A, building E, Klas Anshelms väg 10, Faculty of Engineering LTH, Lund University, Lund. The dissertation will be live streamed, but part of the premises is to be excluded from the live stream. Zoom: https://lu-se.zoom.us/j/68537887917
defense date
2024-11-08 09:15:00
ISBN
978-91-8104-210-8
978-91-8104-209-2
language
English
LU publication?
yes
id
16456ba3-7482-4846-956c-8cd4782fc557
date added to LUP
2024-10-01 14:14:22
date last changed
2024-10-18 09:17:10
@phdthesis{16456ba3-7482-4846-956c-8cd4782fc557,
  abstract     = {{Software engineering practices, including continuous integration, continuous testing, and continuous deployment, aim to streamline and automate the software development process. A cultural and professional movement that builds upon continuous practices, DevOps, seeks to bridge the gap between development and operations. By fostering a collaborative environment, DevOps supports faster, more frequent, and reliable software releases, inherently promoting agile methodologies throughout the software development lifecycle. By introducing agility, there is a higher risk of operational failures in cloud-based software systems. Recognizing this challenge, the objective of this thesis is to understand and present approaches for mitigating the cascading effects of operational failures across interconnected system components. In collaboration with two Swedish companies, we investigated how proactive monitoring strategies inspired by state-of-the-art machine learning (ML) solutions can prevent failure propagation and ensure seamless system operations. The conducted research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. This complies with the main principles of the design science paradigm mainly used to frame problem-driven studies aiming to improve specific areas of practice. The main contributions of this thesis are threefold. First, an in-depth overview of operational challenges and matching solutions in cloud-based software systems, focusing on alert management and monitoring data through two case studies and extensive literature reviews. Second, a proactive alert strategy called autonomous monitors to enhance early detection and prevention of operational failures. Finally, the practical applicability of these monitors is confirmed via empirical studies, highlighting their effectiveness in various industrial contexts. We demonstrated the practical effectiveness of the proposed ML-based monitoring solution to pave the way for its widespread adoption for enhancing DevOps.}},
  author       = {{Hrusto, Adha}},
  isbn         = {{978-91-8104-210-8}},
  language     = {{eng}},
  month        = {{10}},
  publisher    = {{Computer Science, Lund University}},
  school       = {{Lund University}},
  title        = {{Enhancing DevOps with Autonomous Monitors: A Proactive Approach to Failure Detection}},
  url          = {{https://lup.lub.lu.se/search/files/196213572/PHD_Thesis_Adha_Hrusto.pdf}},
  year         = {{2024}},
}