Lund University Publications

Towards optimization of anomaly detection in DevOps

Hrusto, Adha; Engström, Emelie and Runeson, Per (2023) In Information and Software Technology 160.
Abstract

Context: DevOps has recently become a mainstream solution for bridging the gaps between development (Dev) and operations (Ops), enabling cross-functional collaboration. The DevOps concept of continuous monitoring may bring many benefits to development teams, such as early detection of run-time errors and various performance anomalies. Objective: We aim to explore deep learning (DL) solutions for detection of anomalous system behavior based on collected monitoring data that consists of applications’ and systems’ performance metrics. Moreover, we specifically address a shortage of approaches for evaluating DL models without any ground truth data. Methods: We perform a case study in a real DevOps environment, following the principles of the design science paradigm. The research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. Results: We proposed and implemented a cloud solution for DL model deployment and evaluation empowered by feedback from the development team. The labeled data generated through the feedback was used for evaluation of current and training of new DL models in several iterations. The overall results showed that reconstruction-based models, such as autoencoders, are quite robust to parameter modifications and are among the preferred models for anomaly detection in multivariate monitoring data. Conclusion: Leveraging raw monitoring data and DL-inspired solutions, DevOps teams may get critical insights into the software and its operation. In our case, this proved to be an efficient way of discovering early signs of production failures.

type
Contribution to journal
publication status
published
keywords
Anomaly detection, Deep learning, DevOps, Microservices
in
Information and Software Technology
volume
160
article number
107241
publisher
Elsevier
external identifiers
  • scopus:85154047506
ISSN
0950-5849
DOI
10.1016/j.infsof.2023.107241
language
English
LU publication?
yes
additional info
Publisher Copyright: © 2023 The Author(s)
id
f0122099-31a2-49cb-8d0c-a17c87e76801
date added to LUP
2023-05-08 07:59:41
date last changed
2023-11-22 18:05:51
@article{f0122099-31a2-49cb-8d0c-a17c87e76801,
  abstract     = {{Context: DevOps has recently become a mainstream solution for bridging the gaps between development (Dev) and operations (Ops) enabling cross-functional collaboration. The DevOps concept of continuous monitoring may bring a lot of benefits to development teams such as early detection of run-time errors and various performance anomalies. Objective: We aim to explore deep learning (DL) solutions for detection of anomalous systems behavior based on collected monitoring data that consists of applications’ and systems’ performance metrics. Moreover, we specifically address a shortage of approaches for evaluating DL models without any ground truth data. Methods: We perform a case study in a real DevOps environment, following the principles of the design science paradigm. The research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. Results: We proposed and implemented a cloud solution for DL model deployment and evaluation empowered by feedback from the development team. The labeled data generated through the feedback was used for evaluation of current and training of new DL models in several iterations. The overall results showed that reconstruction-based models such as autoencoders, are quite robust to any parameter modification and are among the preferred for anomaly detection in multivariate monitoring data. Conclusion: Leveraging raw monitoring data and DL-inspired solutions, DevOps teams may get critical insights into the software and its operation. In our case, this proved to be an efficient way of discovering early signs of production failures.}},
  author       = {{Hrusto, Adha and Engström, Emelie and Runeson, Per}},
  issn         = {{0950-5849}},
  keywords     = {{Anomaly detection; Deep learning; DevOps; Microservices}},
  language     = {{eng}},
  publisher    = {{Elsevier}},
  journal      = {{Information and Software Technology}},
  title        = {{Towards optimization of anomaly detection in DevOps}},
  url          = {{http://dx.doi.org/10.1016/j.infsof.2023.107241}},
  doi          = {{10.1016/j.infsof.2023.107241}},
  volume       = {{160}},
  year         = {{2023}},
}