Towards optimization of anomaly detection in DevOps
(2023) In Information and Software Technology 160.
- Abstract
Context: DevOps has recently become a mainstream solution for bridging the gaps between development (Dev) and operations (Ops) enabling cross-functional collaboration. The DevOps concept of continuous monitoring may bring a lot of benefits to development teams such as early detection of run-time errors and various performance anomalies. Objective: We aim to explore deep learning (DL) solutions for detection of anomalous systems behavior based on collected monitoring data that consists of applications’ and systems’ performance metrics. Moreover, we specifically address a shortage of approaches for evaluating DL models without any ground truth data. Methods: We perform a case study in a real DevOps environment, following the principles of the design science paradigm. The research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. Results: We proposed and implemented a cloud solution for DL model deployment and evaluation empowered by feedback from the development team. The labeled data generated through the feedback was used for evaluation of current and training of new DL models in several iterations. The overall results showed that reconstruction-based models such as autoencoders, are quite robust to any parameter modification and are among the preferred for anomaly detection in multivariate monitoring data. Conclusion: Leveraging raw monitoring data and DL-inspired solutions, DevOps teams may get critical insights into the software and its operation. In our case, this proved to be an efficient way of discovering early signs of production failures.
- author
- Hrusto, Adha (LU); Engström, Emelie (LU) and Runeson, Per (LU)
- organization
- publishing date
- 2023
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- Anomaly detection, Deep learning, DevOps, Microservices
- in
- Information and Software Technology
- volume
- 160
- article number
- 107241
- publisher
- Elsevier
- external identifiers
- scopus:85154047506
- ISSN
- 0950-5849
- DOI
- 10.1016/j.infsof.2023.107241
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 2023 The Author(s)
- id
- f0122099-31a2-49cb-8d0c-a17c87e76801
- date added to LUP
- 2023-05-08 07:59:41
- date last changed
- 2023-11-22 18:05:51
@article{f0122099-31a2-49cb-8d0c-a17c87e76801,
  abstract  = {{Context: DevOps has recently become a mainstream solution for bridging the gaps between development (Dev) and operations (Ops) enabling cross-functional collaboration. The DevOps concept of continuous monitoring may bring a lot of benefits to development teams such as early detection of run-time errors and various performance anomalies. Objective: We aim to explore deep learning (DL) solutions for detection of anomalous systems behavior based on collected monitoring data that consists of applications’ and systems’ performance metrics. Moreover, we specifically address a shortage of approaches for evaluating DL models without any ground truth data. Methods: We perform a case study in a real DevOps environment, following the principles of the design science paradigm. The research activities span from practice to theory and from problem to solution domain, including problem conceptualization, solution design, instantiation, and empirical validation. Results: We proposed and implemented a cloud solution for DL model deployment and evaluation empowered by feedback from the development team. The labeled data generated through the feedback was used for evaluation of current and training of new DL models in several iterations. The overall results showed that reconstruction-based models such as autoencoders, are quite robust to any parameter modification and are among the preferred for anomaly detection in multivariate monitoring data. Conclusion: Leveraging raw monitoring data and DL-inspired solutions, DevOps teams may get critical insights into the software and its operation. In our case, this proved to be an efficient way of discovering early signs of production failures.}},
  author    = {Hrusto, Adha and Engström, Emelie and Runeson, Per},
  issn      = {{0950-5849}},
  keywords  = {{Anomaly detection; Deep learning; DevOps; Microservices}},
  language  = {{eng}},
  publisher = {{Elsevier}},
  journal   = {{Information and Software Technology}},
  title     = {{Towards optimization of anomaly detection in DevOps}},
  url       = {{http://dx.doi.org/10.1016/j.infsof.2023.107241}},
  doi       = {{10.1016/j.infsof.2023.107241}},
  volume    = {{160}},
  year      = {{2023}},
}