Getting Started with Chaos Engineering – design of an implementation framework in practice
(2020)
- Abstract
- Background. Chaos Engineering is proposed as a practice to verify a system’s resilience under real, operational conditions. It employs fault injection, was originally developed at Netflix, and is supported by several tools from there and other sources. Aims. We aim to introduce Chaos Engineering at ICA Gruppen AB, a group of companies whose core business is grocery retail, to improve their systems’ resilience, and to capture our knowledge gained from literature and interviews in a process framework for the introduction of Chaos Engineering. Method. The research is conducted under the design science paradigm, where the problem is conceptualized through a literature study of Chaos Engineering and exploratory interviews in the company. The solution framework is designed based on the literature and a tool survey, and validated by letting software engineers at ICA apply parts of it to the software systems of the ica.se website, including its e-shop. Results. The main contributions are a synthesis of Chaos Engineering literature and tools, an in-depth understanding of the needs of the case company, and guidelines for introducing Chaos Engineering. Conclusions. The applied parts were concluded to be feasible, and they successfully discovered a set of initial improvement opportunities for the system’s resilience, as well as a suitable Chaos Engineering practice for future resilience testing of the system. We recommend that companies use the framework as a guide for the implementation of Chaos Engineering.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/f5f14990-3b0e-43db-848a-66a44f01a841
- author
- Jernberg, Hugo; Runeson, Per (LU) and Engström, Emelie (LU)
- publishing date
- 2020
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- host publication
- Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) Industry Track
- editor
- Cataldo, Marcelo and Ciolkowski, Marcus
- article number
- 43
- pages
- 10 pages
- publisher
- Association for Computing Machinery (ACM)
- external identifiers
- scopus:85095833015
- ISBN
- 978-1-4503-7580-1
- DOI
- 10.1145/3382494.3421464
- project
- WASP Software Engineering Cluster
- language
- English
- LU publication?
- yes
- id
- f5f14990-3b0e-43db-848a-66a44f01a841
- date added to LUP
- 2020-09-17 17:10:53
- date last changed
- 2022-05-12 06:42:49
@inproceedings{f5f14990-3b0e-43db-848a-66a44f01a841,
  abstract     = {{Background. Chaos Engineering is proposed as a practice to verify a system’s resilience under real, operational conditions. It employs fault injection, was originally developed at Netflix, and is supported by several tools from there and other sources. Aims. We aim to introduce Chaos Engineering at ICA Gruppen AB, a group of companies whose core business is grocery retail, to improve their systems’ resilience, and to capture our knowledge gained from literature and interviews in a process framework for the introduction of Chaos Engineering. Method. The research is conducted under the design science paradigm, where the problem is conceptualized through a literature study of Chaos Engineering and exploratory interviews in the company. The solution framework is designed based on the literature and a tool survey, and validated by letting software engineers at ICA apply parts of it to the software systems of the ica.se website, including its e-shop. Results. The main contributions are a synthesis of Chaos Engineering literature and tools, an in-depth understanding of the needs of the case company, and guidelines for introducing Chaos Engineering. Conclusions. The applied parts were concluded to be feasible, and they successfully discovered a set of initial improvement opportunities for the system’s resilience, as well as a suitable Chaos Engineering practice for future resilience testing of the system. We recommend that companies use the framework as a guide for the implementation of Chaos Engineering.}},
  author       = {{Jernberg, Hugo and Runeson, Per and Engström, Emelie}},
  booktitle    = {{Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) Industry Track}},
  editor       = {{Cataldo, Marcelo and Ciolkowski, Marcus}},
  isbn         = {{978-1-4503-7580-1}},
  language     = {{eng}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  title        = {{Getting Started with Chaos Engineering – design of an implementation framework in practice}},
  url          = {{http://dx.doi.org/10.1145/3382494.3421464}},
  doi          = {{10.1145/3382494.3421464}},
  year         = {{2020}},
}