Exploring ML testing in practice - Lessons learned from an interactive rapid review with Axis Communications

Song, Qunying; Borg, Markus; Engström, Emelie; Ardö, Håkan; Rico, Sergio

(2022) 2022 IEEE/ACM 1st International Conference on AI Engineering – Software Engineering for AI (CAIN)

Abstract: There is a growing interest in industry and academia in machine learning (ML) testing. We believe that industry and academia need to learn together to produce rigorous and relevant knowledge. In this study, we initiate a collaboration between stakeholders from one case company, one research institute, and one university. To establish a common view of the problem domain, we applied an interactive rapid review of the state of the art. Four researchers from Lund University and RISE Research Institutes and four practitioners from Axis Communications reviewed a set of 180 primary studies on ML testing. We developed a taxonomy for the communication around ML testing challenges and results and identified a list of 12 review questions relevant for Axis Communications. The three most important questions (data testing, metrics for assessment, and test generation) were mapped to the literature, and an in-depth analysis of the 35 primary studies matching the most important question (data testing) was made. A final set of the five best matches were analysed and we reflect on the criteria for applicability and relevance for the industry. The taxonomies are helpful for communication but not final. Furthermore, there was no perfect match to the case company’s investigated review question (data testing). However, we extracted relevant approaches from the five studies on a conceptual level to support later context-specific improvements. We found the interactive rapid review approach useful for triggering and aligning communication between the different stakeholders.

Please use this URL to cite or link to this publication: https://lup.lub.lu.se/record/1eb11df2-93f5-4998-91ba-497ae2fc1c44

author

Song, Qunying; Borg, Markus; Engström, Emelie; Ardö, Håkan and Rico, Sergio

publishing date

2022-05-16

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Software Engineering

keywords

AI Engineering, Machine Learning Testing, Interactive Rapid Review, Taxonomy

host publication

2022 IEEE/ACM 1st International Conference on AI Engineering – Software Engineering for AI (CAIN)

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

2022 IEEE/ACM 1st International Conference on AI Engineering – Software Engineering for AI (CAIN)

conference location

Pittsburgh, United States

conference dates

2022-05-16 - 2022-05-17

external identifiers

scopus:85128924924

ISBN

978-1-6654-5206-9

978-1-4503-9275-4

project

WASP: Wallenberg AI, Autonomous Systems and Software Program at Lund University

Software testing of autonomous systems

language

English

LU publication?

yes

id

1eb11df2-93f5-4998-91ba-497ae2fc1c44

alternative location

https://ieeexplore.ieee.org/document/9796447

date added to LUP

2022-08-23 13:12:52

date last changed

2025-08-05 20:05:11

@inproceedings{1eb11df2-93f5-4998-91ba-497ae2fc1c44,
  abstract     = {{There is a growing interest in industry and academia in machine learning (ML) testing. We believe that industry and academia need to learn together to produce rigorous and relevant knowledge. In this study, we initiate a collaboration between stakeholders from one case company, one research institute, and one university. To establish a common view of the problem domain, we applied an interactive rapid review of the state of the art. Four researchers from Lund University and RISE Research Institutes and four practitioners from Axis Communications reviewed a set of 180 primary studies on ML testing. We developed a taxonomy for the communication around ML testing challenges and results and identified a list of 12 review questions relevant for Axis Communications. The three most important questions (data testing, metrics for assessment, and test generation) were mapped to the literature, and an in-depth analysis of the 35 primary studies matching the most important question (data testing) was made. A final set of the five best matches were analysed and we reflect on the criteria for applicability and relevance for the industry. The taxonomies are helpful for communication but not final. Furthermore, there was no perfect match to the case company’s investigated review question (data testing). However, we extracted relevant approaches from the five studies on a conceptual level to support later context-specific improvements. We found the interactive rapid review approach useful for triggering and aligning communication between the different stakeholders.}},
  author       = {{Song, Qunying and Borg, Markus and Engström, Emelie and Ardö, Håkan and Rico, Sergio}},
  booktitle    = {{2022 IEEE/ACM 1st International Conference on AI Engineering – Software Engineering for AI (CAIN)}},
  isbn         = {{978-1-6654-5206-9}},
  keywords     = {{AI Engineering; Machine Learning Testing; Interactive Rapid Review; Taxonomy}},
  language     = {{eng}},
  month        = {{05}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Exploring ML testing in practice - Lessons learned from an interactive rapid review with Axis Communications}},
  url          = {{https://lup.lub.lu.se/search/files/123042444/2203.16225.pdf}},
  year         = {{2022}},
}
