Mutation Testing: Fewer, Faster, and Smarter

Vercammen, Sten (LU) (2023)

Abstract: The growing reliance on automated software tests raises a fundamental question: How trustworthy are these automated tests? Today, mutation testing is acknowledged within academic circles as the most promising technique for assessing the fault-detection capability of a test suite. The technique deliberately injects faults (called mutants) into the production code and counts how many of them are caught by the test suite.
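The inject-and-count idea can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from the thesis: the `is_adult` function, the three mutants, and the two-check test suite are invented for illustration, and the mutants are simulated by swapping a comparison operator at runtime, whereas a real tool would rewrite and rebuild the production code.

```python
import operator

def make_is_adult(compare):
    # Build a variant of the (hypothetical) production function around one
    # comparison operator; swapping the operator simulates injecting a mutant.
    def is_adult(age):
        return compare(age, 18)
    return is_adult

def run_tests(is_adult):
    # A tiny test suite: returns True only when every check passes.
    checks = [
        is_adult(30) is True,
        is_adult(5) is False,
    ]
    return all(checks)

original = make_is_adult(operator.ge)  # production code: age >= 18
mutants = {
    ">= replaced by >": make_is_adult(operator.gt),
    ">= replaced by <": make_is_adult(operator.lt),
    ">= replaced by ==": make_is_adult(operator.eq),
}

# The suite must pass on the unmodified code before mutants are judged.
assert run_tests(original)

# A mutant is "killed" (caught) when at least one test fails on it.
killed = [name for name, mutant in mutants.items() if not run_tests(mutant)]
survived = [name for name in mutants if name not in killed]
mutation_score = len(killed) / len(mutants)

print(f"killed={killed}")
print(f"survived={survived}")
print(f"mutation score={mutation_score:.2f}")
```

In this sketch the `>` mutant survives because no test exercises the boundary value 18, even though both tests execute the only statement in `is_adult`. A surviving mutant in fully covered code is the kind of weakness that mutation testing is meant to expose.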

Mutation testing shines in systems with high statement coverage because uncaught mutants reveal weaknesses in code that is supposedly covered by tests. Safety-critical systems, where safety standards dictate high statement coverage, are therefore prime candidates for mutation testing. In safety-critical software, C and C++ dominate the technology stack. Yet in the mutation testing community, the C language family is somewhat neglected: a systematic literature review on mutation testing from 2019 reports that fewer than 25% of the primary studies target source code from the C language family. Despite the apparent potential, mutation testing is difficult to adopt in industrial settings, because the technique, in its basic form, requires a tremendous amount of computing power. Without optimisations, the entire code base must be compiled and tested separately for each injected mutant. Hence, for medium to large test suites, mutation testing without optimisations becomes prohibitively expensive.
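A back-of-the-envelope sketch shows why the unoptimised approach scales poorly. The mutant count, build time, and test time below are hypothetical numbers chosen for illustration, not measurements from the thesis, and the model assumes strictly sequential processing on one machine.

```python
def naive_cost_seconds(num_mutants, build_seconds, test_seconds):
    # Without optimisations, every mutant triggers one full build
    # followed by one full run of the test suite.
    return num_mutants * (build_seconds + test_seconds)

# Hypothetical project: 2,000 mutants, a 3-minute build, a 2-minute test suite.
total_seconds = naive_cost_seconds(2000, 180, 120)
total_hours = total_seconds / 3600

print(f"{total_seconds} s, roughly {total_hours:.1f} hours")
```

Under these assumed figures a single mutation run would occupy one machine for almost a week, which suggests why cost grows out of reach as the number of mutants and the size of the test suite increase.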

To make mutation testing effective in an industrial setting, we set three objectives: (1) generate fewer mutants, (2) process them smarter, and (3) execute them faster. To meet these objectives, we investigate the most promising techniques from the current state of the art. These range from leveraging cloud technology to compiler-integrated techniques using the Clang front-end. These optimisation strategies make it possible to eliminate the compilation and execution overhead and thereby support efficient mutation testing for the C language family.

As a final step, we perform an empirical study on the perception of mutation testing in industry. The aim is to investigate whether the advances are sufficient to allow industrial adoption and to identify any remaining barriers preventing industrial adoption.

In this thesis, we show that a combination of optimisation techniques from the do-fewer, do-faster, and do-smarter categories is needed to perform mutation testing in a continuous integration setting. Furthermore, the industrial perception of mutation testing is evolving as additional organisations recognise its potential.
Abstract (translated from Swedish): The increasing use of test automation raises a fundamental question: How reliable are all these automated tests, really? Within academic research, mutation testing is regarded as the most promising technique for assessing a test suite's ability to detect faults. The technique deliberately introduces faults (so-called mutants) into the production code and evaluates how many of them are detected by the test suite.

Mutation testing is particularly useful for source code with a high degree of code coverage, because mutants that go undetected reveal the test cases' inability to detect faults. Where code coverage is missing, there is no reason to evaluate the test cases either. Safety-critical systems, for which safety standards require a high degree of code coverage, are therefore suitable candidates for mutation testing. In safety-critical software, C and C++ dominate the technology stack. This is not reflected in mutation testing research. A systematic literature review from 2019, based on 502 articles, reports that only 62 of 190 empirical studies consider the C language family. It further reports that only 15 of 76 identified mutation testing tools handle source code from the C language family. Despite the apparent potential, mutation testing has proven difficult to introduce in industrial development environments, because the technique, in its basic form, requires an enormous amount of computing power. Without optimisations, the entire code base must be compiled and tested separately for each introduced mutant. For this reason, mutation testing without optimisations becomes unusable in practice for medium to large test suites.

To make mutation testing effective in an industrial development environment, we set three goals: (1) generate fewer mutants, (2) process them smarter, and (3) execute them faster. We investigate the most promising techniques from the research frontier, for example cloud technology and compiler-based techniques that use the Clang front-end. These optimisation strategies eliminate compilation and execution overhead, which enables resource-efficient mutation testing for the C language family.

Finally, we conduct an empirical study of industrial perspectives on mutation testing. The aim is to evaluate whether the optimisations are sufficient for industrial contexts and to identify any remaining obstacles to large-scale application. Our results show that industry's view of mutation testing has evolved as more development organisations have discovered the possibilities of the technique.

The thesis demonstrates that a combination of the optimisations is necessary to apply mutation testing in industrial continuous integration contexts. Finally, we show that industry's view of mutation testing is developing positively as more organisations come to value its potential.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8fed7357-d537-4609-977c-3fe665b37bd8

author

Vercammen, Sten (LU)

supervisor

opponent

Prof. Offutt, Jeff, George Mason University, USA.

organization

Software Development and Environments

publishing date

2023

type

Thesis

publication status

published

subject

Software Engineering

pages

174 pages

publisher

Department of Computer Science, Lund University

defense location

Lecture Hall E:1406, building E, Ole Römers väg 3, Faculty of Engineering LTH, Lund University, Lund.

defense date

2023-03-14 13:15:00

ISBN

978-91-8039-576-2

978-91-8039-577-9

language

English

LU publication?

yes

id

8fed7357-d537-4609-977c-3fe665b37bd8

date added to LUP

2023-02-16 10:12:39

date last changed

2025-04-04 14:57:50

@phdthesis{8fed7357-d537-4609-977c-3fe665b37bd8,
  abstract     = {{The growing reliance on automated software tests raises a fundamental question: How trustworthy are these automated tests? Today, mutation testing is acknowledged within academic circles as the most promising technique for assessing the fault-detection capability of a test suite. The technique deliberately injects faults (called mutants) into the production code and counts how many of them are caught by the test suite.
Mutation testing shines in systems with high statement coverage because uncaught mutants reveal weaknesses in code that is supposedly covered by tests. Safety-critical systems, where safety standards dictate high statement coverage, are therefore prime candidates for mutation testing. In safety-critical software, C and C++ dominate the technology stack. Yet in the mutation testing community, the C language family is somewhat neglected: a systematic literature review on mutation testing from 2019 reports that fewer than 25\% of the primary studies target source code from the C language family. Despite the apparent potential, mutation testing is difficult to adopt in industrial settings, because the technique, in its basic form, requires a tremendous amount of computing power. Without optimisations, the entire code base must be compiled and tested separately for each injected mutant. Hence, for medium to large test suites, mutation testing without optimisations becomes prohibitively expensive.
To make mutation testing effective in an industrial setting, we set three objectives: (1) generate fewer mutants, (2) process them smarter, and (3) execute them faster. To meet these objectives, we investigate the most promising techniques from the current state of the art. These range from leveraging cloud technology to compiler-integrated techniques using the Clang front-end. These optimisation strategies make it possible to eliminate the compilation and execution overhead and thereby support efficient mutation testing for the C language family.
As a final step, we perform an empirical study on the perception of mutation testing in industry. The aim is to investigate whether the advances are sufficient to allow industrial adoption and to identify any remaining barriers preventing industrial adoption.
In this thesis, we show that a combination of optimisation techniques from the do-fewer, do-faster, and do-smarter categories is needed to perform mutation testing in a continuous integration setting. Furthermore, the industrial perception of mutation testing is evolving as additional organisations recognise its potential.}},
  author       = {{Vercammen, Sten}},
  isbn         = {{978-91-8039-576-2}},
  language     = {{eng}},
  publisher    = {{Department of Computer Science, Lund University}},
  school       = {{Lund University}},
  title        = {{Mutation Testing: Fewer, Faster, and Smarter}},
  year         = {{2023}},
}

Lund University Publications
