
Variation Factors in the Design and Analysis of Replicated Controlled Experiments - Three (Dis)similar Studies on Inspections versus Unit Testing

Runeson, Per (LU); Stefik, Andreas; Andrews, Anneliese (2014). In Empirical Software Engineering 19(6), pp. 1781–1808
Abstract
Background. In formal experiments on software engineering, the number of factors that may impact an outcome is very high. Some factors are controlled and change by design, while others are either unforeseen or due to chance.

Aims. This paper aims to explore how context factors change in a series of formal experiments and to identify implications for experimentation and replication practices to enable learning from experimentation.

Method. We analyze three experiments on code inspections and structural unit testing. The first two experiments use the same experimental design and instrumentation (replication), while the third, conducted by different researchers, replaces the programs and adapts defect detection methods accordingly (reproduction). Experimental procedures and location also differ between the experiments.

Results. Contrary to expectations, there are significant differences between the original experiment and the replication, as well as compared to the reproduction. Some of the differences are due to factors other than the ones designed to vary between experiments, indicating the sensitivity to context factors in software engineering experimentation.

Conclusions. In aggregate, the analysis indicates that reducing the complexity of software engineering experiments should be considered by researchers who want to obtain reliable and repeatable empirical measures.
author
Runeson, Per; Stefik, Andreas; Andrews, Anneliese
organization
publishing date
2014
type
Contribution to journal
publication status
published
subject
keywords
formal experiments, replication, reproduction, experiment design, code inspection, unit testing
in
Empirical Software Engineering
volume
19
issue
6
pages
1781 - 1808
publisher
Springer
external identifiers
  • wos:000343910700007
  • scopus:84910016822
ISSN
1573-7616
DOI
10.1007/s10664-013-9262-z
language
English
LU publication?
yes
id
036af2ac-bb18-43a6-a44c-700acb979def (old id 3971750)
date added to LUP
2013-08-14 10:47:55
date last changed
2017-06-11 03:16:24
@article{036af2ac-bb18-43a6-a44c-700acb979def,
  abstract     = {Background. In formal experiments on software engineering, the number of factors that may impact an outcome is very high. Some factors are controlled and change by design, while others are either unforeseen or due to chance.<br/><br/>
Aims. This paper aims to explore how context factors change in a series of formal experiments and to identify implications for experimentation and replication practices to enable learning from experimentation.<br/><br/>
Method. We analyze three experiments on code inspections and structural unit testing. The first two experiments use the same experimental design and instrumentation (replication), while the third, conducted by different researchers, replaces the programs and adapts defect detection methods accordingly (reproduction). Experimental procedures and location also differ between the experiments.<br/><br/>
Results. Contrary to expectations, there are significant differences between the original experiment and the replication, as well as compared to the reproduction. Some of the differences are due to factors other than the ones designed to vary between experiments, indicating the sensitivity to context factors in software engineering experimentation.<br/><br/>
Conclusions. In aggregate, the analysis indicates that reducing the complexity of software engineering experiments should be considered by researchers who want to obtain reliable and repeatable empirical measures.},
  author       = {Runeson, Per and Stefik, Andreas and Andrews, Anneliese},
  issn         = {1573-7616},
  doi          = {10.1007/s10664-013-9262-z},
  keywords     = {formal experiments, replication, reproduction, experiment design, code inspection, unit testing},
  language     = {eng},
  number       = {6},
  pages        = {1781--1808},
  publisher    = {Springer},
  journal      = {Empirical Software Engineering},
  title        = {Variation Factors in the Design and Analysis of Replicated Controlled Experiments - Three (Dis)similar Studies on Inspections versus Unit Testing},
  url          = {http://dx.doi.org/10.1007/s10664-013-9262-z},
  volume       = {19},
  year         = {2014},
}