
Variation Factors in the Design and Analysis of Replicated Controlled Experiments - Three (Dis)similar Studies on Inspections versus Unit Testing

Runeson, Per (LU); Stefik, Andreas; Andrews, Anneliese (2014). In Empirical Software Engineering 19(6), pp. 1781–1808
Abstract
Background. In formal experiments on software engineering, the number of factors that may impact an outcome is very high. Some factors are controlled and change by design, while others are either unforeseen or due to chance.

Aims. This paper aims to explore how context factors change in a series of formal experiments and to identify implications for experimentation and replication practices to enable learning from experimentation.

Method. We analyze three experiments on code inspections and structural unit testing. The first two experiments use the same experimental design and instrumentation (replication), while the third, conducted by different researchers, replaces the programs and adapts defect detection methods accordingly (reproduction). Experimental procedures and location also differ between the experiments.

Results. Contrary to expectations, there are significant differences between the original experiment and the replication, as well as compared to the reproduction. Some of the differences are due to factors other than the ones designed to vary between experiments, indicating the sensitivity to context factors in software engineering experimentation.

Conclusions. In aggregate, the analysis indicates that reducing the complexity of software engineering experiments should be considered by researchers who want to obtain reliable and repeatable empirical measures.
author
Runeson, Per; Stefik, Andreas; Andrews, Anneliese
organization
publishing date
2014
type
Contribution to journal
publication status
published
subject
keywords
formal experiments, replication, reproduction, experiment design, code inspection, unit testing
in
Empirical Software Engineering
volume
19
issue
6
pages
1781 - 1808
publisher
Springer
external identifiers
  • wos:000343910700007
  • scopus:84910016822
ISSN
1573-7616
DOI
10.1007/s10664-013-9262-z
language
English
LU publication?
yes
id
036af2ac-bb18-43a6-a44c-700acb979def (old id 3971750)
date added to LUP
2013-08-14 10:47:55
date last changed
2017-06-11 03:16:24
@article{036af2ac-bb18-43a6-a44c-700acb979def,
  abstract     = {Background. In formal experiments on software engineering, the number of factors that may impact an outcome is very high. Some factors are controlled and change by design, while others are either unforeseen or due to chance.<br/><br/>
Aims. This paper aims to explore how context factors change in a series of formal experiments and to identify implications for experimentation and replication practices to enable learning from experimentation.<br/><br/>
Method. We analyze three experiments on code inspections and structural unit testing. The first two experiments use the same experimental design and instrumentation (replication), while the third, conducted by different researchers, replaces the programs and adapts defect detection methods accordingly (reproduction). Experimental procedures and location also differ between the experiments.<br/><br/>
Results. Contrary to expectations, there are significant differences between the original experiment and the replication, as well as compared to the reproduction. Some of the differences are due to factors other than the ones designed to vary between experiments, indicating the sensitivity to context factors in software engineering experimentation.<br/><br/>
Conclusions. In aggregate, the analysis indicates that reducing the complexity of software engineering experiments should be considered by researchers who want to obtain reliable and repeatable empirical measures.},
  author       = {Runeson, Per and Stefik, Andreas and Andrews, Anneliese},
  issn         = {1573-7616},
  doi          = {10.1007/s10664-013-9262-z},
  keywords     = {formal experiments, replication, reproduction, experiment design, code inspection, unit testing},
  language     = {eng},
  number       = {6},
  pages        = {1781--1808},
  publisher    = {Springer},
  journal      = {Empirical Software Engineering},
  title        = {Variation Factors in the Design and Analysis of Replicated Controlled Experiments - Three (Dis)similar Studies on Inspections versus Unit Testing},
  url          = {http://dx.doi.org/10.1007/s10664-013-9262-z},
  volume       = {19},
  year         = {2014},
}