Considering rigor and relevance when evaluating test driven development: A systematic review

Munir, Hussan; Moayyed, Misagh; Petersen, Kai

Munir, Hussan (LU); Moayyed, Misagh and Petersen, Kai (2014). In Information and Software Technology 56(4), p. 375-394

Abstract: Context: Test driven development (TDD) has been extensively researched and compared to traditional approaches (test last development, TLD). Existing literature reviews show varying results for TDD. Objective: This study investigates how the conclusions of existing literature reviews change when taking two study quality dimensions into account, namely rigor and relevance. Method: In this study a systematic literature review has been conducted, and the results of the identified primary studies have been analyzed with respect to rigor and relevance scores using the assessment rubric proposed by Ivarsson and Gorschek (2011). Rigor and relevance are rated on a scale, which is explained in this paper. Four categories of studies were defined based on high/low rigor and relevance. Results: We found that studies in the four categories come to different conclusions. In particular, studies with high rigor and relevance scores show clear results for improvement in external quality, which seems to come with a loss of productivity. At the same time, high rigor and relevance studies only investigate a small set of variables. Other categories contain many studies showing no difference, hence biasing the results negatively for the overall set of primary studies. Given the classification, differences from previous literature reviews could be highlighted. Conclusion: Strong indications are obtained that external quality is positively influenced, which has to be further substantiated by industry experiments and longitudinal case studies. Future studies in the high rigor and relevance category would contribute largely by focusing on a wider set of outcome variables (e.g. internal code quality). We also conclude that considering rigor and relevance in TDD evaluation is important given the differences in results between categories and in comparison to previous reviews. (C) 2014 Elsevier B.V. All rights reserved.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/4410922

author

Munir, Hussan (LU); Moayyed, Misagh and Petersen, Kai

organization

publishing date

2014

type

Contribution to journal

publication status

published

subject

Computer Sciences

keywords

Test-driven development (TDD), Test-last development (TLD), Internal code quality, External code quality, Productivity

in

Information and Software Technology

volume

56

issue

4

pages

375 - 394

publisher

Elsevier

external identifiers

wos:000332904000001
scopus:84894046936

ISSN

0950-5849

DOI

10.1016/j.infsof.2014.01.002

language

English

LU publication?

yes

id

63eb9aec-1c81-4e38-a85a-d6c6a8d4e8f0 (old id 4410922)

date added to LUP

2016-04-01 14:41:12

date last changed

2025-10-14 11:13:07

@article{63eb9aec-1c81-4e38-a85a-d6c6a8d4e8f0,
  abstract     = {{Context: Test driven development (TDD) has been extensively researched and compared to traditional approaches (test last development, TLD). Existing literature reviews show varying results for TDD. Objective: This study investigates how the conclusions of existing literature reviews change when taking two study quality dimensions into account, namely rigor and relevance. Method: In this study a systematic literature review has been conducted, and the results of the identified primary studies have been analyzed with respect to rigor and relevance scores using the assessment rubric proposed by Ivarsson and Gorschek (2011). Rigor and relevance are rated on a scale, which is explained in this paper. Four categories of studies were defined based on high/low rigor and relevance. Results: We found that studies in the four categories come to different conclusions. In particular, studies with high rigor and relevance scores show clear results for improvement in external quality, which seems to come with a loss of productivity. At the same time, high rigor and relevance studies only investigate a small set of variables. Other categories contain many studies showing no difference, hence biasing the results negatively for the overall set of primary studies. Given the classification, differences from previous literature reviews could be highlighted. Conclusion: Strong indications are obtained that external quality is positively influenced, which has to be further substantiated by industry experiments and longitudinal case studies. Future studies in the high rigor and relevance category would contribute largely by focusing on a wider set of outcome variables (e.g. internal code quality). We also conclude that considering rigor and relevance in TDD evaluation is important given the differences in results between categories and in comparison to previous reviews. (C) 2014 Elsevier B.V. All rights reserved.}},
  author       = {{Munir, Hussan and Moayyed, Misagh and Petersen, Kai}},
  issn         = {{0950-5849}},
  keywords     = {{Test-driven development (TDD); Test-last development (TLD); Internal code quality; External code quality; Productivity}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{375--394}},
  publisher    = {{Elsevier}},
  journal      = {{Information and Software Technology}},
  title        = {{Considering rigor and relevance when evaluating test driven development: A systematic review}},
  url          = {{http://dx.doi.org/10.1016/j.infsof.2014.01.002}},
  doi          = {{10.1016/j.infsof.2014.01.002}},
  volume       = {{56}},
  year         = {{2014}},
}
