Robust estimations of fault content with capture-recapture and detection profile estimators

Thelin, Thomas; Runeson, Per

(2000) In Journal of Systems and Software 52(2). p.139-148

Abstract: Inspections are widely used in the software engineering community as efficient contributors to reduced fault content and improved product understanding. In order to measure and control the effect and use of inspections, the fault content after an inspection must be estimated. The capture-recapture method, with its origin in biological sciences, is a promising approach for estimation of the remaining fault content in software artefacts. However, a number of empirical studies show that the estimates are neither accurate nor robust. In order to find robust estimates, i.e., estimates with small bias and variations, the adherence to the prerequisites for different estimation models is investigated. The basic hypothesis is that a model should provide better estimates the closer the actual sample distribution is to the model's theoretical distribution. Firstly, a distance measure is evaluated and secondly a χ²-based procedure is applied. Thirdly, smoothing algorithms are tried out, e.g., mean and median values of the estimates from a number of estimation models.
Based on two different inspection experiments, we conclude that it is not possible to show a correlation between adherence to the model's theoretical distributions and the prediction capabilities of the models. This indicates that there are other factors that affect the estimation capabilities more than the prerequisites. Neither does the investigation point out any specific model to be superior. On the contrary, the Mh-JK model, which has been shown as the best alternative in a prior study, is inferior in this study. The most robust estimations are achieved by the smoothing algorithms.
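
The capture-recapture estimation and the mean/median smoothing mentioned in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' implementation: the function names, the sample data, and the choice of three estimators (first- and second-order jackknife and Chao's bias-corrected estimator, all standard capture-recapture estimators that allow fault detection probabilities to differ) are assumptions made here. The model set and the detection profile estimators evaluated in the paper may differ.

```python
from collections import Counter
from statistics import mean, median


def detection_counts(reviews):
    """Summarise inspection data.

    reviews: one set of fault identifiers per reviewer (hypothetical input format).
    Returns (d, f, k): number of distinct faults found, a Counter where f[j] is the
    number of faults found by exactly j reviewers, and the number of reviewers.
    """
    hits = Counter(fault for found in reviews for fault in set(found))
    return len(hits), Counter(hits.values()), len(reviews)


def jackknife1(reviews):
    """First-order jackknife estimate of the total number of faults."""
    d, f, k = detection_counts(reviews)
    return d + f[1] * (k - 1) / k


def jackknife2(reviews):
    """Second-order jackknife estimate (needs at least two reviewers)."""
    d, f, k = detection_counts(reviews)
    return d + f[1] * (2 * k - 3) / k - f[2] * (k - 2) ** 2 / (k * (k - 1))


def chao_bias_corrected(reviews):
    """Chao's bias-corrected estimate; defined even when f[2] is zero."""
    d, f, _ = detection_counts(reviews)
    return d + f[1] * (f[1] - 1) / (2 * (f[2] + 1))


def smoothed_estimates(reviews):
    """Combine the individual estimates by their mean and median."""
    estimates = [est(reviews) for est in (jackknife1, jackknife2, chao_bias_corrected)]
    return {"mean": mean(estimates), "median": median(estimates)}


# Made-up example: three reviewers inspecting the same artefact.
sample_reviews = [
    {"F1", "F2", "F3", "F4"},
    {"F2", "F3", "F5"},
    {"F3", "F4", "F6"},
]
print(smoothed_estimates(sample_reviews))
```

Subtracting the number of distinct faults found from an estimate gives the estimated remaining fault content. Taking the median rather than the mean makes the combined value less sensitive to a single estimator that over- or undershoots.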

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/ae77ed4b-8a19-4b17-b457-95fb7918ad9a

author

Thelin, Thomas and Runeson, Per

publishing date

2000-06-01

type

Contribution to journal

publication status

published

subject

Software Engineering

in

Journal of Systems and Software

volume

52

issue

2

pages

10 pages

publisher

Elsevier

external identifiers

scopus:0034207911

ISSN

0164-1212

DOI

10.1016/S0164-1212(99)00140-5

language

English

LU publication?

yes

additional info

Funding Information: We would like to thank Claes Wohlin and Håkan Petersson for their valuable comments on this paper. This work was partly funded by The Swedish National Board for Industrial and Technical Development (NUTEK), grant 1K1P-97-09673.

id

ae77ed4b-8a19-4b17-b457-95fb7918ad9a

date added to LUP

2022-10-19 11:45:52

date last changed

2025-10-14 13:26:55

@article{ae77ed4b-8a19-4b17-b457-95fb7918ad9a,
  abstract     = {{Inspections are widely used in the software engineering community as efficient contributors to reduced fault content and improved product understanding. In order to measure and control the effect and use of inspections, the fault content after an inspection must be estimated. The capture-recapture method, with its origin in biological sciences, is a promising approach for estimation of the remaining fault content in software artefacts. However, a number of empirical studies show that the estimates are neither accurate nor robust. In order to find robust estimates, i.e., estimates with small bias and variations, the adherence to the prerequisites for different estimation models is investigated. The basic hypothesis is that a model should provide better estimates the closer the actual sample distribution is to the model's theoretical distribution. Firstly, a distance measure is evaluated and secondly a $\chi^2$-based procedure is applied. Thirdly, smoothing algorithms are tried out, e.g., mean and median values of the estimates from a number of estimation models. Based on two different inspection experiments, we conclude that it is not possible to show a correlation between adherence to the model's theoretical distributions and the prediction capabilities of the models. This indicates that there are other factors that affect the estimation capabilities more than the prerequisites. Neither does the investigation point out any specific model to be superior. On the contrary, the Mh-JK model, which has been shown as the best alternative in a prior study, is inferior in this study. The most robust estimations are achieved by the smoothing algorithms.}},
  author       = {{Thelin, Thomas and Runeson, Per}},
  issn         = {{0164-1212}},
  language     = {{eng}},
  month        = {{06}},
  number       = {{2}},
  pages        = {{139--148}},
  publisher    = {{Elsevier}},
  series       = {{Journal of Systems and Software}},
  title        = {{Robust estimations of fault content with capture-recapture and detection profile estimators}},
  url          = {{http://dx.doi.org/10.1016/S0164-1212(99)00140-5}},
  doi          = {{10.1016/S0164-1212(99)00140-5}},
  volume       = {{52}},
  year         = {{2000}},
}

Lund University Publications
