Robust estimations of fault content with capture-recapture and detection profile estimators
(2000) In Journal of Systems and Software 52(2). p.139-148
- Abstract
Inspections are widely used in the software engineering community as efficient contributors to reduced fault content and improved product understanding. In order to measure and control the effect and use of inspections, the fault content after an inspection must be estimated. The capture-recapture method, with its origin in biological sciences, is a promising approach for estimation of the remaining fault content in software artefacts. However, a number of empirical studies show that the estimates are neither accurate nor robust. In order to find robust estimates, i.e., estimates with small bias and variations, the adherence to the prerequisites for different estimation models is investigated. The basic hypothesis is that a model should provide better estimates the closer the actual sample distribution is to the model's theoretical distribution. Firstly, a distance measure is evaluated and secondly a χ²-based procedure is applied. Thirdly, smoothing algorithms are tried out, e.g., mean and median values of the estimates from a number of estimation models. Based on two different inspection experiments, we conclude that it is not possible to show a correlation between adherence to the model's theoretical distributions and the prediction capabilities of the models. This indicates that there are other factors that affect the estimation capabilities more than the prerequisites. Neither does the investigation point out any specific model to be superior. On the contrary, the Mh-JK model, which has been shown as the best alternative in a prior study, is inferior in this study. The most robust estimations are achieved by the smoothing algorithms.
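The abstract's two main ideas, estimating total fault content from the overlap between inspectors and then smoothing across several estimators by taking their mean or median, can be illustrated with a minimal sketch. The two-inspector Lincoln-Petersen and Chapman estimators below are textbook capture-recapture estimators chosen only for illustration; they are not claimed to be the models evaluated in the paper (which names, for example, Mh-JK), and all function names and input numbers are hypothetical.

```python
from statistics import mean, median


def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Textbook two-sample capture-recapture estimate of the total fault count.

    n1, n2  -- number of faults found by inspector 1 and inspector 2
    overlap -- number of faults found by both inspectors
    """
    if overlap == 0:
        raise ValueError("estimate is undefined when the inspectors share no faults")
    return n1 * n2 / overlap


def chapman(n1: int, n2: int, overlap: int) -> float:
    """Chapman's variant of the same estimator; defined even when overlap is 0."""
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1


def smooth(estimates: list[float]) -> dict[str, float]:
    """Combine estimates from several models by their mean and median."""
    return {"mean": mean(estimates), "median": median(estimates)}


# Hypothetical inspection: 6 and 8 faults found, 3 of them by both inspectors.
found_1, found_2, found_both = 6, 8, 3
unique_found = found_1 + found_2 - found_both

estimates = [
    lincoln_petersen(found_1, found_2, found_both),
    chapman(found_1, found_2, found_both),
]
combined = smooth(estimates)

# Remaining faults = estimated total minus the distinct faults already found.
remaining = combined["median"] - unique_found
print(estimates, combined, remaining)
```

Averaging over estimators in this way trades the best case of any single model for less sensitivity to one model's assumptions being violated, which is the sense of "robust" used in the abstract.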
- author
- Thelin, Thomas LU and Runeson, Per LU
- publishing date
- 2000-06-01
- type
- Contribution to journal
- publication status
- published
- in
- Journal of Systems and Software
- volume
- 52
- issue
- 2
- pages
- 139-148 (10 pages)
- publisher
- Elsevier
- external identifiers
- scopus:0034207911
- ISSN
- 0164-1212
- DOI
- 10.1016/S0164-1212(99)00140-5
- language
- English
- LU publication?
- yes
- additional info
- Funding Information: We would like to thank Claes Wohlin and Håkan Petersson for their valuable comments on this paper. This work was partly funded by The Swedish National Board for Industrial and Technical Development (NUTEK), grant 1K1P-97-09673.
- id
- ae77ed4b-8a19-4b17-b457-95fb7918ad9a
- date added to LUP
- 2022-10-19 11:45:52
- date last changed
- 2022-10-21 12:14:04
@article{ae77ed4b-8a19-4b17-b457-95fb7918ad9a,
  abstract = {{Inspections are widely used in the software engineering community as efficient contributors to reduced fault content and improved product understanding. In order to measure and control the effect and use of inspections, the fault content after an inspection must be estimated. The capture-recapture method, with its origin in biological sciences, is a promising approach for estimation of the remaining fault content in software artefacts. However, a number of empirical studies show that the estimates are neither accurate nor robust. In order to find robust estimates, i.e., estimates with small bias and variations, the adherence to the prerequisites for different estimation models is investigated. The basic hypothesis is that a model should provide better estimates the closer the actual sample distribution is to the model's theoretical distribution. Firstly, a distance measure is evaluated and secondly a $\chi^{2}$-based procedure is applied. Thirdly, smoothing algorithms are tried out, e.g., mean and median values of the estimates from a number of estimation models. Based on two different inspection experiments, we conclude that it is not possible to show a correlation between adherence to the model's theoretical distributions and the prediction capabilities of the models. This indicates that there are other factors that affect the estimation capabilities more than the prerequisites. Neither does the investigation point out any specific model to be superior. On the contrary, the Mh-JK model, which has been shown as the best alternative in a prior study, is inferior in this study. The most robust estimations are achieved by the smoothing algorithms.}},
  author = {{Thelin, Thomas and Runeson, Per}},
  issn = {{0164-1212}},
  language = {{eng}},
  month = {{06}},
  number = {{2}},
  pages = {{139--148}},
  publisher = {{Elsevier}},
  series = {{Journal of Systems and Software}},
  title = {{Robust estimations of fault content with capture-recapture and detection profile estimators}},
  url = {{http://dx.doi.org/10.1016/S0164-1212(99)00140-5}},
  doi = {{10.1016/S0164-1212(99)00140-5}},
  volume = {{52}},
  year = {{2000}},
}