
Do better IR tools improve the accuracy of engineers’ traceability recovery?

Borg, Markus (LU) and Pfahl, Dietmar (LU) (2011). In MALETS 2011: International Workshop on Machine Learning Technologies in Software Engineering, [Host publication title missing], pp. 23–30.
Abstract
Large-scale software development generates an ever-growing amount of information. Multiple research groups have proposed using approaches from the domain of information retrieval (IR) to recover traceability. Several enhancement strategies have been initially explored using the laboratory model of IR evaluation for performance assessment. We conducted a pilot experiment using printed candidate lists from the tools RETRO and ReqSimile to investigate how different quality levels of tool output affect the tracing accuracy of engineers. Statistical equivalence testing, commonly used in medicine, was conducted to analyze the data. Owing to the low number of subjects, the pilot experiment showed neither statistically significant equivalence nor a statistically significant difference. While our results are not conclusive, there are indications that it is worthwhile to further investigate the actual value of improving tool support for semi-automatic traceability recovery. For example, our pilot experiment showed that the effect size of using RETRO versus ReqSimile is of practical significance regarding precision and F-measure. The interpretation of the effect size regarding recall is less clear. The experiment needs to be replicated with more subjects and on varying tasks before firm conclusions can be drawn.
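The precision, recall, and F-measure mentioned above are the standard laboratory-model IR metrics, computed by comparing a tool's candidate trace links against a ground-truth trace matrix. A minimal sketch of that computation (the link sets below are made-up illustrations, not data from the experiment):

```python
# Hedged sketch: computing precision, recall, and F-measure for a candidate
# trace-link list against a ground-truth trace matrix. The requirement/artifact
# pairs below are hypothetical, not taken from the RETRO/ReqSimile experiment.

def precision_recall_f1(candidate, truth):
    """Return (precision, recall, F1) for a set of candidate trace links."""
    tp = len(candidate & truth)  # correctly suggested links (true positives)
    precision = tp / len(candidate) if candidate else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical ground truth and tool output: (requirement, artifact) pairs.
truth = {("R1", "A1"), ("R2", "A2"), ("R3", "A3"), ("R4", "A4")}
candidate = {("R1", "A1"), ("R2", "A2"), ("R2", "A9")}

p, r, f = precision_recall_f1(candidate, truth)
print(round(p, 2), round(r, 2), round(f, 2))  # prints 0.67 0.5 0.57
```

An equivalence test, as used in the paper, would then ask whether the difference between the two tools' per-subject scores falls inside a pre-declared margin of practical irrelevance, rather than testing for a difference from zero.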
author: Borg, Markus (LU); Pfahl, Dietmar (LU)
publishing date: 2011
type: Chapter in Book/Report/Conference proceeding
publication status: published
keywords: requirements traceability, information retrieval, controlled experiment, equivalence testing
in: [Host publication title missing]
pages: 23–30 (8 pages)
publisher: ACM
conference name: MALETS 2011: International Workshop on Machine Learning Technologies in Software Engineering
external identifiers: scopus:83255170948
DOI: 10.1145/2070821.2070825
project: EASE
language: English
LU publication?: yes
id: efb32a51-53db-4c5a-a649-2c52975bb6f7 (old id 2205432)
date added to LUP: 2011-11-22 13:23:04
date last changed: 2017-10-01 05:10:43
@inproceedings{efb32a51-53db-4c5a-a649-2c52975bb6f7,
  abstract     = {Large-scale software development generates an ever-growing amount of information. Multiple research groups have proposed using approaches from the domain of information retrieval (IR) to recover traceability. Several enhancement strategies have been initially explored using the laboratory model of IR evaluation for performance assessment. We conducted a pilot experiment using printed candidate lists from the tools RETRO and ReqSimile to investigate how different quality levels of tool output affect the tracing accuracy of engineers. Statistical testing of equivalence, commonly used in medicine, has been conducted to analyze the data. The low number of subjects in this pilot experiment resulted neither in statistically significant equivalence nor difference. While our results are not conclusive, there are indications that it is worthwhile to investigate further into the actual value of improving tool support for semi-automatic traceability recovery. For example, our pilot experiment showed that the effect size of using RETRO versus ReqSimile is of practical significance regarding precision and F-measure. The interpretation of the effect size regarding recall is less clear. The experiment needs to be replicated with more subjects and on varying tasks to draw firm conclusions.},
  author       = {Borg, Markus and Pfahl, Dietmar},
  booktitle    = {[Host publication title missing]},
  doi          = {10.1145/2070821.2070825},
  keywords     = {requirements traceability, information retrieval, controlled experiment, equivalence testing},
  language     = {eng},
  pages        = {23--30},
  publisher    = {ACM},
  title        = {Do better IR tools improve the accuracy of engineers’ traceability recovery?},
  url          = {http://dx.doi.org/10.1145/2070821.2070825},
  year         = {2011},
}