
Do better IR tools improve the accuracy of engineers’ traceability recovery?

Borg, Markus (LU) and Pfahl, Dietmar (LU) (2011). In MALETS 2011: International Workshop on Machine Learning Technologies in Software Engineering, [Host publication title missing], pp. 23–30.
Abstract
Large-scale software development generates an ever-growing amount of information. Multiple research groups have proposed using approaches from the domain of information retrieval (IR) to recover traceability. Several enhancement strategies have been initially explored using the laboratory model of IR evaluation for performance assessment. We conducted a pilot experiment using printed candidate lists from the tools RETRO and ReqSimile to investigate how different quality levels of tool output affect the tracing accuracy of engineers. Statistical equivalence testing, commonly used in medicine, was conducted to analyze the data. Owing to the low number of subjects, the pilot experiment showed neither statistically significant equivalence nor a statistically significant difference. While our results are not conclusive, there are indications that it is worthwhile to further investigate the actual value of improving tool support for semi-automatic traceability recovery. For example, our pilot experiment showed that the effect size of using RETRO versus ReqSimile is of practical significance regarding precision and F-measure. The interpretation of the effect size regarding recall is less clear. The experiment needs to be replicated with more subjects and on varying tasks before firm conclusions can be drawn.
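The precision, recall, and F-measure mentioned above are the standard laboratory-model IR metrics, computed by comparing a tool's candidate trace links against a ground-truth trace matrix. A minimal sketch of that computation (the link sets below are made-up illustrations, not data from the experiment):

```python
# Hedged sketch: computing precision, recall, and F-measure for a candidate
# trace-link list against a ground-truth trace matrix. The requirement/artifact
# pairs below are hypothetical, not taken from the RETRO/ReqSimile experiment.

def precision_recall_f1(candidate, truth):
    """Return (precision, recall, F1) for a set of candidate trace links."""
    tp = len(candidate & truth)  # correctly suggested links (true positives)
    precision = tp / len(candidate) if candidate else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical ground truth and tool output: (requirement, artifact) pairs.
truth = {("R1", "A1"), ("R2", "A2"), ("R3", "A3"), ("R4", "A4")}
candidate = {("R1", "A1"), ("R2", "A2"), ("R2", "A9")}

p, r, f = precision_recall_f1(candidate, truth)
print(round(p, 2), round(r, 2), round(f, 2))  # prints 0.67 0.5 0.57
```

An equivalence test, as used in the paper, would then ask whether the difference between the two tools' per-subject scores falls inside a pre-declared margin of practical irrelevance, rather than testing for a difference from zero.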
author: Borg, Markus (LU); Pfahl, Dietmar (LU)
publishing date: 2011
type: Chapter in Book/Report/Conference proceeding
publication status: published
keywords: requirements traceability, information retrieval, controlled experiment, equivalence testing
in: [Host publication title missing]
pages: 23–30 (8 pages)
publisher: ACM
conference name: MALETS 2011: International Workshop on Machine Learning Technologies in Software Engineering
external identifiers: scopus:83255170948
DOI: 10.1145/2070821.2070825
project: EASE
language: English
LU publication?: yes
id: efb32a51-53db-4c5a-a649-2c52975bb6f7 (old id 2205432)
date added to LUP: 2011-11-22 13:23:04
date last changed: 2017-10-01 05:10:43
@inproceedings{efb32a51-53db-4c5a-a649-2c52975bb6f7,
  abstract     = {Large-scale software development generates an ever-growing amount of information. Multiple research groups have proposed using approaches from the domain of information retrieval (IR) to recover traceability. Several enhancement strategies have been initially explored using the laboratory model of IR evaluation for performance assessment. We conducted a pilot experiment using printed candidate lists from the tools RETRO and ReqSimile to investigate how different quality levels of tool output affect the tracing accuracy of engineers. Statistical testing of equivalence, commonly used in medicine, has been conducted to analyze the data. The low number of subjects in this pilot experiment resulted neither in statistically significant equivalence nor difference. While our results are not conclusive, there are indications that it is worthwhile to investigate further into the actual value of improving tool support for semi-automatic traceability recovery. For example, our pilot experiment showed that the effect size of using RETRO versus ReqSimile is of practical significance regarding precision and F-measure. The interpretation of the effect size regarding recall is less clear. The experiment needs to be replicated with more subjects and on varying tasks to draw firm conclusions.},
  author       = {Borg, Markus and Pfahl, Dietmar},
  booktitle    = {[Host publication title missing]},
  doi          = {10.1145/2070821.2070825},
  keywords     = {requirements traceability, information retrieval, controlled experiment, equivalence testing},
  language     = {eng},
  pages        = {23--30},
  publisher    = {ACM},
  title        = {Do better IR tools improve the accuracy of engineers’ traceability recovery?},
  url          = {http://dx.doi.org/10.1145/2070821.2070825},
  year         = {2011},
}