Advancing trace recovery evaluation: Applied information retrieval in a software engineering context

Borg, Markus LU (2012)
Abstract
Successful development of software systems involves efficient navigation among software artifacts. However, as artifacts continuously are produced and modified, engineers are typically plagued by challenging information landscapes. One state-of-practice approach to structure information is to establish trace links between artifacts, a practice that is also enforced by several development standards. Unfortunately, manually maintaining trace links in an evolving system is a tedious task. To tackle this issue, several researchers have proposed treating the capture and recovery of trace links as an Information Retrieval (IR) problem. The goal of this thesis is to contribute to the evaluation of IR-based trace recovery, both by presenting new empirical results and by suggesting how to increase the strength of evidence in future evaluative studies.

This thesis is based on empirical software engineering research. The work contains a Systematic Literature Review (SLR) of previous evaluations of IR-based trace recovery. We show that a majority of previous evaluations have been technology-oriented, conducted in "the cave of IR evaluation", using small datasets as experimental input. Also, software artifacts originating from student projects have frequently been used in evaluations. We conducted a survey among traceability researchers, and found that a majority consider student artifacts to be only partly representative to industrial counterparts. Moreover, few traceability researchers had validated student artifacts for industrial representativeness before using them as experimental input. Our findings call for additional case studies to evaluate IR-based trace recovery within the full complexity of an industrial setting.

Also, this thesis contributes to the body of empirical evidence of IR-based trace recovery in two experiments with industrial software artifacts. The technology-oriented experiment highlights the clear dependence between datasets and the accuracy of IR-based trace recovery, in line with findings from the SLR. The human-oriented experiment investigates how different quality levels of tool output affect the tracing accuracy of engineers. While the results are not conclusive, there are indications that it is worthwhile to investigate further into the actual value of improving tool support for IR-based trace recovery. Finally, we present how tools and methods are evaluated in the general field of IR research, and propose a taxonomy of evaluation contexts tailored for IR-based trace recovery.
author
Borg, Markus
publishing date
2012
type
Thesis
publication status
published
keywords
information retrieval, traceability, software engineering
pages
175 pages
project
EASE
language
English
LU publication?
yes
id
96ae28fe-815d-46ba-8034-ad877f7954fd (old id 3098693)
date added to LUP
2012-09-25 11:48:53
date last changed
2016-09-19 08:45:00
@misc{96ae28fe-815d-46ba-8034-ad877f7954fd,
  abstract     = {Successful development of software systems involves efficient navigation among software artifacts. However, as artifacts continuously are produced and modified, engineers are typically plagued by challenging information landscapes. One state-of-practice approach to structure information is to establish trace links between artifacts, a practice that is also enforced by several development standards. Unfortunately, manually maintaining trace links in an evolving system is a tedious task. To tackle this issue, several researchers have proposed treating the capture and recovery of trace links as an Information Retrieval (IR) problem. The goal of this thesis is to contribute to the evaluation of IR-based trace recovery, both by presenting new empirical results and by suggesting how to increase the strength of evidence in future evaluative studies.

This thesis is based on empirical software engineering research. The work contains a Systematic Literature Review (SLR) of previous evaluations of IR-based trace recovery. We show that a majority of previous evaluations have been technology-oriented, conducted in "the cave of IR evaluation", using small datasets as experimental input. Also, software artifacts originating from student projects have frequently been used in evaluations. We conducted a survey among traceability researchers, and found that a majority consider student artifacts to be only partly representative to industrial counterparts. Moreover, few traceability researchers had validated student artifacts for industrial representativeness before using them as experimental input. Our findings call for additional case studies to evaluate IR-based trace recovery within the full complexity of an industrial setting.

Also, this thesis contributes to the body of empirical evidence of IR-based trace recovery in two experiments with industrial software artifacts. The technology-oriented experiment highlights the clear dependence between datasets and the accuracy of IR-based trace recovery, in line with findings from the SLR. The human-oriented experiment investigates how different quality levels of tool output affect the tracing accuracy of engineers. While the results are not conclusive, there are indications that it is worthwhile to investigate further into the actual value of improving tool support for IR-based trace recovery. Finally, we present how tools and methods are evaluated in the general field of IR research, and propose a taxonomy of evaluation contexts tailored for IR-based trace recovery.},
  author       = {Borg, Markus},
  keyword      = {information retrieval,traceability,software engineering},
  language     = {eng},
  note         = {Licentiate Thesis},
  pages        = {175},
  title        = {Advancing trace recovery evaluation: Applied information retrieval in a software engineering context},
  year         = {2012},
}