Measuring Semantic Distances between Software Artifacts to Consolidate Issues from the Development and the Field

Nasser, Mahmoud LU (2017) In LU-CS-EX 2017-09 EDA920 20152
Department of Computer Science
Abstract
Identifying and keeping track of different structural representations of functionally overlapping issues is important for keeping a well-maintained issue management corpus and for establishing an efficient, organized ability
to develop software patches that repair these issues and defects. This
is normally achieved through manual, time-consuming review processes carried out by
teams dedicated to this task.

In this project we implement a tool based on information retrieval technology
that is intended to help these teams make better and faster qualitative assessments
by providing quantitative indications in the form of similarity scores to other
artifacts within a given dataset.

This approach is inspired by a paper with a similar goal, namely detecting
duplicate issue reports. That study found that 60 % of all marked duplicates
could be found with the corresponding implementation of this approach.
Achieving similar outcomes would contribute to improved and more effective
reviewing-processes.

We use the qualitative research method of informal interviews to define the
semantic distance metric to implement. In the evaluation we mainly use a
qualitative method to assess its accuracy, but verify our findings with a
quantitative method. We also investigate the scalability of the tool with quantitative
methods.

As a result of the limited scope of this thesis work, the tool in its current
state will have limited use in a live development environment. However, we
conclude that this approach has development potential and could bring fruitful
findings in the issue management and issue maintenance field if developed
further.
author: Nasser, Mahmoud LU
course: EDA920 20152
year: 2017
type: H3 - Professional qualifications (4 Years - )
keywords: information retrieval technology, semantic distances, issue management, issue maintenance, traceability link retrieval
publication/series: LU-CS-EX 2017-09
report number: LU-CS-EX 2017-09
ISSN: 1650-2884
language: English
id: 8916761
date added to LUP: 2017-06-19 09:36:46
date last changed: 2017-06-19 09:36:46
@misc{8916761,
  abstract     = {Identifying and keeping track of different structural representations of functionally overlapping issues is important in order to keep a well maintained issue management corpus, establishing efficient and organized response ability
to develop and code software patches repairing these issues and defects. This
is normally achieved by manual, time-costly reviewing-processes by special
teams put up to this task.

In this project we implement a tool using information retrieval technology,
that intends to help these teams make better and faster qualitative assessments
by providing quantitative indications in the form of similarity scores to other
artifacts within a given dataset.

This approach is inspired by a paper with a similar goal, namely detecting
duplicate issue reports. That study found that 60 % of all marked duplicates
could be found with the corresponding implementation of this approach.
Achieving similar outcomes would contribute to improved and more effective
reviewing-processes.

We use the qualitative research method of informal interviews to define the
semantic distance metric to implement. In the evaluation we mainly use a
qualitative method to assess the accuracy of it, but verify our findings with a
quantitative method. We also investigate the scalability of the tool with quantitative
methods.

As a result of the limited scope of this thesis work, the tool in its current
state will have limited use in a live development environment. However, we
conclude that this approach has a development potential and could bring fruitful
findings in the issue management and issue maintenance field if developed
further upon.},
  author       = {Nasser, Mahmoud},
  issn         = {1650-2884},
  keyword      = {information retrieval technology,semantic distances,issue management,issue maintenance,traceability link retrieval},
  language     = {eng},
  note         = {Student Paper},
  series       = {LU-CS-EX 2017-09},
  title        = {Measuring Semantic Distances between Software Artifacts to Consolidate Issues from the Development and the Field},
  year         = {2017},
}