Measuring Semantic Distances between Software Artifacts to Consolidate Issues from the Development and the Field
(2017) In LU-CS-EX 2017-09 EDA920 20152 Department of Computer Science
- Abstract
- Identifying and keeping track of different structural representations of functionally overlapping issues is important in order to keep a well maintained issue management corpus, establishing efficient and organized response ability
to develop and code software patches repairing these issues and defects. This
is normally achieved by manual, time-costly reviewing-processes by special
teams put up to this task.
In this project we implement a tool using information retrieval technology,
that intends to help these teams make better and faster qualitative assessments
by providing quantitative indications in the form of similarity scores to other
artifacts within a given dataset.
This approach is inspired by a paper with a similar goal, namely detecting
duplicate issue reports. That study found that 60 % of all marked duplicates
could be found with the corresponding implementation of this approach.
Achieving similar outcomes would contribute to improved and more effective
reviewing-processes.
We use the qualitative research method of informal interviews to define the
semantic distance metric to implement. In the evaluation we mainly use a
qualitative method to assess the accuracy of it, but verify our findings with a
quantitative method. We also investigate the scalability of the tool with quantitative
methods.
As a result of the limited scope of this thesis work, the tool in its current
state will have limited use in a live development environment. However, we
conclude that this approach has a development potential and could bring fruitful
findings in the issue management and issue maintenance field if developed
further upon.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8916761
- author
- Nasser, Mahmoud LU
- supervisor
- organization
- course
- EDA920 20152
- year
- 2017
- type
- H3 - Professional qualifications (4 Years - )
- subject
- keywords
- information retrieval technology, semantic distances, issue management, issue maintenance, traceability link retrieval
- publication/series
- LU-CS-EX 2017-09
- report number
- LU-CS-EX 2017-09
- ISSN
- 1650-2884
- language
- English
- id
- 8916761
- date added to LUP
- 2017-06-19 09:36:46
- date last changed
- 2017-06-19 09:36:46
@misc{8916761,
  abstract     = {{Identifying and keeping track of different structural representations of functionally overlapping issues is important in order to keep a well maintained issue management corpus, establishing efficient and organized response ability to develop and code software patches repairing these issues and defects. This is normally achieved by manual, time-costly reviewing-processes by special teams put up to this task. In this project we implement a tool using information retrieval technology, that intends to help these teams make better and faster qualitative assessments by providing quantitative indications in the form of similarity scores to other artifacts within a given dataset. This approach is inspired by a paper with a similar goal, namely detecting duplicate issue reports. That study found that 60 % of all marked duplicates could be found with the corresponding implementation of this approach. Achieving similar outcomes would contribute to improved and more effective reviewing-processes. We use the qualitative research method of informal interviews to define the semantic distance metric to implement. In the evaluation we mainly use a qualitative method to assess the accuracy of it, but verify our findings with a quantitative method. We also investigate the scalability of the tool with quantitative methods. As a result of the limited scope of this thesis work, the tool in its current state will have limited use in a live development environment. However, we conclude that this approach has a development potential and could bring fruitful findings in the issue management and issue maintenance field if developed further upon.}},
  author       = {{Nasser, Mahmoud}},
  issn         = {{1650-2884}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LU-CS-EX 2017-09}},
  title        = {{Measuring Semantic Distances between Software Artifacts to Consolidate Issues from the Development and the Field}},
  year         = {{2017}},
}