LUP Student Papers

LUND UNIVERSITY LIBRARIES

Filtering False Positive Alarms in JavaDL and Language Experience Report

Rikås, Karl-Oskar LU and Weslien, Frank (2021) In LU-CS-EX EDAM05 20211
Department of Computer Science
Abstract
JavaDL is a domain-specific language (DSL) for writing static program analyses in a declarative logic programming style, based on Datalog. The key feature of this DSL is the ability to pattern-match on literal source code syntax and reason non-locally through declarative programming.

Static program analyses generally suffer from producing false positive alarms. This results in developers having to deal with unnecessary alarms. A machine learning model could mitigate this problem by filtering true alarms from false ones.

We investigate if features based on JavaDL’s pattern-matching are effective. Our results show that they are not, as the knowledge learned does not transfer over to unseen projects.

Points-to analysis is another way of improving the precision of otherwise more conservative analyses, such as finding non-exhaustive switch statements in Java. As the first users of JavaDL, we attempted to write a points-to analysis for a subset of the Java language. We report on our experience and put forth possible improvements to JavaDL in a case study.
Popular Abstract
In an increasingly digitized world, we have become ever more reliant on code.
It exists everywhere.
Not only in our smartphones but also in our microwaves, cars, and airplanes.
In April of 2019, Boeing admitted that their new 737 Max jets had a fatal flaw in the software which had caused two of its planes to crash.
Bugs happen and sometimes with deadly consequences.

One area of research, static program analysis, tries to find bugs in code before it is ever run.
Unfortunately, it is not perfect and a common complaint is that it reports too many bugs that aren't real.
Developers sometimes have to sift through a hundred alerts just to find one actual bug.
Until a few years ago there was no solution in sight.
But now, with the advent of machine learning, there is a promising path forward.

These powerful algorithms don't look at the world as you and I do.
All they understand are numbers.
So to be able to teach an AI to prioritize alarms for developers we first need to transform the code into a format it can understand.
We need to transform code into numbers.

We explore and prototype an algorithm to do just that: transforming code into numbers.
It looks at all the locations where bugs were found and tries to find common patterns that repeat across the source code.
Those insights are then fed into an algorithm that tries to guess which bugs are real and which are not.

Unfortunately, you do not always get the results you hoped for.
Our prototype wasn't good enough.
Perhaps the algorithm was too simplistic, or maybe it's not the right approach.
One possible avenue worth exploring would be to use recent advancements in code embeddings.
This approach lets the algorithm teach itself what is important and how it should be represented, a powerful idea that has improved AI's understanding of text.
author: Rikås, Karl-Oskar LU and Weslien, Frank
course: EDAM05 20211
year: 2021
type: H2 - Master's Degree (Two Years)
keywords: static program analysis, alarm filtering, feature engineering
publication/series: LU-CS-EX
report number: 2021-38
ISSN: 1650-2884
language: English
id: 9060476
date added to LUP: 2021-09-01 14:32:14
date last changed: 2021-09-01 14:32:14
@misc{9060476,
  abstract     = {{JavaDL is a domain-specific language (DSL) for writing static program analyses in a declarative logic programming style, based on Datalog. The key feature of this DSL is the ability to pattern-match on literal source code syntax and reason non-locally through declarative programming.

Static program analyses generally suffer from producing false positive alarms. This results in developers having to deal with unnecessary alarms. A machine learning model could mitigate this problem by filtering true alarms from false ones.

We investigate if features based on JavaDL’s pattern-matching are effective. Our results show that they are not, as the knowledge learned does not transfer over to unseen projects.

Points-to analysis is another way of improving the precision of otherwise more conservative analyses, such as finding non-exhaustive switch statements in Java. As the first users of JavaDL, we attempted to write a points-to analysis for a subset of the Java language. We report on our experience and put forth possible improvements to JavaDL in a case study.}},
  author       = {{Rikås, Karl-Oskar and Weslien, Frank}},
  issn         = {{1650-2884}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LU-CS-EX}},
  title        = {{Filtering False Positive Alarms in JavaDL and Language Experience Report}},
  year         = {{2021}},
}