Lund University Publications

JFeature: Know Your Corpus

Riouak, Idriss; Hedin, Gorel; Reichenbach, Christoph and Fors, Niklas (2022) 22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022, p. 236-241
Abstract

Software corpora are crucial for evaluating research artifacts and ensuring repeatability of outcomes. Corpora such as DaCapo and Defects4J provide a collection of real-world open-source projects for evaluating the robustness and performance of software tools like static analysers. However, what do we know about these corpora? What do we know about their composition? Are they really suited for our particular problem? We developed JFEATURE, an extensible static analysis tool that extracts syntactic and semantic features from Java programs, to assist developers in answering these questions. We demonstrate the potential of JFEATURE by applying it to four widely-used corpora in the program analysis area, and we suggest other applications, including longitudinal studies of individual Java projects and the creation of new corpora.

author
Riouak, Idriss; Hedin, Gorel; Reichenbach, Christoph and Fors, Niklas
publishing date
2022
type
Chapter in Book/Report/Conference proceeding
publication status
published
keywords
Software Corpora, Software Tools, Source-Code Analysis
host publication
Proceedings - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
pages
6 pages
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
conference name
22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
conference location
Limassol, Cyprus
conference dates
2022-10-03 - 2022-10-04
external identifiers
  • scopus:85146926569
ISBN
9781665496094
DOI
10.1109/SCAM55253.2022.00033
project
Explainable Declarative Programming Analysis
language
English
LU publication?
yes
additional info
Publisher Copyright: © 2022 IEEE.
id
46b14055-34d5-44a4-84ea-2c900866f1eb
date added to LUP
2023-02-13 11:07:56
date last changed
2023-11-21 15:44:25
@inproceedings{46b14055-34d5-44a4-84ea-2c900866f1eb,
  abstract     = {{Software corpora are crucial for evaluating research artifacts and ensuring repeatability of outcomes. Corpora such as DaCapo and Defects4J provide a collection of real-world open-source projects for evaluating the robustness and performance of software tools like static analysers. However, what do we know about these corpora? What do we know about their composition? Are they really suited for our particular problem? We developed JFEATURE, an extensible static analysis tool that extracts syntactic and semantic features from Java programs, to assist developers in answering these questions. We demonstrate the potential of JFEATURE by applying it to four widely-used corpora in the program analysis area, and we suggest other applications, including longitudinal studies of individual Java projects and the creation of new corpora.}},
  author       = {{Riouak, Idriss and Hedin, Gorel and Reichenbach, Christoph and Fors, Niklas}},
  booktitle    = {{Proceedings - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation, SCAM 2022}},
  isbn         = {{9781665496094}},
  keywords     = {{Software Corpora; Software Tools; Source-Code Analysis}},
  language     = {{eng}},
  pages        = {{236--241}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{JFeature: Know Your Corpus}},
  url          = {{http://dx.doi.org/10.1109/SCAM55253.2022.00033}},
  doi          = {{10.1109/SCAM55253.2022.00033}},
  year         = {{2022}},
}