JFeature : Know Your Corpus
(2022) 22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022 p.236-241
- Abstract
Software corpora are crucial for evaluating research artifacts and ensuring repeatability of outcomes. Corpora such as DaCapo and Defects4J provide a collection of real-world open-source projects for evaluating the robustness and performance of software tools like static analysers. However, what do we know about these corpora? What do we know about their composition? Are they really suited for our particular problem? We developed JFEATURE, an extensible static analysis tool that extracts syntactic and semantic features from Java programs, to assist developers in answering these questions. We demonstrate the potential of JFEATURE by applying it to four widely-used corpora in the program analysis area, and we suggest other applications, including longitudinal studies of individual Java projects and the creation of new corpora.
- author
- Riouak, Idriss LU ; Hedin, Gorel LU ; Reichenbach, Christoph LU and Fors, Niklas LU
- publishing date
- 2022
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- keywords
- Software Corpora, Software Tools, Source-Code Analysis
- host publication
- Proceedings - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
- pages
- 6 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 22nd IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2022
- conference location
- Limassol, Cyprus
- conference dates
- 2022-10-03 - 2022-10-04
- external identifiers
- scopus:85146926569
- ISBN
- 9781665496094
- DOI
- 10.1109/SCAM55253.2022.00033
- project
- Explainable Declarative Programming Analysis
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 2022 IEEE.
- id
- 46b14055-34d5-44a4-84ea-2c900866f1eb
- date added to LUP
- 2023-02-13 11:07:56
- date last changed
- 2023-11-21 15:44:25
@inproceedings{46b14055-34d5-44a4-84ea-2c900866f1eb,
  abstract  = {{Software corpora are crucial for evaluating research artifacts and ensuring repeatability of outcomes. Corpora such as DaCapo and Defects4J provide a collection of real-world open-source projects for evaluating the robustness and performance of software tools like static analysers. However, what do we know about these corpora? What do we know about their composition? Are they really suited for our particular problem? We developed JFEATURE, an extensible static analysis tool that extracts syntactic and semantic features from Java programs, to assist developers in answering these questions. We demonstrate the potential of JFEATURE by applying it to four widely-used corpora in the program analysis area, and we suggest other applications, including longitudinal studies of individual Java projects and the creation of new corpora.}},
  author    = {{Riouak, Idriss and Hedin, Gorel and Reichenbach, Christoph and Fors, Niklas}},
  booktitle = {{Proceedings - 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation, SCAM 2022}},
  isbn      = {{9781665496094}},
  keywords  = {{Software Corpora; Software Tools; Source-Code Analysis}},
  language  = {{eng}},
  pages     = {{236--241}},
  publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title     = {{JFeature : Know Your Corpus}},
  url       = {{http://dx.doi.org/10.1109/SCAM55253.2022.00033}},
  doi       = {{10.1109/SCAM55253.2022.00033}},
  year      = {{2022}},
}