On the Challenges of Software Performance Optimization with Statistical Methods

Couderc, Noric

On the Challenges of Software Performance Optimization with Statistical Methods

Mark

Couderc, Noric ^LU

(2023)

Abstract: Most recent programming languages, such as Java, Python and Ruby, include a collection framework as part of their standard library (or runtime). The Java Collection Framework provides a number of collection classes, some of which implement the same abstract data type, making them interchangeable. Developers can therefore choose between several functionally equivalent options. Since collections have different performance characteristics, and can be allocated in thousands of programs locations, the choice of collection has an important impact on performance. Unfortunately, programmers often make sub-optimal choices when selecting their collections.

In this thesis, we consider the problem of building automated tools that would help... (More); Most recent programming languages, such as Java, Python and Ruby, include a collection framework as part of their standard library (or runtime). The Java Collection Framework provides a number of collection classes, some of which implement the same abstract data type, making them interchangeable. Developers can therefore choose between several functionally equivalent options. Since collections have different performance characteristics, and can be allocated in thousands of programs locations, the choice of collection has an important impact on performance. Unfortunately, programmers often make sub-optimal choices when selecting their collections.

In this thesis, we consider the problem of building automated tools that would help the programmer choose between different collection implementations. We divide this problem into two sub-problems. First, we need to measure the performance of a collection, and use relevant statistical methods to make meaningful comparisons. Second, we need to predict the performance of a collection with as little benchmarking as possible.

To measure and analyze the performance of Java collections, we identify problems with the established methods, and suggest the need for more appropriate statistical methods, borrowed from Bayesian statistics. We use these statistical methods in a reproduction of two state-of-the-art dynamic collection selection approaches: CoCo and CollectionSwitch. Our Bayesian approach allows us to make sound comparisons between the previously reported results and our own experimental evidence.
We find that we cannot reproduce the original results, and report on possible causes for the discrepancies between our results and theirs.

To predict the performance of a collection, we consider an existing tool called Brainy. Brainy suggests collections to developers for C++ programs, using machine learning. One particularity of Brainy is that it generates its own training data, by synthesizing programs and benchmarking them. As a result Brainy can automatically learn about new collections and new CPU architectures, while other approaches required an expert to teach the system about collection performance. We adapt Brainy to the Java context, and investigate whether Brainy's adaptability also holds for Java. We find that Brainy's benchmark synthesis methods do not apply well to the Java context, as they introduce some significant biases. We propose a new generative model for collection benchmarks and present the challenges that porting Brainy to Java entails.
(Less)

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/285d8a2d-79ea-488f-96dd-dbe768ac5ed4

author

Couderc, Noric ^LU

supervisor

Christoph Reichenbach ^LU
Emma Söderberg ^LU

opponent

Assistant Prof. Costa, Diego, Université du Québec á Montréal, Canada.

organization

publishing date

2023-05-17

type

Thesis

publication status

published

subject

Computer Sciences

keywords

Software Engineering, Bayesian Statistics, Machine Learning, software performance

publisher

Department of Computer Science, Lund University

defense location

Lecture Hall E:A, building E, Ole Römers väg 3, Faculty of Engineering LTH, Lund University, Lund.

defense date

2023-06-15 15:15:00

ISBN

978-91-8039-691-2

978-91-8039-692-9

language

English

LU publication?

yes

id

285d8a2d-79ea-488f-96dd-dbe768ac5ed4

date added to LUP

2023-05-15 10:55:08

date last changed

2025-04-04 13:54:57

@phdthesis{285d8a2d-79ea-488f-96dd-dbe768ac5ed4,
  abstract     = {{Most recent programming languages, such as Java, Python and Ruby, include a collection framework as part of their standard library (or runtime). The Java Collection Framework provides a number of collection classes, some of which implement the same abstract data type, making them interchangeable. Developers can therefore choose between several functionally equivalent options. Since collections have different performance characteristics, and can be allocated in thousands of programs locations, the choice of collection has an important impact on performance. Unfortunately, programmers often make sub-optimal choices when selecting their collections.<br/><br/>In this thesis, we consider the problem of building automated tools that would help the programmer choose between different collection implementations. We divide this problem into two sub-problems. First, we need to measure the performance of a collection, and use relevant statistical methods to make meaningful comparisons. Second, we need to predict the performance of a collection with as little benchmarking as possible.<br/><br/>To measure and analyze the performance of Java collections, we identify problems with the established methods, and suggest the need for more appropriate statistical methods, borrowed from Bayesian statistics. We use these statistical methods in a reproduction of two state-of-the-art dynamic collection selection approaches: CoCo and CollectionSwitch. Our Bayesian approach allows us to make sound comparisons between the previously reported results and our own experimental evidence.<br/>We find that we cannot reproduce the original results, and report on possible causes for the discrepancies between our results and theirs.<br/><br/>To predict the performance of a collection, we consider an existing tool called Brainy. Brainy suggests collections to developers for C++ programs, using machine learning. One particularity of Brainy is that it generates its own training data, by synthesizing programs and benchmarking them. As a result Brainy can automatically learn about new collections and new CPU architectures, while other approaches required an expert to teach the system about collection performance. We adapt Brainy to the Java context, and investigate whether Brainy's adaptability also holds for Java. We find that Brainy's benchmark synthesis methods do not apply well to the Java context, as they introduce some significant biases. We propose a new generative model for collection benchmarks and present the challenges that porting Brainy to Java entails.<br/>}},
  author       = {{Couderc, Noric}},
  isbn         = {{978-91-8039-691-2}},
  keywords     = {{Software Engineering; Bayesian Statistics; Machine Learning; software performance}},
  language     = {{eng}},
  month        = {{05}},
  publisher    = {{Department of Computer Science, Lund University}},
  school       = {{Lund University}},
  title        = {{On the Challenges of Software Performance Optimization with Statistical Methods}},
  url          = {{https://lup.lub.lu.se/search/files/146571394/thesis.pdf}},
  year         = {{2023}},
}

Lund University Publications

LUND UNIVERSITY LIBRARIES

On the Challenges of Software Performance Optimization with Statistical Methods