Patterns in live performance data
(2016) In LU-CS-EX 2016-37, EDA920 20161, Department of Computer Science
- Abstract
- This thesis explores methods to find features that affect the start time of mobile
apps. To help app developers improve performance over patch cycles, we
implemented and tested an alarm pipeline in Apache Spark. This implementation
is able to detect notable changes in the start-time distribution and alert
the developer. Spark is a scalable cluster computing framework that proved
well suited to the given problem.
The program consists of five steps, including preprocessing the data, fitting a Gaussian
mixture model (GMM) to the data, and finding differences in the distributions.
Then a linear regression model is fit to map the available features
to the GMM parametrization. The linear regression enables the developer to
find relationships between the available features and the parametrization by
analyzing the regression weights.
We tested the alarm pipeline on two data sets and showed that it could identify
statistically significant changes in the start-time distribution.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8892591
- author
- Svensson, Simon LU
- supervisor
- organization
- course
- EDA920 20161
- year
- 2016
- type
- H3 - Professional qualifications (4 Years - )
- subject
- keywords
- Android, Linear regression, Gaussian mixture models, Clustering, Performance data, Sony Mobile
- publication/series
- LU-CS-EX 2016-37
- report number
- LU-CS-EX 2016-37
- ISSN
- 1650-2884
- language
- English
- id
- 8892591
- date added to LUP
- 2016-09-28 11:12:48
- date last changed
- 2016-09-28 11:12:48
@misc{8892591,
  abstract = {{This thesis explores methods to find features that affect the start time of mobile apps. To help app developers improve performance over patch cycles, we implemented and tested an alarm pipeline in Apache Spark. This implementation is able to detect notable changes in the start-time distribution and alert the developer. Spark is a scalable cluster computing framework that proved well suited to the given problem. The program consists of five steps, including preprocessing the data, fitting a Gaussian mixture model (GMM) to the data, and finding differences in the distributions. Then a linear regression model is fit to map the available features to the GMM parametrization. The linear regression enables the developer to find relationships between the available features and the parametrization by analyzing the regression weights. We tested the alarm pipeline on two data sets and showed that it could identify statistically significant changes in the start-time distribution.}},
  author = {{Svensson, Simon}},
  issn = {{1650-2884}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{LU-CS-EX 2016-37}},
  title = {{Patterns in live performance data}},
  year = {{2016}},
}