Patterns in live performance data

Svensson, Simon LU (2016) In LU-CS-EX 2016-37 EDA920 20161
Department of Computer Science
Abstract
This thesis explores methods to find features that affect the start time of mobile
apps. To help app developers improve performance over patch cycles, we
implemented and tested an alarm pipeline in Apache Spark. This implementation
is able to detect notable changes in the start-time distribution and alert
the developer. Spark is a scalable cluster computing framework that proved
well suited for the given problem.
The program consists of five steps, including preprocessing the data, fitting a
Gaussian mixture model (GMM) to the data, and finding differences in the
distributions. A linear regression model is then fitted to map the available
features to the GMM parametrization. The linear regression enables the developer
to find relationships between the available features and the parametrization by
analyzing the regression weights.
We tested the alarm pipeline on two data sets and showed that it could identify
statistically significant changes in the start-time distribution.
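
The GMM-fitting and distribution-comparison steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis uses Apache Spark, while the sketch below is plain Python with a hand-written EM loop. The function names (`fit_gmm_1d`, `mean_shifts`, `sample_start_times`), the two-component "warm start / cold start" setup, and all numeric values are hypothetical and chosen only for illustration. The significance test and the linear regression step are not shown.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, iterations=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    ordered = sorted(data)
    n = len(ordered)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(ordered) / n
    overall_var = sum((x - overall_mean) ** 2 for x in ordered) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in ordered:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # avoid division by zero on underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, ordered)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, ordered)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return sorted(zip(weights, means, variances), key=lambda c: c[1])


def mean_shifts(before, after):
    """Per-component change in mean between two fitted mixtures."""
    return [a[1] - b[1] for b, a in zip(before, after)]


def sample_start_times(rng, warm_mean, cold_mean, n_warm=300, n_cold=200):
    """Synthetic start times in milliseconds (hypothetical values)."""
    warm = [rng.gauss(warm_mean, 30.0) for _ in range(n_warm)]
    cold = [rng.gauss(cold_mean, 50.0) for _ in range(n_cold)]
    return warm + cold


rng = random.Random(0)
# Synthetic data: the slower component moves from 800 ms to 950 ms after a patch.
before = fit_gmm_1d(sample_start_times(rng, 300.0, 800.0))
after = fit_gmm_1d(sample_start_times(rng, 300.0, 950.0))
shifts = mean_shifts(before, after)
for (w, m, v), shift in zip(after, shifts):
    print(f"weight={w:.2f} mean={m:.1f} ms std={math.sqrt(v):.1f} ms shift={shift:+.1f} ms")
```

In this sketch a large shift in one component's mean would be the signal that triggers an alert. A real pipeline would also need a statistical test to decide whether the shift is significant.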
author
Svensson, Simon LU
supervisor
organization
course
EDA920 20161
year
type
H3 - Professional qualifications (4 Years - )
subject
keywords
Android, Linear regression, Gaussian mixture models, Clustering, Performance data, Sony Mobile
publication/series
LU-CS-EX 2016-37
report number
LU-CS-EX 2016-37
ISSN
1650-2884
language
English
id
8892591
date added to LUP
2016-09-28 11:12:48
date last changed
2016-09-28 11:12:48
@misc{8892591,
  abstract     = {This thesis explores methods to find features that affect the start time of mobile
apps. To help app developers improve performance over patch cycles, we
implemented and tested an alarm pipeline in Apache Spark. This implementation
is able to detect notable changes in the start-time distribution and alert
the developer. Spark is a scalable cluster computing framework that proved
well suited for the given problem.
The program consists of five steps, including preprocessing the data, fitting a
Gaussian mixture model (GMM) to the data, and finding differences in the
distributions. A linear regression model is then fitted to map the available
features to the GMM parametrization. The linear regression enables the developer
to find relationships between the available features and the parametrization by
analyzing the regression weights.
We tested the alarm pipeline on two data sets and showed that it could identify
statistically significant changes in the start-time distribution.},
  author       = {Svensson, Simon},
  issn         = {1650-2884},
  keyword      = {Android,Linear regression,Gaussian mixture models,Clustering,Performance data,Sony Mobile},
  language     = {eng},
  note         = {Student Paper},
  series       = {LU-CS-EX 2016-37},
  title        = {Patterns in live performance data},
  year         = {2016},
}