
Implementation and evaluation of two change detection-methods applied to multivariate Gaussian data streams

Petersson, Julia (2012) FMS820 20121
Mathematical Statistics
Abstract
Data streams (time series) of many kinds are generated by systems and collected in information
bases all over the world. Examples include temperatures, air pollution levels, share prices, traffic
demand in networks, and the strength of the radiation emitted from a power plant. Several
statistical tools exist for analysing such streams, and one useful tool is a change detection
method, which determines whether a stream has changed. Change detection is useful in many
areas. For example, applied to radiation level data collected from a power plant, it indicates
whether the radiation level has changed, so that an increase can be handled manually or
automatically.
This master’s thesis analyses and evaluates two change detection methods: the Overlapping
method and the Kdq-tree method. The two methods are of different types and take different
approaches to finding one or more changes in a data stream. The Overlapping method is
parametric and the Kdq-tree method is non-parametric.
The exploratory analysis in this report evaluates the detection performance of the two methods
using several measures: completed detection rate, detection delay, additional alarm rate (mean
and standard deviation), and false alarm rate (mean and standard deviation). The tests are run on a
synthetic data set with a multivariate Gaussian distribution, for both abrupt and linear changes.
From the results in this master’s thesis I conclude that the Kdq-tree method is very useful
because one does not have to know the underlying distribution of the data stream.
To use the Overlapping method for change detection one has to know the underlying
distribution of the data stream, which is rarely known. The Overlapping method is fairly easy for a
new user to understand and implement, while the Kdq-tree method requires more knowledge of
the method. Both methods have parameters that must be set in advance and that strongly affect
detection performance, but the Overlapping method has only 4 parameters while the
Kdq-tree method has 6. The parameters of the Kdq-tree method also depend strongly
on each other, which makes this method harder to work with. Both methods could be improved by
studying the approaches used in other change detection methods.
The aim of this master’s thesis is to obtain interesting research findings that can be used in future
research.
author
Petersson, Julia
supervisor
organization
course
FMS820 20121
year
type
H2 - Master's Degree (Two Years)
subject
keywords
Change detection, sliding windows, Kullback-Leibler distance, data streams, Bayesian parameter estimation, Kdq-tree, Bootstrapping
language
English
id
2831718
date added to LUP
2012-06-21 15:58:14
date last changed
2012-06-21 15:58:14
@misc{2831718,
  abstract     = {Data streams (time series) of many kinds are generated by systems and collected in information
bases all over the world. Examples include temperatures, air pollution levels, share prices, traffic
demand in networks, and the strength of the radiation emitted from a power plant. Several
statistical tools exist for analysing such streams, and one useful tool is a change detection
method, which determines whether a stream has changed. Change detection is useful in many
areas. For example, applied to radiation level data collected from a power plant, it indicates
whether the radiation level has changed, so that an increase can be handled manually or
automatically.
This master’s thesis analyses and evaluates two change detection methods: the Overlapping
method and the Kdq-tree method. The two methods are of different types and take different
approaches to finding one or more changes in a data stream. The Overlapping method is
parametric and the Kdq-tree method is non-parametric.
The exploratory analysis in this report evaluates the detection performance of the two methods
using several measures: completed detection rate, detection delay, additional alarm rate (mean
and standard deviation), and false alarm rate (mean and standard deviation). The tests are run on a
synthetic data set with a multivariate Gaussian distribution, for both abrupt and linear changes.
From the results in this master’s thesis I conclude that the Kdq-tree method is very useful
because one does not have to know the underlying distribution of the data stream.
To use the Overlapping method for change detection one has to know the underlying
distribution of the data stream, which is rarely known. The Overlapping method is fairly easy for a
new user to understand and implement, while the Kdq-tree method requires more knowledge of
the method. Both methods have parameters that must be set in advance and that strongly affect
detection performance, but the Overlapping method has only 4 parameters while the
Kdq-tree method has 6. The parameters of the Kdq-tree method also depend strongly
on each other, which makes this method harder to work with. Both methods could be improved by
studying the approaches used in other change detection methods.
The aim of this master’s thesis is to obtain interesting research findings that can be used in future
research.},
  author       = {Petersson, Julia},
  keyword      = {Change detection,sliding windows,Kullback-Leibler distance,data streams,Bayesian parameter estimation,Kdq-tree,Bootstrapping},
  language     = {eng},
  note         = {Student Paper},
  title        = {Implementation and evaluation of two change detection-methods applied to multivariate Gaussian data streams},
  year         = {2012},
}