Implementation and evaluation of two change detection methods applied to multivariate Gaussian data streams
(2012) FMS820 20121
Mathematical Statistics
 Abstract (Swedish)
 Different types of data streams, time series, are generated in systems and collected in information
bases all over the world. One may have data streams of temperatures, air pollution, share prices, traffic
demand for networks or strength of the emitted radiation from a power plant. There are different
statistical tools to analyse these data streams. One interesting tool for evaluating a data stream is
a change detection method, which is used to learn whether there are any changes in
the stream. Such a method is useful in many areas; one example is the analysis of radiation level data
collected from a power plant. The change detection method will tell if there are any changes in the
radiation level, and if there is an increase it can then be handled manually or automatically.
This master’s thesis analyses and evaluates two change detection methods: the
Overlapping method and the Kdqtree method. The methods are of different types and take different
approaches to finding one or more changes in a data stream. The Overlapping method is
parametric and the Kdqtree method is nonparametric.
The exploratory analysis in this report examines and evaluates the detection performance of the two
methods using several measures: completed detection rate, detection delay, and the additional alarm rate
and false alarm rate, each with mean and standard deviation. The tests are run on a
synthetic data set with a multivariate Gaussian distribution, for both abrupt and linear changes.
From the results in this master’s thesis I draw the conclusion that the Kdqtree method is very
useful because one does not have to know the underlying distribution of the data stream.
To use the Overlapping method for change detection one has to know the underlying
distribution of the data stream, which is rarely known. The Overlapping method is fairly easy to
understand and implement for a new user, while the Kdqtree method requires more knowledge about
the method. Both methods have parameters that have to be predetermined and that strongly affect
the detection performance, but the Overlapping method has only 4 parameters while the
Kdqtree method has 6. The parameters in the Kdqtree method are also strongly dependent
on each other, which makes this method harder to work with. Both methods could be improved by
studying the approaches of other change detection methods.
The aim of this master’s thesis is to obtain interesting research findings that can be used in future
research.
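The evaluation setup described in the abstract (a synthetic multivariate Gaussian stream with an abrupt or linear change, scored by detection delay and alarm counts) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names `make_stream` and `score_alarms`, the identity covariance, the single change point, and the exact definitions of "false alarm" (an alarm before the change) and "additional alarm" (any alarm after the first correct one).

```python
import random


def make_stream(n, dim, change_at, shift, ramp=0, seed=0):
    """Synthetic stream of independent unit-variance Gaussian components.

    The mean is 0 before `change_at`. With ramp == 0 it jumps to `shift`
    (abrupt change); otherwise it rises linearly to `shift` over `ramp` steps.
    Identity covariance is an assumption of this sketch.
    """
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        if t < change_at:
            mean = 0.0
        elif ramp == 0 or t >= change_at + ramp:
            mean = shift
        else:
            mean = shift * (t - change_at) / ramp
        stream.append([rng.gauss(mean, 1.0) for _ in range(dim)])
    return stream


def score_alarms(alarms, change_at):
    """Summarise alarm times against one known change point (assumed metric definitions)."""
    false_alarms = [a for a in alarms if a < change_at]
    true_alarms = sorted(a for a in alarms if a >= change_at)
    detected = bool(true_alarms)
    return {
        "detected": detected,
        "delay": true_alarms[0] - change_at if detected else None,
        "false_alarms": len(false_alarms),
        "additional_alarms": max(len(true_alarms) - 1, 0),
    }


abrupt = make_stream(n=200, dim=3, change_at=100, shift=2.0)
linear = make_stream(n=200, dim=3, change_at=100, shift=2.0, ramp=50)
print(score_alarms([40, 108, 150], change_at=100))
# {'detected': True, 'delay': 8, 'false_alarms': 1, 'additional_alarms': 1}
```

Repeating such a run over many seeds and averaging the counts would give rates with a mean and standard deviation, in the spirit of the measures listed above; the detector that produces the alarm times is deliberately left out of this sketch.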
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/studentpapers/record/2831718
 author
 Petersson, Julia
 supervisor

 Jimmy Olsson (LU)
 organization
 course
 FMS820 20121
 year
 2012
 type
 H2 - Master's Degree (Two Years)
 subject
 keywords
 Change detection, sliding windows, Kullback-Leibler distance, data streams, Bayesian parameter estimation, Kdqtree, Bootstrapping
 language
 English
 id
 2831718
 date added to LUP
 2012-06-21 15:58:14
 date last changed
 2012-06-21 15:58:14
@misc{2831718,
  abstract = {Different types of data streams, time series, are generated in systems and collected in information bases all over the world. One may have data streams of temperatures, air pollution, share prices, traffic demand for networks or strength of the emitted radiation from a power plant. There are different statistical tools to analyse these data streams. One interesting tool for evaluating a data stream is a change detection method, which is used to learn whether there are any changes in the stream. Such a method is useful in many areas; one example is the analysis of radiation level data collected from a power plant. The change detection method will tell if there are any changes in the radiation level, and if there is an increase it can then be handled manually or automatically. This master’s thesis analyses and evaluates two change detection methods: the Overlapping method and the Kdqtree method. The methods are of different types and take different approaches to finding one or more changes in a data stream. The Overlapping method is parametric and the Kdqtree method is nonparametric. The exploratory analysis in this report examines and evaluates the detection performance of the two methods using several measures: completed detection rate, detection delay, and the additional alarm rate and false alarm rate, each with mean and standard deviation. The tests are run on a synthetic data set with a multivariate Gaussian distribution, for both abrupt and linear changes. From the results in this master’s thesis I draw the conclusion that the Kdqtree method is very useful because one does not have to know the underlying distribution of the data stream. To use the Overlapping method for change detection one has to know the underlying distribution of the data stream, which is rarely known. The Overlapping method is fairly easy to understand and implement for a new user, while the Kdqtree method requires more knowledge about the method. Both methods have parameters that have to be predetermined and that strongly affect the detection performance, but the Overlapping method has only 4 parameters while the Kdqtree method has 6. The parameters in the Kdqtree method are also strongly dependent on each other, which makes this method harder to work with. Both methods could be improved by studying the approaches of other change detection methods. The aim of this master’s thesis is to obtain interesting research findings that can be used in future research.},
  author = {Petersson, Julia},
  keyword = {Change detection,sliding windows,Kullback-Leibler distance,data streams,Bayesian parameter estimation,Kdqtree,Bootstrapping},
  language = {eng},
  note = {Student Paper},
  title = {Implementation and evaluation of two change detection methods applied to multivariate Gaussian data streams},
  year = {2012},
}