LUP Student Papers

LUND UNIVERSITY LIBRARIES

Analyzing Large Datasets with Mathematical Modeling and Frequency Analysis

Jing, Eliza LU (2014) In Master's thesis in Numerical Analysis FMN820 20142
Mathematics (Faculty of Engineering)
Abstract
The aim of this thesis is to analyze large datasets with mathematical modeling and frequency analysis. The models used were polynomials and trigonometric polynomials. These were applied to a sample dataset with 377 data points. A second dataset with approximately 2.6 million data points was also tested, using a first-degree polynomial and trigonometric polynomials with different numbers of dominant frequencies included. The results recorded were the execution time of each model and the norm of the residuals. These showed that the most accurate model was also the one with the longest execution time.

The conclusions were that trigonometric polynomials with only the dominant frequencies included were more accurate than polynomials. They can also be considered suitable mathematical models for the given dataset, since they resulted in smaller residuals.

The methods tested in this thesis can be used in applications such as outlier determination and forecasting.
Popular Abstract
Introduction
It is often interesting to analyze datasets, which contain data points collected to study the behavior of a certain phenomenon. One way to do this is to model the dataset mathematically. With a mathematical model, it is possible to reproduce the dataset with a mathematical function. This function carries some error, since it does not describe the dataset exactly.
When a mathematical model is available, applications such as predictions about the future behavior of the phenomenon are possible. The models need to be chosen first. If polynomials and trigonometric polynomials are chosen as models, then the parameters of the model can be estimated with methods such as Least Squares. Another way to do this is frequency analysis, which yields the coefficients of the trigonometric polynomial used to model the dataset. In this thesis, both methods were tested and compared.
Since many real-life datasets contain millions of data points, computation times become crucial. Therefore, it is also interesting to observe the time needed to compute models for the dataset; this is called the execution time.
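
The comparison described above can be illustrated with a minimal sketch. It is not the thesis implementation: the data are synthetic, the function names `fit_polynomial` and `fit_dominant_frequencies` are hypothetical, and NumPy's FFT is assumed as the frequency analysis tool. The sketch fits a first-degree polynomial by least squares, builds a trigonometric model from the `k` largest-magnitude frequencies, and reports the norm of residuals and the execution time of each.

```python
import time
import numpy as np


def fit_polynomial(t, y, degree=1):
    """Least-squares polynomial fit; returns the fitted values at t."""
    coeffs = np.polyfit(t, y, degree)
    return np.polyval(coeffs, t)


def fit_dominant_frequencies(y, k):
    """Trigonometric model keeping the mean and the k largest-magnitude
    frequencies of the discrete Fourier transform of y."""
    spectrum = np.fft.rfft(y)
    magnitudes = np.abs(spectrum)
    magnitudes[0] = 0.0                 # the mean term is always kept, so do not rank it
    keep = np.argsort(magnitudes)[-k:]  # indices of the k dominant frequencies
    filtered = np.zeros_like(spectrum)
    filtered[0] = spectrum[0]
    filtered[keep] = spectrum[keep]
    return np.fft.irfft(filtered, n=len(y))


# Synthetic data (an assumption for illustration, not the thesis dataset):
# a constant offset, two periodic components, and a little noise.
rng = np.random.default_rng(0)
n = 377
t = np.arange(n)
y = (3.0
     + 2.0 * np.sin(2 * np.pi * 5 * t / n)
     + 0.5 * np.cos(2 * np.pi * 20 * t / n)
     + 0.1 * rng.standard_normal(n))

start = time.perf_counter()
y_poly = fit_polynomial(t, y, degree=1)
time_poly = time.perf_counter() - start

start = time.perf_counter()
y_trig = fit_dominant_frequencies(y, k=2)
time_trig = time.perf_counter() - start

res_poly = np.linalg.norm(y - y_poly)
res_trig = np.linalg.norm(y - y_trig)

print(f"polynomial:    residual norm {res_poly:.3f}, time {time_poly:.6f} s")
print(f"trigonometric: residual norm {res_trig:.3f}, time {time_trig:.6f} s")
```

On periodic data like this, the trigonometric model should leave a much smaller residual than the straight line. The timings depend on the machine and on the dataset size, so they say nothing about the results reported in the thesis.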

Read more under FILES & ACCESS.
Please use this url to cite or link to this publication:
author
Jing, Eliza LU
supervisor
organization
alternative title
Analys av stora datamängder med matematisk modellering och frekvensanalys
course
FMN820 20142
year
type
H2 - Master's Degree (Two Years)
subject
keywords
Numerical Analysis, Mathematical Modeling, Linear Model, Frequency Analysis, Trigonometric Polynomial, Polynomial, Fourier Series, Least Squares
publication/series
Master's thesis in Numerical Analysis
report number
LUTFNA-3032-2014
ISSN
1404-6342
other publication id
2014:E61
language
English
id
4862278
date added to LUP
2015-02-11 09:40:11
date last changed
2015-12-14 13:32:15
@misc{4862278,
  abstract     = {{The aim of this thesis is to analyze large datasets with mathematical modeling and frequency analysis. The models used were polynomials and trigonometric polynomials. These were applied on a sample dataset with 377 data points. Another dataset with approximately 2.6 million data points was also tested with a model of first degree polynomial and trigonometric polynomials with different number of dominant frequencies included. The results noted were the execution times of each model and the norm of residuals. These showed that the most accurate model was also the one with longest execution time.

The conclusions were that trigonometric polynomials with only dominant frequencies included were more accurate than polynomials. These could also be considered to be suitable as a mathematical model for the given dataset, since these resulted in smaller residuals. 

Applications such as outlier determination and forecasting can be used with the methods tested in this thesis.}},
  author       = {{Jing, Eliza}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's thesis in Numerical Analysis}},
  title        = {{Analyzing Large Datasets with Mathematical Modeling and Frequency Analysis}},
  year         = {{2014}},
}