Analyzing Large Datasets with Mathematical Modeling and Frequency Analysis
(2014) In Master's thesis in Numerical Analysis, FMN820 20142, Mathematics (Faculty of Engineering)
- Abstract
- The aim of this thesis is to analyze large datasets with mathematical modeling and frequency analysis. The models used were polynomials and trigonometric polynomials. These were applied to a sample dataset with 377 data points. Another dataset with approximately 2.6 million data points was also tested with a first-degree polynomial model and with trigonometric polynomials that included different numbers of dominant frequencies. The results recorded were the execution time of each model and the norm of the residuals. These showed that the most accurate model was also the one with the longest execution time.
The conclusions were that trigonometric polynomials with only dominant frequencies included were more accurate than polynomials. They could also be considered suitable as a mathematical model for the given dataset, since they resulted in smaller residuals.
The methods tested in this thesis can be used for applications such as outlier determination and forecasting.
- Popular Abstract
- Introduction
It is often interesting to analyze datasets, which contain data points collected to study the behavior of a certain phenomenon. One way to do this is to model the dataset with mathematical models. With a mathematical model, it is possible to reproduce the dataset with a mathematical function. This function carries some error, since it does not describe the dataset exactly.
When a mathematical model is available, applications such as predictions about the future behavior of the phenomenon are possible. The models need to be chosen first. If polynomials and trigonometric polynomials are chosen to represent the models, then the parameters of the model can be estimated with methods such as Least Squares. Another way to do this is frequency analysis, which yields the coefficients of the trigonometric polynomial used to model the dataset. In this thesis, both methods were tested and compared.
Since many real-life datasets contain millions of data points, computation times become crucial. Therefore, it is also interesting to observe the time needed to compute models for the dataset; this is called the execution time.
Read more under FILES & ACCESS.
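The comparison described in the popular abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the synthetic signal, the function names, and the use of NumPy's FFT to select dominant frequencies are all assumptions made for illustration. Only the sample size of 377 points is taken from the abstract. The sketch fits a first-degree polynomial by least squares, builds a trigonometric model that keeps only the largest frequency components, and records the norm of residuals and the execution time for each.

```python
import time

import numpy as np


def fit_polynomial(t, y, degree):
    """Least-squares polynomial fit; returns the fitted values at t."""
    coeffs = np.polynomial.polynomial.polyfit(t, y, degree)
    return np.polynomial.polynomial.polyval(t, coeffs)


def fit_dominant_frequencies(y, n_freqs):
    """Trigonometric model that keeps the mean plus the n_freqs
    largest-magnitude FFT components; returns the fitted values."""
    spectrum = np.fft.rfft(y)
    magnitudes = np.abs(spectrum)
    magnitudes[0] = 0.0  # the mean is always kept, so exclude it from ranking
    keep = np.argsort(magnitudes)[-n_freqs:]
    truncated = np.zeros_like(spectrum)
    truncated[0] = spectrum[0]
    truncated[keep] = spectrum[keep]
    return np.fft.irfft(truncated, n=len(y))


def timed(fit, *args):
    """Run a fitting function and return (fitted values, seconds elapsed)."""
    start = time.perf_counter()
    fitted = fit(*args)
    return fitted, time.perf_counter() - start


# Synthetic stand-in data (assumption): two periodic components plus noise.
rng = np.random.default_rng(0)
n = 377
t = np.arange(n, dtype=float)
y = (
    2.0
    + 3.0 * np.sin(2 * np.pi * 5 * t / n)
    + 1.5 * np.cos(2 * np.pi * 12 * t / n)
    + 0.1 * rng.standard_normal(n)
)

poly_fit, poly_time = timed(fit_polynomial, t, y, 1)
trig_fit, trig_time = timed(fit_dominant_frequencies, y, 2)

# Norm of residuals: Euclidean distance between the data and the model.
poly_resid = np.linalg.norm(y - poly_fit)
trig_resid = np.linalg.norm(y - trig_fit)

print(f"polynomial:    residual norm {poly_resid:.3f}, time {poly_time:.6f} s")
print(f"trigonometric: residual norm {trig_resid:.3f}, time {trig_time:.6f} s")
```

On data this strongly periodic, the trigonometric model with two dominant frequencies gives the smaller residual norm. The timings from this sketch depend on the machine and say nothing about the execution times reported in the thesis.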
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/4862278
- author
- Jing, Eliza LU
- supervisor
- organization
- alternative title
- Analys av stora datamängder med matematisk modellering och frekvensanalys
- course
- FMN820 20142
- year
- 2014
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Numerical Analysis, Mathematical Modeling, Linear Model, Frequency Analysis, Trigonometric Polynomial, Polynomial, Fourier Series, Least Squares
- publication/series
- Master's thesis in Numerical Analysis
- report number
- LUTFNA-3032-2014
- ISSN
- 1404-6342
- other publication id
- 2014:E61
- language
- English
- id
- 4862278
- date added to LUP
- 2015-02-11 09:40:11
- date last changed
- 2015-12-14 13:32:15
@misc{4862278,
  abstract     = {{The aim of this thesis is to analyze large datasets with mathematical modeling and frequency analysis. The models used were polynomials and trigonometric polynomials. These were applied on a sample dataset with 377 data points. Another dataset with approximately 2.6 million data points was also tested with a model of first degree polynomial and trigonometric polynomials with different number of dominant frequencies included. The results noted were the execution times of each model and the norm of residuals. These showed that the most accurate model was also the one with longest execution time. The conclusions were that trigonometric polynomials with only dominant frequencies included were more accurate than polynomials. These could also be considered to be suitable as a mathematical model for the given dataset, since these resulted in smaller residuals. Applications such as outlier determination and forecasting can be used with the methods tested in this thesis.}},
  author       = {{Jing, Eliza}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's thesis in Numerical Analysis}},
  title        = {{Analyzing Large Datasets with Mathematical Modeling and Frequency Analysis}},
  year         = {{2014}},
}