
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Analyzing effects of thylakoid diet using multivariate statistical methods

Pejicic, Sasa LU and Andersson, Filip LU (2013) STAH11 20122
Department of Statistics
Abstract
The main purpose of this thesis is to demonstrate how multivariate statistical methods can be used to detect possible differences between two groups of subjects for which multivariate observations have been collected. In particular, by contrasting the results with those based on univariate statistical methods, we show that multivariate methods can be more sensitive at detecting significant differences. Our discussion is based on a data set obtained from a study in which a special diet was administered to humans over an eight-week period. All participants were examined every two weeks, and part of the resulting data set is used in this thesis. The quantitative data from this study is used to test whether health-related variables, such as body fat and blood glucose, differ between a group of people given thylakoids and a control group.

The data is first analyzed using univariate methods and then using a multivariate model. Both univariate and multivariate analyses are performed first on the differences between values at the beginning and at the end of the study, and then on the slopes of regressions fitted to the data from all time points. An initial Principal Component Analysis is also performed to gain better insight into which variables or groups of variables are most responsible for the variability within the samples. Based on the results, some suggestions for further studies are proposed.
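As an illustration of the Principal Component Analysis step, the following is a minimal sketch and not the thesis's own code. It assumes the variables are standardized first (a common choice when measurements such as body fat and blood glucose are on different scales); the function name `pca_standardized` and the synthetic `demo` data are hypothetical.

```python
import numpy as np

def pca_standardized(x):
    """PCA on standardized columns via the singular value decomposition.

    x has shape (n_subjects, n_variables). Returns the loadings (one
    principal direction per row) and the share of variance per component.
    """
    x = np.asarray(x, dtype=float)
    # Standardizing is an assumption; it makes this a correlation-matrix PCA.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt, explained

# Synthetic data for illustration only: 30 subjects, 4 variables,
# with the second variable made to depend on the first.
rng = np.random.default_rng(0)
demo = rng.normal(size=(30, 4))
demo[:, 1] += 2.0 * demo[:, 0]
loadings, explained = pca_standardized(demo)
```

Large absolute entries in the first rows of `loadings` indicate which variables contribute most to the leading components.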

The initial analysis is performed on the difference data (the difference between values at the end and at the beginning of the experiment). The univariate model shows that a couple of the variables differ significantly between the two groups. However, this result does not take the multiple hypothesis testing issue into account. When the Bonferroni correction is applied, as the proper approach would dictate, the results are no longer significant. Analyzing the data in a fully multivariate manner does not produce significant results either.
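The contrast between Bonferroni-corrected univariate tests and a single multivariate test can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the multivariate test, so the two-sample Hotelling's T² test with a pooled covariance matrix is used here as one standard choice, and the function names and synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats

def bonferroni_ttests(x, y, alpha=0.05):
    """Two-sample t-test per variable, judged against alpha / m."""
    _, p = stats.ttest_ind(x, y, axis=0)
    m = x.shape[1]  # number of variables, i.e. number of tests
    return p, p < alpha / m

def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 test, assuming a common covariance matrix."""
    n1, p = x.shape
    n2 = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(pooled, diff))
    # Under the null hypothesis this scaled statistic follows F(p, n1 + n2 - p - 1).
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)
    return t2, f_stat, p_value

# Synthetic groups for illustration only (not the thesis data).
rng = np.random.default_rng(1)
control = rng.normal(size=(20, 3))
treated = rng.normal(size=(20, 3)) + 5.0
p_uni, reject_uni = bonferroni_ttests(control, treated)
t2, f_stat, p_multi = hotelling_t2(control, treated)
```

The multivariate test uses the covariance between variables, which the separate t-tests ignore; this is one reason the two approaches can disagree.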

Subsequently, we extend our analysis by fitting regressions that include all time points and treating the obtained slopes as a new data set. Repeating the same tests on the slope data gives different results. For the transformed data, some of the variables are significant in univariate tests, partially agreeing with our initial findings for the difference data. As before, the significance of these results does not take the multiple testing issue into account, and applying the Bonferroni correction again produces non-significant results. However, in contrast to the result for the difference data, a fully multivariate analysis of the slope variables produces significant results. Thus, accounting for the fully multivariate character of the data improves the sensitivity of detecting significant effects. Consequently, the presented data set illustrates a frequently neglected fact: time-dependent multivariate data should be approached through truly multivariate analysis rather than through methods adapted from a univariate approach. Finally, we observe that, due to the complex dependence structure of the variables and its potential connection with the interpretation of the results, further advanced multivariate analyses such as PCA and Factor Analysis are recommended to fully understand the information carried by the data.
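The slope data described above can be obtained by fitting a straight line to each subject's measurements of each variable over time. The sketch below is a minimal illustration, assuming ordinary least squares and five examinations at weeks 0, 2, 4, 6 and 8 (the abstract states only an eight-week period with examinations every two weeks); the function name `subject_slopes` and the demo values are hypothetical.

```python
import numpy as np

def subject_slopes(data, times):
    """Least-squares slope over time for every subject and variable.

    data has shape (n_subjects, n_times, n_variables).
    Returns an array of shape (n_subjects, n_variables).
    """
    data = np.asarray(data, dtype=float)
    t = np.asarray(times, dtype=float)
    tc = t - t.mean()  # centred time points
    # slope = sum(tc * y) / sum(tc ** 2), applied along the time axis
    return np.tensordot(tc, data, axes=(0, 1)) / (tc @ tc)

# Illustrative, exactly linear data for 3 subjects and 2 variables.
weeks = [0, 2, 4, 6, 8]  # assumed examination schedule
true_slopes = np.array([[0.5, -1.0], [0.0, 2.0], [-0.25, 0.1]])
intercepts = np.array([[70.0, 5.0], [80.0, 5.5], [65.0, 4.8]])
t_axis = np.asarray(weeks, dtype=float)[None, :, None]
demo = intercepts[:, None, :] + true_slopes[:, None, :] * t_axis
slopes = subject_slopes(demo, weeks)
```

Each row of `slopes` then serves as one multivariate observation, so the same univariate and multivariate tests used for the difference data can be applied to it.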
author: Pejicic, Sasa LU and Andersson, Filip LU
organization: Department of Statistics
course: STAH11 20122
year: 2013
type: M2 - Bachelor Degree
language: English
id: 3633938
date added to LUP: 2013-05-23 14:01:01
date last changed: 2013-05-23 14:01:01
@misc{3633938,
  author       = {Pejicic, Sasa and Andersson, Filip},
  language     = {eng},
  note         = {Student Paper},
  title        = {Analyzing effects of thylakoid diet using multivariate statistical methods},
  year         = {2013},
}