En analys av förväntad medellivslängd i världens länder 1996–97 [An analysis of life expectancy in the countries of the world, 1996–97]

Alinaghizadeh Mollasaraie, Hassan (2009)
Department of Statistics
Abstract
Many studies show that socioeconomic and demographic variables are strongly linked to human health, which in turn affects life expectancy and age. This paper presents the results of a principal component regression analysis aimed at finding a good regression model for the mean age of the population in all the countries of the world. The data are from “Philips Geographical Digest” and “World Resources” (Annuals 96–97). The paper does not search for individual explanatory variables but for a multivariate model in which all 16 variables are included at once. Using all variables at the same time creates a multicollinearity problem; therefore, principal component analysis was selected as the main analytical technique, to avoid reducing the data. The results show that a simple regression model with only one principal component as the explanatory variable can be used. This is a general assessment of all the selected variables in one simple model from a statistical point of view, and descriptive statistics (mean, SE and the intercorrelation matrix) are presented where needed to help assess multicollinearity. The analysis starts with classical multiple regression, followed by ridge regression, principal component analysis and an analysis of the correlation structure using “Gabriel’s biplot” to detect the variables that have less impact in this context. Ridge regression was used as the first alternative for avoiding multicollinearity problems, but it did not solve them because all the explanatory variables were strongly correlated. The results of the ridge regression and multiple regression models were compared with those of principal component regression (pc1, pc1-pc4) by means of the determination coefficients R² and R²(pred.). PROC REG and PROC PRINCOMP in SAS version 9.2 were used to analyze the data.
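The workflow named in the abstract (checking multicollinearity, regressing on the leading principal components, and comparing R² with a predictive R²) can be illustrated with a minimal sketch. This is not the paper's SAS code: it uses Python with NumPy, synthetic data in place of the 16 country-level variables, and hypothetical helper names (`vif`, `pcr_fit`). The PRESS-based formula used for R²(pred.) is a common definition and is assumed here, since the abstract does not define it.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

def pcr_fit(X, y, n_components=1):
    """Principal component regression on standardized predictors (illustrative)."""
    # Standardize so the components come from the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Component scores for the retained components, plus an intercept column.
    scores = Z @ eigvecs[:, :n_components]
    A = np.column_stack([np.ones(len(y)), scores])

    # Ordinary least squares of y on the component scores.
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / sst

    # Leverages h_i = a_i' (A'A)^-1 a_i give the leave-one-out (PRESS) residuals.
    h = np.einsum("ij,ij->i", A @ np.linalg.inv(A.T @ A), A)
    press = np.sum((resid / (1.0 - h)) ** 2)
    r2_pred = 1.0 - press / sst

    return {"eigvals": eigvals, "coef": coef, "r2": r2, "r2_pred": r2_pred}

# Synthetic, strongly correlated predictors (an assumption, not the paper's data).
rng = np.random.default_rng(0)
n = 150
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.1 * rng.normal(size=n) for _ in range(5)])
y = 65.0 + 8.0 * latent + rng.normal(size=n)

vifs = vif(X)
model = pcr_fit(X, y, n_components=1)
print("VIF:", np.round(vifs, 1))
print("R2:", round(model["r2"], 3), "R2(pred):", round(model["r2_pred"], 3))
```

In this sketch, large variance inflation factors signal the multicollinearity, and a single component is enough because the synthetic predictors share one latent factor. Whether one component suffices for real data has to be judged from the eigenvalues and the fit statistics.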
@misc{1848827,
  abstract     = {Many studies show that socioeconomic and demographic variables are strongly linked to human health, which in turn affects life expectancy and age. This paper presents the results of a principal component regression analysis aimed at finding a good regression model for the mean age of the population in all the countries of the world. The data are from “Philips Geographical Digest” and “World Resources” (Annuals 96–97). The paper does not search for individual explanatory variables but for a multivariate model in which all 16 variables are included at once. Using all variables at the same time creates a multicollinearity problem; therefore, principal component analysis was selected as the main analytical technique, to avoid reducing the data. The results show that a simple regression model with only one principal component as the explanatory variable can be used. This is a general assessment of all the selected variables in one simple model from a statistical point of view, and descriptive statistics (mean, SE and the intercorrelation matrix) are presented where needed to help assess multicollinearity. The analysis starts with classical multiple regression, followed by ridge regression, principal component analysis and an analysis of the correlation structure using “Gabriel’s biplot” to detect the variables that have less impact in this context. Ridge regression was used as the first alternative for avoiding multicollinearity problems, but it did not solve them because all the explanatory variables were strongly correlated. The results of the ridge regression and multiple regression models were compared with those of principal component regression (pc1, pc1-pc4) by means of the determination coefficients R² and R²(pred.). PROC REG and PROC PRINCOMP in SAS version 9.2 were used to analyze the data.},
  author       = {Alinaghizadeh Mollasaraie, Hassan},
  keyword      = {multipel regression,Multikollinjäritet,Kondition index,Principalkomponentanalys,Ridge regression,Variance Inflation Factor,Predicted residual,Determinationskoefficienter,Statistics, operations research, programming, actuarial mathematics,Statistik, operationsanalys, programmering, aktuariematematik},
  language     = {swe},
  note         = {Student Paper},
  title        = {En analys av förväntad medellivslängd i världens länder 1996–97},
  year         = {2009},
}