Lund University Publications

Selecting explanatory variables with the modified version of the Bayesian information criterion

Bogdan, Malgorzata LU ; Ghosh, Jayanta K. and Zak-Szatkowska, Małgorzata (2008) In Quality and Reliability Engineering International 24(6). p.627-641
Abstract

We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is a lot of evidence that in this case classical model selection criteria, such as the Akaike information criterion or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed the modified version of BIC (mBIC), which enables the incorporation of prior knowledge on a number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and its rank version, which is designed to deal with data that contain some outlying observations.

author
Bogdan, Malgorzata ; Ghosh, Jayanta K. and Zak-Szatkowska, Małgorzata
publishing date
2008
type
Contribution to journal
publication status
published
subject
keywords
Bayes oracle, Data mining, Model selection, Multiple regression, Multiple testing
in
Quality and Reliability Engineering International
volume
24
issue
6
pages
15 pages
publisher
John Wiley & Sons Inc.
external identifiers
  • scopus:54949098146
ISSN
0748-8017
DOI
10.1002/qre.936
language
English
LU publication?
no
id
98d1dffc-5021-44aa-8b51-9f8826fcf038
date added to LUP
2023-12-08 09:23:22
date last changed
2023-12-11 10:31:46
@article{98d1dffc-5021-44aa-8b51-9f8826fcf038,
  abstract     = {{We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is a lot of evidence that in this case classical model selection criteria, such as the Akaike information criterion or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed the modified version of BIC (mBIC), which enables the incorporation of prior knowledge on a number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and its rank version, which is designed to deal with data that contain some outlying observations.}},
  author       = {{Bogdan, Malgorzata and Ghosh, Jayanta K. and Zak-Szatkowska, Małgorzata}},
  issn         = {{0748-8017}},
  keywords     = {{Bayes oracle; Data mining; Model selection; Multiple regression; Multiple testing}},
  language     = {{eng}},
  number       = {{6}},
  pages        = {{627--641}},
  publisher    = {{John Wiley \& Sons Inc.}},
  journal      = {{Quality and Reliability Engineering International}},
  title        = {{Selecting explanatory variables with the modified version of the Bayesian information criterion}},
  url          = {{http://dx.doi.org/10.1002/qre.936}},
  doi          = {{10.1002/qre.936}},
  volume       = {{24}},
  year         = {{2008}},
}