
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Nonstandard Errors in Bank Default Prediction Using Machine Learning

Svalfors, Emil LU (2024) NEKH01 20241
Department of Economics
Abstract
This thesis analyses the risk of nonstandard errors affecting bank default prediction using machine learning. Nonstandard errors are defined as errors that arise during the Evidence Generating Process (EGP), meaning that they occur as a consequence of decisions made by researchers rather than from sampling. The aim is to analyze how different choices of pre-processing and data engineering methods create variation in the prediction performance of a classifier, which signals the existence of nonstandard errors. This is done by creating 20 different pre-processing and data engineering pipelines, each consisting of a different combination of methods. The variation in performance across the pipelines then gives an estimate of the nonstandard errors. Using recall, precision and ROC-AUC as scoring metrics, this thesis finds that the nonstandard errors are smaller than the standard errors for recall and ROC-AUC. For precision, the nonstandard errors are larger. Overall, the errors were of similar magnitude across the metrics. This thesis concludes that nonstandard errors are likely to affect predictions of bank defaults. The implication is that researchers always need to be aware of nonstandard errors and compare as many parts of the machine learning pipeline as possible.
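The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's actual procedure: it assumes the nonstandard error of a metric is measured as the sample standard deviation of per-pipeline mean scores, and the standard error as the average within-pipeline standard error over cross-validation folds. The function names, pipeline names, and scores below are hypothetical and exist only to make the sketch runnable.

```python
from math import sqrt
from statistics import mean, stdev


def nonstandard_error(pipeline_scores):
    """Assumed definition: spread of a metric ACROSS pipelines,
    taken as the sample standard deviation of per-pipeline mean scores."""
    per_pipeline_means = [mean(folds) for folds in pipeline_scores.values()]
    return stdev(per_pipeline_means)


def mean_standard_error(pipeline_scores):
    """Assumed definition: average WITHIN-pipeline standard error,
    i.e. std of fold scores divided by sqrt(number of folds)."""
    return mean(
        stdev(folds) / sqrt(len(folds)) for folds in pipeline_scores.values()
    )


# Hypothetical recall scores per cross-validation fold for three made-up
# pipelines (illustrative numbers only, not results from the thesis).
toy_recall = {
    "pipeline_a": [0.70, 0.74, 0.72],
    "pipeline_b": [0.60, 0.66, 0.63],
    "pipeline_c": [0.80, 0.82, 0.84],
}

nse = nonstandard_error(toy_recall)
se = mean_standard_error(toy_recall)
print(f"nonstandard error: {nse:.4f}, mean standard error: {se:.4f}")
```

In this toy input the spread across pipelines exceeds the within-pipeline standard error, which is the pattern the abstract reports for precision. For recall and ROC-AUC the abstract reports the opposite ordering.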
author
Svalfors, Emil LU
supervisor
organization
course
NEKH01 20241
year
2024
type
M2 - Bachelor Degree
subject
keywords
Nonstandard Errors, Machine Learning, Outlier Detection, Resampling, Feature Selection
language
English
id
9159134
date added to LUP
2024-06-12 14:11:24
date last changed
2024-06-12 14:11:24
@misc{9159134,
  abstract     = {{This thesis analyses the risk of nonstandard errors affecting bank default prediction using machine learning. Nonstandard errors are defined as errors that arise during the Evidence Generating Process (EGP), meaning that they occur as a consequence of decisions made by researchers rather than from sampling. The aim is to analyze how different choices of pre-processing and data engineering methods create variation in the prediction performance of a classifier, which signals the existence of nonstandard errors. This is done by creating 20 different pre-processing and data engineering pipelines, each consisting of a different combination of methods. The variation in performance across the pipelines then gives an estimate of the nonstandard errors. Using recall, precision and ROC-AUC as scoring metrics, this thesis finds that the nonstandard errors are smaller than the standard errors for recall and ROC-AUC. For precision, the nonstandard errors are larger. Overall, the errors were of similar magnitude across the metrics. This thesis concludes that nonstandard errors are likely to affect predictions of bank defaults. The implication is that researchers always need to be aware of nonstandard errors and compare as many parts of the machine learning pipeline as possible.}},
  author       = {{Svalfors, Emil}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Nonstandard Errors in Bank Default Prediction Using Machine Learning}},
  year         = {{2024}},
}