LUP Student Papers

LUND UNIVERSITY LIBRARIES

Predicting Corporate Bankruptcy in Sweden: A Machine Learning Approach Combining Financial and Nonfinancial Data

Pasichnyk, Viktorija LU and Reinders, Fynn Noah LU (2025) DABN01 20251
Department of Economics
Department of Statistics
Abstract
Company bankruptcy can have severe consequences for stakeholders and the broader economy, making early and accurate prediction essential. This thesis investigates the effectiveness of integrating financial and textual data for bankruptcy prediction among Swedish companies. Financial ratios are extracted from structured statements in the Retriever database, while unstructured textual data is retrieved from audit reports. Machine learning models are trained on financial features to classify firms as bankrupt or healthy. To complement this, a Swedish BERT model analyzes the audit reports and generates text-based bankruptcy probability scores. These scores are then combined with the financial data to assess whether textual analysis improves prediction. Results show that models trained on financial data alone reach an accuracy of 0.908, and that including textual information raises the AUC from 0.970 to 0.990, a clear performance improvement. Feature importance is examined using SHAP and permutation importance, revealing that features from textual analysis contribute significantly to predictive accuracy.
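The abstract reports both an accuracy and an AUC, which measure different things: accuracy depends on a decision threshold, while AUC measures how well the scores rank bankrupt firms above healthy ones. The following is a minimal sketch of that distinction and of adding a text-based score to a financial score. Everything in it is an assumption made for illustration: the toy labels and scores are invented, the names `financial_score` and `text_score` are hypothetical, and the simple averaging used to combine the two scores is not taken from the thesis, which does not state its combination method in this record.

```python
from itertools import product


def accuracy(labels, scores, threshold=0.5):
    """Share of firms classified correctly when scores >= threshold mean 'bankrupt'."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a bankrupt firm outscores a healthy one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = bankrupt, 0 = healthy (not results from the thesis).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
financial_score = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.5, 0.2]  # hypothetical model output
text_score = [0.8, 0.7, 0.9, 0.2, 0.3, 0.1, 0.2, 0.4]       # hypothetical text-based score

# Assumed combination rule for illustration only: a plain average of both scores.
combined_score = [(f + t) / 2 for f, t in zip(financial_score, text_score)]

auc_financial = roc_auc(labels, financial_score)
auc_combined = roc_auc(labels, combined_score)
acc_financial = accuracy(labels, financial_score)
acc_combined = accuracy(labels, combined_score)

print(f"financial only: accuracy={acc_financial:.3f}, AUC={auc_financial:.3f}")
print(f"with text:      accuracy={acc_combined:.3f}, AUC={auc_combined:.3f}")
```

In this toy setup the text score corrects the ranking of the one bankrupt firm that the financial score places too low, so both metrics improve. With real data the two metrics need not move together, which is why comparisons are easier to read when the same metric is reported before and after.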
author: Pasichnyk, Viktorija LU and Reinders, Fynn Noah LU
course: DABN01 20251
year: 2025
type: H1 - Master's Degree (One Year)
keywords: Bankruptcy Prediction, Financial Ratios, Machine Learning, Textual Analysis, Audit Reports, Swedish BERT, Conformal Prediction, SHAP, Permutation Importance
language: English
id: 9191987
date added to LUP: 2025-09-12 09:05:12
date last changed: 2025-09-12 09:05:12
@misc{9191987,
  abstract     = {{Company bankruptcy can have severe consequences for stakeholders and the broader economy, making early and accurate prediction essential. This thesis investigates the effectiveness of integrating financial and textual data for bankruptcy prediction among Swedish companies. Financial ratios are extracted from structured statements in the Retriever database, while unstructured textual data is retrieved from audit reports. Machine learning models are trained on financial features to classify firms as bankrupt or healthy. To complement this, a Swedish BERT model analyzes the audit reports and generates text-based bankruptcy probability scores. These scores are then combined with the financial data to assess whether textual analysis improves prediction. Results show that models trained on financial data alone reach an accuracy of 0.908, and that including textual information raises the AUC from 0.970 to 0.990, a clear performance improvement. Feature importance is examined using SHAP and permutation importance, revealing that features from textual analysis contribute significantly to predictive accuracy.}},
  author       = {{Pasichnyk, Viktorija and Reinders, Fynn Noah}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Predicting Corporate Bankruptcy in Sweden: A Machine Learning Approach Combining Financial and Nonfinancial Data}},
  year         = {{2025}},
}