Predicting Corporate Bankruptcy in Sweden: A Machine Learning Approach Combining Financial and Nonfinancial Data
(2025) DABN01 20251
Department of Economics
Department of Statistics
- Abstract
- Company bankruptcy can have severe consequences for stakeholders and the broader economy, making early and accurate prediction essential. This thesis investigates the effectiveness of integrating financial and textual data for bankruptcy prediction among Swedish companies. Financial ratios are extracted from structured statements in the Retriever database, while unstructured textual data is retrieved from audit reports. Machine learning models are trained on financial features to classify firms as bankrupt or healthy. To complement this, a Swedish BERT model analyzes audit reports to generate textual bankruptcy probability scores. These scores are then combined with the financial data to assess whether textual analysis improves prediction. Results show that financial models alone reach an accuracy of 0.908; when textual information is included, the AUC increases from 0.970 to 0.990, a clear performance improvement. Feature importance is examined using SHAP and permutation importance, revealing that features from textual analysis contribute significantly to predictive accuracy.
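The evaluation idea in the abstract (compare AUC with and without a text-derived score, then measure permutation importance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data is synthetic, `fin_score` and `text_score` are hypothetical stand-ins for the outputs of a financial model and a BERT-based text model, and averaging the two scores is just one simple way to combine them, not necessarily the method used in the thesis.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(score_fn, rows, labels, column, rng, repeats=10):
    """Mean drop in AUC when the values of one feature column are shuffled across rows."""
    base = auc(labels, [score_fn(r) for r in rows])
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{column: v}) for r, v in zip(rows, shuffled)]
        drops.append(base - auc(labels, [score_fn(r) for r in permuted]))
    return sum(drops) / repeats

# Synthetic firms (hypothetical): label 1 = bankrupt, 0 = healthy.
rng = random.Random(0)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(600)]
rows = [
    {
        "fin_score": 0.25 + 0.5 * y + rng.gauss(0, 0.4),   # noisy financial-model score
        "text_score": 0.25 + 0.5 * y + rng.gauss(0, 0.4),  # noisy text-model score
    }
    for y in labels
]

def financial_only(row):
    return row["fin_score"]

def combined(row):
    # Assumed fusion rule: plain average of the two scores.
    return 0.5 * (row["fin_score"] + row["text_score"])

auc_fin = auc(labels, [financial_only(r) for r in rows])
auc_combined = auc(labels, [combined(r) for r in rows])
imp_fin = permutation_importance(combined, rows, labels, "fin_score", rng)
imp_text = permutation_importance(combined, rows, labels, "text_score", rng)

print(f"AUC financial only: {auc_fin:.3f}")
print(f"AUC combined:       {auc_combined:.3f}")
print(f"Permutation importance fin_score:  {imp_fin:.3f}")
print(f"Permutation importance text_score: {imp_text:.3f}")
```

Because the two synthetic scores carry independent noise, averaging them tends to raise the AUC, and shuffling either column tends to lower it. The numbers printed by this sketch say nothing about the thesis's actual results.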
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9191987
- author
- Pasichnyk, Viktorija LU and Reinders, Fynn Noah LU
- supervisor
- organization
- course
- DABN01 20251
- year
- 2025
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- Bankruptcy Prediction, Financial Ratios, Machine Learning, Textual Analysis, Audit Reports, Swedish BERT, Conformal Prediction, SHAP, Permutation Importance
- language
- English
- id
- 9191987
- date added to LUP
- 2025-09-12 09:05:12
- date last changed
- 2025-09-12 09:05:12
@misc{9191987,
  abstract     = {{Company bankruptcy can have severe consequences for stakeholders and the broader economy, making early and accurate prediction essential. This thesis investigates the effectiveness of integrating financial and textual data for bankruptcy prediction among Swedish companies. Financial ratios are extracted from structured statements in the Retriever database, while unstructured textual data is retrieved from audit reports. Machine learning models are trained on financial features to classify firms as bankrupt or healthy. A Swedish BERT model analyzes audit reports to generate textual bankruptcy probability scores to complement this. These are then combined with financial data to assess whether textual analysis improves prediction. Results show that financial models alone reach an accuracy of 0.908, but when textual information is included, the AUC increases from 0.970 to 0.990, demonstrating a clear performance improvement. Feature importance is examined using SHAP and permutation importance, revealing that features from textual analysis contribute significantly to predictive accuracy.}},
  author       = {{Pasichnyk, Viktorija and Reinders, Fynn Noah}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Predicting Corporate Bankruptcy in Sweden: A Machine Learning Approach Combining Financial and Nonfinancial Data}},
  year         = {{2025}},
}