Bankruptcy Prediction Models: A Comparative Analysis
TORKZABAN, MOHAMMADMAHDI
2024/2025
Abstract
This thesis investigates the effectiveness of machine learning models in predicting bankruptcy among Italian companies, using financial ratios derived from balance sheets, income statements, and cash flow statements. The dataset covers 255,919 companies, including 28,360 bankrupt firms, over the period 1996 to 2023, and the study addresses the need for accurate tools for predicting financial distress. Traditional models, namely Logistic Regression, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA), are compared against more advanced methods: Random Forest, XGBoost, LightGBM, CatBoost, and Neural Networks. The research highlights the importance of feature engineering and of metrics such as F1-Score and ROC AUC for evaluating model performance on an imbalanced dataset.

File | Size | Format |
---|---|---|
Thesis.pdf (open access) | 5.24 MB | Adobe PDF |
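
The class imbalance mentioned in the abstract can be made concrete with a small sketch. The two company counts below come from the abstract; everything else (the toy labels, the toy scores, the 0.5 threshold, and the helper names `f1_score` and `roc_auc`) is a hypothetical illustration written in plain Python, not the thesis's actual data or evaluation code.

```python
# Counts taken from the abstract.
N_COMPANIES = 255_919
N_BANKRUPT = 28_360


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = bankrupt): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Share of bankrupt firms, and the accuracy of a model that never predicts bankruptcy.
bankrupt_share = N_BANKRUPT / N_COMPANIES
baseline_accuracy = 1 - bankrupt_share
print(f"bankrupt share:    {bankrupt_share:.1%}")     # 11.1%
print(f"baseline accuracy: {baseline_accuracy:.1%}")  # 88.9%

# Hypothetical toy example: 8 healthy firms, 2 bankrupt firms, made-up model scores.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.25, 0.6, 0.7, 0.35]
y_all_negative = [0] * len(y_true)
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

f1_baseline = f1_score(y_true, y_all_negative)
f1_model = f1_score(y_true, y_pred)
auc_model = roc_auc(y_true, scores)
print(f"F1, always non-bankrupt: {f1_baseline}")  # 0.0
print(f"F1, thresholded scores:  {f1_model}")     # 0.5
print(f"ROC AUC of the scores:   {auc_model}")    # 0.875
```

With roughly 11% of firms in the bankrupt class, a model that labels every firm as healthy reaches about 89% accuracy while its F1 on the bankrupt class is 0. This is why accuracy alone is uninformative here, and why threshold-dependent F1 and threshold-free ROC AUC complement each other.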
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/83089