Enhancing Default Prediction with Mixed High-Frequency Market Data

TROGU, SOFIA POPE
2023/2024

Abstract

This work, developed in collaboration with Rozes S.r.l., proposes a novel hybrid approach to predicting company bankruptcy in Italy at least two years before the event. Traditional default prediction methods rely primarily on either market-based information (contingent-claim analysis) or financial statement data (accounting-based bankruptcy prediction). Market-based information is often available only for large, publicly held companies, whereas financial statement data can be used for a broader range of firms. More recently, researchers have adopted machine learning models for the binary classification task of determining whether a company will default or remain solvent within a specified horizon. This study extends these default-prediction algorithms by incorporating macroeconomic and commodity pricing data, which provide additional context on a company's financial health and may improve predictive power. To handle mixed-frequency data, which combines low-frequency accounting ratios with high-frequency market data, this work employs the statistical method known as Mixed-Data Sampling (MIDAS). Once the data frequencies are aligned across all features, default prediction is tested using XGBoost, an efficient implementation of gradient boosting. The results suggest that integrating external market data with traditional accounting information significantly enhances the performance and explainability of bankruptcy prediction models. While the effectiveness of machine learning and accounting data for predicting defaults is well documented in the literature, this research highlights a substantial improvement in model performance when market-based data are included. The combined approach thus improves both predictive accuracy and model insight.
Overall, this thesis provides further insight into the flexibility of default prediction models and encourages further development in the field of risk assessment.
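As an illustration of the frequency-alignment step, the following is a minimal sketch of one common MIDAS weighting scheme, the exponential Almon lag, used to collapse twelve monthly observations into a single annual feature that could sit alongside accounting ratios. The abstract does not state which weighting function the thesis uses, so the function names, the parameter values `theta1` and `theta2`, and the monthly index values are all assumptions made for illustration only. In a full MIDAS regression the weight parameters are estimated from data; here they are simply fixed.

```python
import math


def exp_almon_weights(n_lags, theta1, theta2):
    """Normalized exponential Almon weights for lags 1..n_lags (they sum to 1)."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def midas_aggregate(high_freq_values, theta1, theta2):
    """Collapse one low-frequency period of high-frequency data into one feature.

    high_freq_values is ordered oldest to newest; lag 1 is the newest value.
    """
    weights = exp_almon_weights(len(high_freq_values), theta1, theta2)
    most_recent_first = list(reversed(high_freq_values))
    return sum(w * x for w, x in zip(weights, most_recent_first))


# Hypothetical monthly commodity price index for one fiscal year (not real data).
monthly_index = [100.0, 101.5, 103.0, 102.0, 104.5, 106.0,
                 105.0, 107.5, 109.0, 108.0, 110.5, 112.0]

# theta1 = theta2 = 0 gives equal weights, i.e. a plain annual average.
flat = midas_aggregate(monthly_index, 0.0, 0.0)

# A negative theta2 makes the weights decay with the lag, so recent months dominate.
recent = midas_aggregate(monthly_index, 0.0, -0.05)

print(f"equal-weight feature:  {flat:.2f}")
print(f"recent-weight feature: {recent:.2f}")
```

The resulting annual value can be appended as one extra column to a firm-year row of accounting ratios before fitting a classifier. With a tree-based model such as XGBoost, the weight parameters would plausibly be fixed in advance or chosen by validation rather than estimated jointly, but that is a design assumption of this sketch and not something the abstract specifies.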
2023
Keywords: default prediction, machine learning, mixed-data frequency
Files in this item:
File: Data_Science_MsC_Thesis_Sofia_Pope_Trogu.pdf (restricted access)
Size: 1.77 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/71037