Enhancing Default Prediction with Mixed High-Frequency Market Data
TROGU, SOFIA POPE
2023/2024
Abstract
This work, developed in collaboration with Rozes S.r.l., proposes a novel hybrid approach to predicting company bankruptcy in Italy at least two years before the event. Traditional default prediction methods rely primarily on either market-based information (contingent-claim analysis) or financial statement data (accounting-based bankruptcy prediction). While market-based information is often available only for large, publicly held companies, financial statement data can be used for a broader range of firms. Recently, researchers have adopted machine learning models for the binary classification task of determining whether a company will default or stay solvent within a specified horizon. This study enhances these existing default-prediction algorithms by incorporating macroeconomic and commodity pricing data, which provide additional context for a company's financial health and potentially improve predictive power. To address the challenge of mixed-frequency data (combining low-frequency accounting ratios with high-frequency market data), this work employs the statistical method known as Mixed-Data Sampling (MIDAS). Once data frequencies are aligned across all features, default prediction is tested using an XGBoost model, an efficient implementation of gradient boosting. The results suggest that integrating external market data with traditional accounting information significantly enhances the performance and explainability of bankruptcy prediction models. While the effectiveness of machine learning and accounting data in predicting defaults is well documented in the literature, our research highlights a substantial improvement in model performance through the inclusion of market-based data. This combined approach represents a notable advancement in predictive accuracy and model insight.
Overall, this thesis provides further insight into the flexibility of default prediction models, encouraging future development in the field of risk assessment.

File | Size | Format |
---|---|---|
Data_Science_MsC_Thesis_Sofia_Pope_Trogu.pdf (restricted access) | 1.77 MB | Adobe PDF |
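
The frequency-alignment step named in the abstract (MIDAS) can be illustrated with a minimal sketch. The abstract does not say which lag-weighting scheme the thesis uses, so the exponential Almon weights below are an assumption, chosen because they are a common choice in the MIDAS literature. The function names, the monthly price series, and the fixed `theta1`/`theta2` values are all hypothetical; in a real MIDAS model the weight parameters would normally be estimated rather than set by hand.

```python
import numpy as np


def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon lag weights: positive and normalised to sum to 1."""
    k = np.arange(1, n_lags + 1, dtype=float)
    z = theta1 * k + theta2 * k**2
    z -= z.max()  # shift before exponentiating for numerical stability
    w = np.exp(z)
    return w / w.sum()


def midas_aggregate(high_freq, n_lags, theta1, theta2):
    """Collapse the latest n_lags high-frequency values into one low-frequency value.

    high_freq is ordered oldest to newest; lag 1 is the newest observation.
    """
    x = np.asarray(high_freq, dtype=float)
    if x.size < n_lags:
        raise ValueError("not enough high-frequency observations")
    recent_first = x[::-1][:n_lags]
    return float(recent_first @ exp_almon_weights(n_lags, theta1, theta2))


# Hypothetical monthly commodity price index covering one fiscal year.
monthly_prices = [100, 101, 103, 102, 105, 107, 110, 108, 111, 115, 117, 120]

# theta1 = theta2 = 0 gives equal weights, i.e. a plain yearly average.
flat = midas_aggregate(monthly_prices, 12, 0.0, 0.0)

# A negative theta2 makes the weights decay with the lag, so recent months count more.
decay = midas_aggregate(monthly_prices, 12, 0.0, -0.05)
```

Each aggregated value could then sit next to the annual accounting ratios as one more feature column for a classifier; how the thesis actually builds its feature set is not described in this record.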
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/71037