Learning to Reject within a Bilevel Optimization Framework
SANGUIN, GABRIELE
2023/2024
Abstract
In machine learning, handling uncertainty is critical for ensuring reliable decision-making in complex environments. This thesis explores the concept of learning to reject, focusing on improving confidence estimation and calibration through the application of bilevel optimization, a framework designed to solve hierarchical problems with interdependent optimization levels. We propose two novel bilevel optimization methods to train machine learning models, and we evaluate their effectiveness in refining model confidence and improving calibration. These methods are tested on toy datasets, such as Blobs and Spirals, as well as more practical datasets like Blood Alcohol Concentration (BAC) and MNIST, demonstrating their potential to offer a more meaningful and reliable measure of uncertainty compared to traditional approaches. The experimental results reveal that bilevel methods outperform conventional training in confidence calibration metrics, including Expected Calibration Error (ECE), Negative Log-Likelihood (NLL), and AUROC, though with a slight trade-off in accuracy. Furthermore, the thesis compares bilevel optimization methods with post-calibration techniques, highlighting their advantages in producing trustworthy confidence estimates without the risk of over-adjustment seen in some post-calibration methods. This work lays the groundwork for future research on bilevel optimization in reject-option classification and selective classification, offering a promising step toward creating safer, more robust AI systems capable of understanding their own limits.
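Of the calibration metrics the abstract cites, Expected Calibration Error (ECE) is the most direct to illustrate. The following is a minimal sketch of the standard binned ECE, not the thesis's own implementation; the bin count and the toy inputs are assumptions for illustration. Predictions are grouped into equal-width confidence bins, and the per-bin gap between mean confidence and accuracy is averaged, weighted by bin size:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i in range(n_bins):
        lo, hi = edges[i], edges[i + 1]
        # first bin is closed on the left so a confidence of exactly 0.0 is kept
        mask = (confidences >= lo if i == 0 else confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # bin weight = fraction of samples in bin
    return ece

# Perfectly calibrated toy case: 80% confidence, 8 of 10 predictions correct
conf = np.full(10, 0.8)
corr = np.array([1] * 8 + [0] * 2)
print(expected_calibration_error(conf, corr))  # ~0.0
```

A model reporting 90% confidence with the same 80% accuracy would instead score an ECE of about 0.1, which is the kind of confidence–accuracy mismatch the thesis's bilevel methods aim to reduce.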
File: Sanguin_Data_Science_Dissertation.pdf (open access, 3.04 MB, Adobe PDF)
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/71034