The Generali Case: Deploying Data Science for Risk Selection in Life Insurance Underwriting

KACI, FLAVIO
2024/2025

Abstract

This thesis investigates how Data Science can enhance the underwriting process in life insurance through a case study of Dr. Mouse, an AI-based virtual underwriter developed by Generali Italia. Underwriting plays a central role in assessing the risk profile of applicants, particularly for protection products such as Term Life and Long-Term Care. While most applications can be processed through standard business rules, more complex or ambiguous cases, often involving unstructured medical documentation or free-text health disclosures, still require manual review.

The research was conducted during my internship at Generali Italia, where I worked at the intersection of business and data functions. Specifically, I collaborated with the Chief Life Office (including the underwriting and policy issuance teams) and with the Data Office, in the Advanced Analytics Models team of data scientists and engineers, where I now serve as a Data Scientist. This setting provided a unique perspective on both the operational constraints and the technical opportunities in automating risk selection.

Dr. Mouse addresses these manual-review cases by automating the extraction and interpretation of health-related information from scanned medical reports, blood tests, and free-text answers. Its architecture combines OCR, BERT-based document classification, ML- and rule-based data extraction, and a risk prediction model built with XGBoost, with SHAP used for explainability. Fully integrated into Generali’s digital sales platform, the system supports underwriters in making more consistent, transparent, and timely decisions.

The thesis describes the end-to-end pipeline, evaluates model performance, and discusses challenges such as document heterogeneity and limited training data. It also explores ongoing developments involving Generative AI, which are showing promising results on complex documents such as oncological and cardiological reports, which have historically been difficult for traditional approaches. Ultimately, this work illustrates how intelligent automation can extend underwriting capabilities, reduce operational burden, and pave the way for more scalable and customer-friendly insurance processes.

Keywords

Data Science
Insurance
Machine Learning
Risk Selection
Business application

Files in this item:

File: Master_Thesis_Flavio_Kaci_pdfa.pdf
Access: Restricted
Size: 1.56 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/91832