A Feature Engineering-Centric Approach to Machine Learning for Anti-Money Laundering Systems

Financial fraud detection has been extensively explored using machine learning techniques, particularly in areas such as credit card fraud, insurance fraud, and money laundering. However, each form of fraud presents distinct challenges, especially from a financial and regulatory perspective. This thesis critically reviews the current literature, identifying key limitations and pitfalls through the lens of financial domain requirements. These include the need to properly handle temporal fraud patterns, address class imbalance in datasets, and ensure transparency and explainability in the models used, all of which are essential for compliance with financial regulations. Focusing on the anti-money laundering (AML) domain, this work proposes a behavioral, feature engineering-centric approach that captures historical transaction patterns (e.g., average amounts, the different currency types used, and transaction frequency). The approach is applied to three publicly available synthetic datasets (SAML-D, IBM HI, and LI small) and is intended to let simpler, interpretable models, such as Decision Tree, Naive Bayes, Logistic Regression, and XGBoost, perform well. Each model is evaluated using multiple performance metrics, including average F1 score, recall, and false positive rate. To better understand the contribution of different behavioral signals, models are trained on both the full set of engineered features and individual feature subsets, which allows an assessment of which types of behavioral features most effectively support model learning. The core objective is to demonstrate that domain-specific feature engineering, designed to replicate the graph-like structure of money laundering data, can enable simple, interpretable models to match or even outperform more complex and non-transparent state-of-the-art architectures such as Graph Neural Networks (GNNs) and Transformers.
The results show that XGBoost, when trained on the complete set of features, outperforms current GNN-based and some Transformer-based approaches across all evaluated datasets. Furthermore, each model is evaluated across the different types of money laundering fraud provided in each dataset, showing that XGBoost trained on the full feature set is able to detect the different structures of each fraud type. Similarly, Decision Tree models achieve competitive results, reinforcing the viability of interpretable models when guided by carefully designed, domain-informed features. Overall, this thesis demonstrates that behavior-driven, transparent machine learning approaches offer an effective and regulation-aligned alternative to black-box models in AML systems. While the findings are promising, future work should explore scalability to larger datasets, graph-based feature augmentation, and deeper integration of explainable AI techniques.
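To make the notion of behavioral features concrete, the following is a minimal sketch of how historical transaction patterns of the kind mentioned above (average amount, currency types used, transaction frequency) might be derived per sender. It is an illustration under stated assumptions, not the feature pipeline used in this thesis: the record fields (`sender`, `timestamp`, `amount`, `currency`), the function name `behavioral_features`, and the output feature names are all hypothetical. Each feature is computed only from a sender's earlier transactions, which is one simple way to respect temporal ordering.

```python
from collections import defaultdict


def behavioral_features(transactions):
    """Return one feature dict per transaction, built from the sender's PRIOR history.

    Hypothetical schema: each transaction is a dict with the keys
    'sender', 'timestamp', 'amount', and 'currency'.
    """
    # Running per-sender state: number of past transactions, their total
    # amount, and the set of currencies seen so far.
    history = defaultdict(lambda: {"count": 0, "total": 0.0, "currencies": set()})
    rows = []

    # Process in time order so that no feature looks into the future.
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        h = history[tx["sender"]]
        rows.append({
            "sender": tx["sender"],
            "timestamp": tx["timestamp"],
            # Transaction frequency so far (count of earlier transactions).
            "prior_tx_count": h["count"],
            # Average amount of earlier transactions (0.0 if there are none).
            "prior_avg_amount": h["total"] / h["count"] if h["count"] else 0.0,
            # Number of distinct currencies used earlier.
            "prior_currency_count": len(h["currencies"]),
            # Whether this transaction uses a currency new to the sender.
            "new_currency": tx["currency"] not in h["currencies"],
        })
        # Update the state only after the features have been emitted.
        h["count"] += 1
        h["total"] += tx["amount"]
        h["currencies"].add(tx["currency"])
    return rows


# Toy data, invented purely for illustration.
transactions = [
    {"sender": "A", "timestamp": 1, "amount": 100.0, "currency": "USD"},
    {"sender": "A", "timestamp": 2, "amount": 300.0, "currency": "EUR"},
    {"sender": "B", "timestamp": 3, "amount": 50.0, "currency": "USD"},
    {"sender": "A", "timestamp": 4, "amount": 200.0, "currency": "USD"},
]
features = behavioral_features(transactions)
```

The resulting rows are plain tabular features, so they could be passed to any of the interpretable models named above. Subsets of the columns (for example, only the amount-based or only the currency-based ones) could be selected to mirror the feature-subset comparison described in the abstract.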
