Customer Segmentation Using Clustering Algorithms: A comparative study for an Italian Fashion Retail Company.

CIFUENTES BOHORQUEZ, SULY VANNESA
2024/2025

Abstract

This thesis aims to identify the most effective clustering algorithm for customer segmentation under the RFM+P (Recency, Frequency, Monetary, Product Variety) model, applied to more than 1.5 million transaction records from an Italian retail company. RFM+P describes customer engagement along four dimensions: recency (time since the last purchase), frequency (number of transactions), monetary value (total spending), and product variety (range of unique products purchased). Each metric is standardized so that the four dimensions are comparable. Segmenting the customer base in this way lets a business tailor its marketing strategies, allocate resources more efficiently, and improve customer retention and lifetime value. The dataset was analyzed with three clustering algorithms: K-Means, Gaussian Mixture Model (GMM), and BIRCH. Visual analysis of 2D and 3D projections showed that K-Means produced well-separated, balanced clusters, making it the most interpretable and actionable method for marketing applications. BIRCH was computationally efficient and gave a result similar to K-Means, but its clusters overlapped more, which made them less distinct and harder to interpret. GMM failed to form clear customer groups: in the cluster distribution plots, most data points fell into a single cluster, which limits its usefulness for targeted marketing. Based on these results, K-Means was chosen as the optimal algorithm, allowing the company to identify high-value frequent buyers, moderately engaged customers, and dormant clients. These insights supported the development of targeted marketing strategies, including loyalty programs, personalized promotions, and re-engagement campaigns.
This study shows that each clustering algorithm has its own advantages, yet K-Means performed best on the evaluated dataset, yielding well-defined and actionable customer segments that support more effective marketing decisions and business growth.
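As a rough illustration of the feature step described in the abstract, the sketch below derives the four RFM+P values per customer and standardizes them with z-scores. It is not the thesis code: the toy transactions, the tuple layout, the snapshot date, and the helper names `rfmp_features` and `standardize` are all assumptions made for the example, and frequency is approximated as the number of distinct purchase dates.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical toy data: (customer_id, purchase_date, product_id, amount).
# The real schema of the thesis dataset is not given in the abstract.
TRANSACTIONS = [
    ("c1", date(2024, 1, 5), "shirt", 40.0),
    ("c1", date(2024, 3, 1), "jeans", 60.0),
    ("c1", date(2024, 3, 1), "shirt", 40.0),
    ("c2", date(2023, 11, 20), "coat", 150.0),
    ("c3", date(2024, 2, 10), "scarf", 20.0),
    ("c3", date(2024, 2, 25), "hat", 25.0),
]
SNAPSHOT = date(2024, 3, 31)  # assumed reference date for recency


def rfmp_features(transactions, snapshot):
    """Return {customer_id: [recency_days, frequency, monetary, variety]}."""
    last_purchase = {}
    purchase_dates = defaultdict(set)  # assumption: one transaction per date
    spend = defaultdict(float)
    products = defaultdict(set)
    for customer, day, product, amount in transactions:
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        purchase_dates[customer].add(day)
        spend[customer] += amount
        products[customer].add(product)
    return {
        c: [
            (snapshot - last_purchase[c]).days,  # recency
            len(purchase_dates[c]),              # frequency
            spend[c],                            # monetary value
            len(products[c]),                    # product variety
        ]
        for c in last_purchase
    }


def standardize(features):
    """Z-score each column so the four dimensions share one scale."""
    customers = list(features)
    columns = list(zip(*(features[c] for c in customers)))
    scaled_columns = []
    for column in columns:
        mu, sigma = mean(column), pstdev(column)
        # A constant column carries no information; map it to zeros.
        scaled_columns.append(
            [(x - mu) / sigma if sigma else 0.0 for x in column]
        )
    return {
        c: [col[i] for col in scaled_columns]
        for i, c in enumerate(customers)
    }


raw = rfmp_features(TRANSACTIONS, SNAPSHOT)
scaled = standardize(raw)
```

The standardized vectors in `scaled` are the kind of input that could then be passed to clustering implementations such as scikit-learn's `KMeans`, `GaussianMixture`, and `Birch`. The abstract does not say which library or parameters the thesis actually used.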
2024
Customer Segmentation
Clustering Algorithm
RFMP model
Files in this item:
CifuentesBohorquez_SulyVannesa.pdf (restricted access), 2.68 MB, Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/82084