Customer Segmentation Using Clustering Algorithms: A comparative study for an Italian Fashion Retail Company.

CIFUENTES BOHORQUEZ, SULY VANNESA

2024/2025

Abstract

This thesis aims to identify the most effective clustering algorithm for customer segmentation using the RFM+P (Recency, Frequency, Monetary, Product Variety) model, applied to over 1.5 million transaction records from an Italian retail company. RFM+P evaluates customer engagement through four dimensions: recency (time since the last purchase), frequency (number of transactions), monetary value (total spending), and product variety (range of unique products purchased). Each metric is standardized so that the four dimensions are comparable, and together they give a comprehensive picture of customer behavior. This methodology enables businesses to segment their customer base and tailor marketing strategies, optimizing resource allocation and improving customer retention and lifetime value. The dataset was analyzed using three clustering algorithms: K-Means, Gaussian Mixture Model (GMM), and BIRCH. Visual analysis through 3D and 2D projections showed that K-Means produced well-separated and balanced clusters, making it the most interpretable and actionable method for marketing applications. BIRCH, while computationally efficient, produced a segmentation broadly similar to that of K-Means but with more overlap between clusters, which reduced their distinctiveness and interpretability. GMM failed to form clear customer groups: in the cluster distribution plots, most data points were assigned to a single cluster, which limits its usefulness for targeted marketing. Based on these results, K-Means was chosen as the optimal algorithm, allowing the company to identify high-value frequent buyers, moderately engaged customers, and dormant clients. These insights enabled the development of targeted marketing strategies, including loyalty programs, personalized promotions, and re-engagement campaigns.
Each clustering algorithm has its own strengths, but K-Means performed best on the evaluated dataset, providing well-defined and actionable customer segments that support more effective marketing decisions and business growth.
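
The RFM+P pipeline described in the abstract (per-customer features, standardization, then clustering) can be illustrated with a minimal Python sketch. This is not the thesis implementation, which is not shown on this page: the toy transactions, the snapshot date, the function names, and the fixed choice of initial centroids are all assumptions made for illustration, and the plain Lloyd's K-Means below stands in for a library implementation. GMM and BIRCH are not reproduced here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical toy transactions: (customer_id, purchase_date, amount, product_id).
TRANSACTIONS = [
    ("A", date(2024, 12, 20), 120.0, "coat"),
    ("A", date(2024, 12, 28), 80.0, "scarf"),
    ("A", date(2024, 11, 5), 60.0, "shirt"),
    ("B", date(2024, 12, 15), 150.0, "boots"),
    ("B", date(2024, 12, 27), 90.0, "coat"),
    ("C", date(2024, 3, 2), 25.0, "shirt"),
    ("D", date(2024, 2, 10), 30.0, "shirt"),
]
SNAPSHOT = date(2025, 1, 1)  # assumed reference date for measuring recency


def rfmp_features(transactions, snapshot):
    """Return {customer: [recency_days, frequency, monetary, product_variety]}."""
    by_customer = defaultdict(list)
    for customer, day, amount, product in transactions:
        by_customer[customer].append((day, amount, product))
    features = {}
    for customer, rows in by_customer.items():
        last_purchase = max(day for day, _, _ in rows)
        features[customer] = [
            (snapshot - last_purchase).days,            # recency
            len(rows),                                  # frequency
            sum(amount for _, amount, _ in rows),       # monetary
            len({product for _, _, product in rows}),   # product variety
        ]
    return features


def standardize(features):
    """Z-score each column so that all four dimensions share one scale."""
    customers = sorted(features)
    columns = list(zip(*(features[c] for c in customers)))
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return {
        c: [(x - mu) / sigma for x, (mu, sigma) in zip(features[c], stats)]
        for c in customers
    }


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [mean(col) for col in zip(*members)]
    return labels


raw = rfmp_features(TRANSACTIONS, SNAPSHOT)
scaled = standardize(raw)
customers = sorted(scaled)
points = [scaled[c] for c in customers]
# Deterministic initialisation for this sketch only: seed with customers A and C.
labels = kmeans(points, [scaled["A"], scaled["C"]])
segments = dict(zip(customers, labels))
print(segments)
```

On this toy data the recent, frequent buyers (A and B) end up in one cluster and the dormant customers (C and D) in the other. On a real dataset one would more plausibly rely on library implementations such as scikit-learn's `KMeans`, `GaussianMixture`, and `Birch`, and choose the number of clusters from the data rather than fixing it at two.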

Record

DC record

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				ICT FOR INTERNET AND MULTIMEDIA - INGEGNERIA PER LE COMUNICAZIONI MULTIMEDIALI E INTERNET, Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Customer Segmentation Using Clustering Algorithms: A comparative study for an Italian Fashion Retail Company.
			
	Keywords
	
				Customer Segmentation
				Clustering Algorithm
				RFMP model
			
	Supervisor
	
				ERSEGHE, TOMASO
			
	Appears in document types:
	
				Master's theses

Files in this item:

File	Size	Format
CifuentesBohorquez_SulyVannesa.pdf (restricted access)	2.68 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/82084