Hybrid Temporal Convolutional Networks and Feature Engineering for Retail Demand Forecasting

This thesis investigates short-term retail demand forecasting through a comparative study of two modeling paradigms: deep sequence models and gradient boosting methods. Retail demand is characterized by strong non-linear temporal dependencies, multiple seasonalities, and irregular fluctuations due to promotions and calendar effects. These characteristics pose significant challenges for classical statistical approaches and motivate the exploration of modern machine learning techniques.

The first approach employs a Temporal Convolutional Network (TCN), a deep neural architecture based on dilated causal convolutions and residual connections. It is designed to capture long-range dependencies and hierarchical temporal structures directly from raw sequences, without extensive manual feature engineering.

The second approach uses Extreme Gradient Boosting (XGBoost), a tree-based ensemble method, applied to a rich set of handcrafted predictors: lagged variables, rolling statistics, STL-based seasonal-trend decomposition, Fourier expansions for periodic components, and detailed calendar features. By recasting the time series forecasting task as supervised regression, XGBoost exploits structured, domain-informed representations of demand dynamics.

Both models were trained and tuned within a rigorous experimental pipeline. For the TCN, grid search was conducted over loss functions, optimizers, batch sizes, and learning rates. For XGBoost, Bayesian optimization with Optuna was used to balance predictive accuracy and model complexity. Evaluation relied on walk-forward validation, which keeps every test period strictly after its training data and so gives a temporally consistent, realistic assessment of predictive performance.

Experimental results show that the two approaches offer complementary strengths. The TCN excels at learning complex sequential dependencies and adapting to non-stationary patterns, while XGBoost achieves strong accuracy with high interpretability through feature importance analysis. Both models achieved low Mean Absolute Percentage Error (MAPE) and remained robust to demand irregularities, though their performance varied with the forecasting horizon and the dominant seasonal effects.

In conclusion, this thesis provides a systematic comparison of deep sequence modeling and feature-based boosting for retail demand forecasting. By analyzing their respective advantages and limitations, it offers methodological insights and practical guidance for deploying machine learning models in retail supply chain optimization.
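As a rough illustration of the supervised-regression framing and the walk-forward evaluation described in the abstract, the following minimal sketch builds lag, rolling-mean, and Fourier features from a univariate series, generates expanding-window splits, and scores forecasts with MAPE. It is not the thesis pipeline: the function names, the lag set, the window length, the weekly period of 7, and the seasonal-naive forecaster used in place of a fitted TCN or XGBoost model are all assumptions made for this example.

```python
import math


def make_features(series, lags=(1, 7), window=7, period=7, harmonics=1):
    """Turn a univariate series into (X, y) rows for supervised regression.

    The lag set, window, and period are illustrative assumptions.
    """
    start = max(max(lags), window)  # first index with a full history
    X, y = [], []
    for t in range(start, len(series)):
        row = [series[t - lag] for lag in lags]         # lagged values
        row.append(sum(series[t - window:t]) / window)  # rolling mean of past values only
        for k in range(1, harmonics + 1):               # Fourier terms for one seasonal period
            angle = 2 * math.pi * k * t / period
            row += [math.sin(angle), math.cos(angle)]
        X.append(row)
        y.append(series[t])
    return X, y


def walk_forward_splits(n, initial_train, horizon):
    """Yield (train_indices, test_indices) with an expanding training window."""
    end = initial_train
    while end + horizon <= n:
        yield range(0, end), range(end, end + horizon)
        end += horizon


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; assumes no zero actuals."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Toy series with an exact weekly cycle (made-up data, not from the thesis).
series = [10 + (t % 7) for t in range(28)]
X, y = make_features(series)

fold_scores = []
for train_idx, test_idx in walk_forward_splits(len(series), initial_train=14, horizon=7):
    # Placeholder forecaster: seasonal naive (repeat the value seen 7 steps earlier).
    # A real model would be fit on the rows indexed by train_idx only.
    forecast = [series[t - 7] for t in test_idx]
    actual = [series[t] for t in test_idx]
    fold_scores.append(mape(actual, forecast))
```

Every feature in a row uses only values observed before time t, and every test window starts where its training window ends, so no future information leaks into the evaluation. On real retail data, zero-demand periods would need separate handling, since MAPE is undefined when an actual value is zero.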

LA DELFA, NICCOLÒ

2024/2025


Record

	Faculty/Department
	
				Dipartimento di Fisica e Astronomia "Galileo Galilei" - DFA
			
	Degree programme
	
				Physics of Data, Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Hybrid Temporal Convolutional Networks and Feature Engineering for Retail Demand Forecasting
			
	Keywords
	
				Hybrid Temporal Conv
Feature Engineering
Retail Demand Forec
			
	Supervisor
	
				ZANETTI, MARCO
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
La_Delfa_Niccolò.pdf (open access)	1.09 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/100374