Hybrid Temporal Convolutional Networks and Feature Engineering for Retail Demand Forecasting

LA DELFA, NICCOLÒ
2024/2025

Abstract

This thesis investigates short-term retail demand forecasting through a comparative study of two distinct modeling paradigms: deep sequence models and gradient boosting methods. Retail demand is inherently complex, characterized by strong non-linear temporal dependencies, multiple seasonalities, and irregular fluctuations due to promotions and calendar effects. These features pose significant challenges for classical statistical approaches and motivate the exploration of modern machine learning techniques.

The first approach employs a Temporal Convolutional Network (TCN), a deep neural architecture based on dilated causal convolutions and residual connections. The model is designed to capture long-range dependencies and hierarchical temporal structures directly from raw sequences, without extensive manual feature engineering. The second approach uses Extreme Gradient Boosting (XGBoost), a tree-based ensemble method applied to a rich set of handcrafted predictors: lagged variables, rolling statistics, STL-based seasonal-trend decomposition, Fourier expansions for periodic components, and detailed calendar features. By recasting time series forecasting as a supervised regression task, XGBoost exploits structured, domain-informed representations of demand dynamics.

Both models were trained and tuned within a rigorous experimental pipeline. For the TCN, grid search was conducted across loss functions, optimizers, batch sizes, and learning rates. For XGBoost, Bayesian optimization with Optuna was employed to balance predictive accuracy and model complexity. Evaluation relied on walk-forward validation, which ensures a temporally consistent and realistic assessment of predictive performance.

Experimental results show that the two approaches offer complementary strengths. The TCN excels at learning complex sequential dependencies and adapting to non-stationary patterns, while XGBoost achieves strong accuracy with high interpretability through feature importance analysis. Both models achieved low Mean Absolute Percentage Error (MAPE) and remained robust to demand irregularities, though their performance varies with the forecasting horizon and the dominant seasonal effects.

In conclusion, this thesis provides a systematic comparison of deep sequence modeling and feature-based boosting for retail demand forecasting. By analyzing their respective advantages and limitations, it contributes methodological insights and practical guidance for deploying machine learning models in retail supply chain optimization.
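
The walk-forward evaluation and the MAPE metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's pipeline: the expanding-window scheme, the function names, the weekly period, and the seasonal-naive forecaster (a stand-in for the TCN or XGBoost model) are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error in percent; assumes no zero actuals."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def walk_forward_splits(
    n: int, initial_train: int, horizon: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges with an expanding training window.

    Every test block lies strictly after its training block, so no future
    observation leaks into model fitting.
    """
    start = initial_train
    while start + horizon <= n:
        yield range(0, start), range(start, start + horizon)
        start += horizon


def seasonal_naive(
    history: Sequence[float], horizon: int, period: int = 7
) -> List[float]:
    """Hypothetical placeholder model: repeat the last observed season."""
    base = len(history) - period
    return [history[base + (h % period)] for h in range(horizon)]


def walk_forward_mape(
    series: Sequence[float],
    initial_train: int,
    horizon: int,
    forecaster: Callable[[Sequence[float], int], List[float]] = seasonal_naive,
) -> List[float]:
    """Return one MAPE score per walk-forward fold."""
    scores = []
    for train_idx, test_idx in walk_forward_splits(
        len(series), initial_train, horizon
    ):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        scores.append(mape(actual, forecaster(history, horizon)))
    return scores


if __name__ == "__main__":
    # Synthetic, perfectly weekly-periodic demand (illustrative data only).
    weekly = [100.0, 120.0, 90.0, 110.0, 150.0, 200.0, 180.0]
    demand = weekly * 5
    print(walk_forward_mape(demand, initial_train=14, horizon=7))
```

Any model can be passed in place of the placeholder forecaster, as long as it is refit or conditioned only on the history available at each fold.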
2024
Files in this item:
File: La_Delfa_Niccolò.pdf (open access)
Size: 1.09 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/100374