Automatic model selection for time series forecasting: a sales data application

LAUDITI, CHIARA

2024/2025

Abstract

Demand forecasting in retail is increasingly relevant because of rising data availability and the growing expectation that artificial intelligence can make forecasting accessible to non-expert users. This thesis combines a theoretical foundation for time series analysis with a methodology that enables practitioners to generate reliable product-demand forecasts using only historical sales data extracted from invoicing systems. It addresses both the lack of analytical expertise among users and the heterogeneity of real-world time series. The work prioritizes simplicity, interpretability, and computational feasibility, so that the approach can be implemented in environments such as SQL. The proposed approach classifies time series by trend, seasonality, intermittency, and demand-size variability, and uses this classification to compare forecasting models through a comprehensive walk-forward evaluation. The results identify the best-performing techniques for each class and show that no single technique is optimal across all categories: exponential smoothing performs competitively when a trend is present, Croston-type methods are effective for highly intermittent demand, and deep learning models achieve strong accuracy but require excessive computational resources. These findings support the design of an automated model selection pipeline that recommends appropriate forecasting models and provides interpretable reliability indicators for end users. The methodology, validated on sales data from a food and beverage company, offers a reproducible and operational framework that other organizations can adopt to improve forecasting efficiency. Future research could integrate domain knowledge and exogenous factors, where the company can provide them, to further improve forecasting performance and usefulness for decision-making.
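The classification and walk-forward steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the thesis's actual criteria, so everything here is an assumption: the sketch covers only the intermittency and demand-size variability dimensions, using the average inter-demand interval (ADI) and the squared coefficient of variation (CV²) with the commonly cited Syntetos-Boylan cutoffs of 1.32 and 0.49, and it uses simple exponential smoothing as a stand-in model. The function names (`adi_cv2`, `classify`, `ses_forecast`, `walk_forward_mae`) are hypothetical and do not come from the thesis.

```python
from statistics import mean, pvariance


def adi_cv2(demand):
    """Return (ADI, CV^2) for a demand history.

    ADI  = number of periods / number of periods with non-zero demand.
    CV^2 = squared coefficient of variation of the non-zero demand sizes.
    """
    sizes = [d for d in demand if d > 0]
    if not sizes:
        raise ValueError("series has no non-zero demand")
    adi = len(demand) / len(sizes)
    cv2 = pvariance(sizes) / mean(sizes) ** 2
    return adi, cv2


def classify(demand, adi_cut=1.32, cv2_cut=0.49):
    """Assign one of four demand classes (cutoffs are assumed, not the thesis's)."""
    adi, cv2 = adi_cv2(demand)
    if adi < adi_cut:
        return "smooth" if cv2 < cv2_cut else "erratic"
    return "intermittent" if cv2 < cv2_cut else "lumpy"


def ses_forecast(history, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def walk_forward_mae(series, forecaster, min_train=4):
    """Mean absolute error of one-step forecasts on an expanding window.

    At each step t the model sees only series[:t] and predicts series[t].
    """
    errors = []
    for t in range(min_train, len(series)):
        pred = forecaster(series[:t])
        errors.append(abs(series[t] - pred))
    return mean(errors)


if __name__ == "__main__":
    sales = [0, 0, 5, 0, 0, 5, 0, 0, 5]  # made-up illustrative series
    print(classify(sales))
    print(walk_forward_mae(sales, ses_forecast))
```

In a selection pipeline of the kind the abstract describes, the class label would decide which candidate models are evaluated, and the walk-forward error would rank them. Any function that takes a history and returns a one-step forecast can be passed in place of `ses_forecast`.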

	Faculty/Department
	
				Dipartimento di Matematica "Tullio Levi-Civita" - DM
			
	Degree programme
	
				Data Science, Laurea Magistrale (master's degree, D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Automatic model selection for time series forecasting: a sales data application
			
	Keywords
	
				Time series forecast
				State space models
				ARIMA
				Time series taxonomy
			
	Supervisor
	
				GUIDOLIN, MARIANGELA
			
	Appears in categories:
	
				Master's theses

Files in this item:

File	Size	Format
Tesi_Lauditi.pdf (restricted access)	3.31 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/102118