Minimum number of data points estimation for supervised ML problems

“How many data points are needed to fit a machine learning model?” This question arises in the planning of almost every data science project. Estimating the appropriate sample size is crucial for time planning, resource allocation, and model quality. A review of the literature shows a lack of effective methods for addressing the problem. In this thesis, we propose an original approach that does not suffer from the main limitations of the other available methods. It is based on a metamodel that predicts the minimum number of data points required from the characteristics of the dataset under consideration. Moreover, we specify which conditions must be met in the planning of a project before the above question can be formulated.

POZZAN, MATTEO

2021/2022

Faculty/Department: Dipartimento di Matematica "Tullio Levi-Civita" - DM
Degree programme: Data Science, Laurea Magistrale (master's degree, D.M. 270/2004)
Academic year: 2021
English title: Minimum number of data points estimation for supervised ML problems
Keywords: Machine Learning; Supervised Learning; Meta Learning; Design of Experiment; Learning Curves
Supervisor: ROSSI, MICHELE
Appears in collections: Lauree magistrali (master's theses)

Files in this item:

File	Size	Format
Matteo_Pozzan.pdf (restricted access)	2.15 MB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/42069