Prediction of football players’ position using Data Mining and Machine Learning techniques
GOBBO, ALBERTO
2022/2023
Abstract
This thesis applies Data Mining and Machine Learning techniques to predict the best position of footballers on the pitch. The dataset was built from the attributes of professional football players available in the FIFA22 videogame data. Since the position held by a footballer on the pitch takes several distinct levels, classification methods were used for data analysis and prediction. Data Mining techniques, including Multinomial Logistic Regression, Discriminant Analysis, and regularization methods such as Ridge Regression and Lasso, were adopted to discover relationships between the predictors. Machine Learning techniques, used mainly for prediction purposes, include Decision Tree, Random Forest, k-Nearest Neighbour, Naive Bayes, and Support Vector Machine. In addition, a reduction in the number of response variable classes is considered to check for possible improvements in the best Data Mining and Machine Learning techniques found. A comparison of the methods in terms of performance, accuracy of the results, and computational cost concludes the analysis.

File | Size | Format |
---|---|---|
Gobbo_Alberto.pdf (open access) | 6.69 MB | Adobe PDF |
The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/43111