Physics methods for image classification with Deep Neural Networks

Pompeo, Gianmarco
2021/2022

Abstract

The work presented in this thesis originates from an internship carried out at Porini, a dynamic company specialising in digital consulting and software development. The ultimate goal of this research is to develop an algorithm to perform product recognition of common items found in supermarkets and grocery shops. The first part of the analysis will consider a simplified toy model, in order to gain deeper insight into the data at hand. In particular, a manual feature extraction will be designed, consisting of an equalisation procedure and a custom-built cropping of the images. A novel classification model will then be defined, using average RGB histograms as references for each product class and testing different metrics to quantify the similarity between two images. This implementation will culminate in a proof of concept in the form of an application for mobile platforms. In the second part of the study, object detection and recognition will be tackled in a more general context. This will require the use of more advanced, pre-built algorithms, particularly deep convolutional neural networks. Specifically, the focus will be on the single-shot approach, in which a duly trained detector observes the image only once, as a whole, before outputting its detection predictions; an exploratory analysis will be performed using the YOLO model, a state-of-the-art implementation in the field. The results obtained are very satisfactory: the first part of the study has led to the definition of a new, customized classification algorithm that is robust and well optimized, while in the second part promising foundations have been laid for the development of advanced object-recognition tools for general use cases.
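To make the histogram-based classifier described above more concrete, the sketch below illustrates the general idea under stated assumptions: each class is represented by an average RGB histogram, and a query image is assigned to the class whose reference histogram is most similar. The bin count, the chi-square metric, and all names are illustrative choices, not the thesis implementation.

import numpy as np

N_BINS = 32  # assumed number of bins per colour channel

def rgb_histogram(image: np.ndarray) -> np.ndarray:
    """Concatenated, normalised per-channel histogram of an RGB image of shape (H, W, 3)."""
    hists = [np.histogram(image[..., c], bins=N_BINS, range=(0, 255))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def chi_square(h1: np.ndarray, h2: np.ndarray, eps: float = 1e-10) -> float:
    """Chi-square distance: one possible metric to compare two histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def classify(image: np.ndarray, references: dict) -> str:
    """Assign the class whose reference histogram is closest to the query image's histogram."""
    h = rgb_histogram(image)
    return min(references, key=lambda label: chi_square(h, references[label]))

# Toy usage with random data; in practice each reference would be the average
# histogram over the training images of that product class.
rng = np.random.default_rng(0)
refs = {label: rgb_histogram(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
        for label in ("pasta", "coffee", "biscuits")}
query = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
print(classify(query, refs))

Other metrics (correlation, intersection, Bhattacharyya distance) can be swapped in for chi_square to reproduce the kind of comparison between similarity measures mentioned in the abstract.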
Image classification - Deep Neural Networks - Color analysis - Product recognition - SMEs
Files in this item:
Tesi-GianmarcoPompeo.pdf (open access, 4.39 MB, Adobe PDF)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/21190