In industrial processes such as injection molding, fast and accurate defect detection is crucial for maintaining product quality and reducing production costs. Many industrial defect-identification setups still rely on classical image processing techniques, which involve numerous thresholds and therefore require extensive expertise to calibrate. Because these values are specific to each component and environment, quality monitoring becomes particularly complex and expensive. This study therefore aims to identify an object detection model for real-time recognition of typical surface defects in injection molding. Using standard metrics, it first compares two algorithms that are currently considered state of the art in object detection but have different architectures: YOLOv8, which uses well-established CNNs, and RT-DETR, which combines the operating principles of CNNs with those of the more recent Vision Transformers. After a dataset covering various types of defects was created, one model was trained with each algorithm. RT-DETR outperformed YOLOv8 on all considered metrics, achieving a mAP50 above 80%. The RT-DETR model was also evaluated on collected videos, where it proved robust: it detected the defect in 100% of cases for every component. Having established which of the two algorithms is better suited to this application, the study then asks whether knowledge can be transferred between different components that present the same type of defect. This possibility is crucial in injection molding, where there can be a great many different components, each requiring time to collect data, annotate it, and train a model. For a selected component, two models were compared: one trained from scratch on that component only, and one pre-trained on another component with the same defect. The comparison showed that the pre-trained model can leverage prior knowledge and achieve good performance with only a few images. Finally, the study addresses the importance of the acquisition tool: does using a tool that captures higher-quality training images necessarily lead to better performance? The results showed that using a DSLR camera capable of taking ultra-high-resolution photos, compared with a simple webcam, does not significantly improve metrics such as precision and recall, but does help increase the model's mAP.
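The abstract reports precision, recall, and mAP50 without defining them. As an illustration only (this is not the evaluation code used in the thesis), the following minimal Python sketch shows how detections are commonly matched to ground-truth boxes at an IoU threshold of 0.5, the threshold behind the "50" in mAP50. The box format `(x_min, y_min, x_max, y_max)` and the function names `iou` and `precision_recall` are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,
) -> Tuple[float, float]:
    """Greedy one-to-one matching of predictions to ground truth.

    Predictions are visited from highest to lowest confidence; each
    ground-truth box can be matched at most once.
    """
    matched = set()
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_thr:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(precision_recall(truths=truths, preds=preds))
```

In this sketch precision and recall are computed at a single confidence cut-off. Average precision summarizes the precision-recall curve over all confidence thresholds for one class, and mAP50 is the mean of that value across defect classes at an IoU threshold of 0.5.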
Comparison of algorithms and methods for defect detection with machine vision on injection-molded components.
BRESOLIN, SEBASTIANO
2023/2024
File | Size | Format |
---|---|---|
Bresolin_Sebastiano.pdf (restricted access) | 3.45 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/66253