Vision-Based Monitoring of Post-Consumer Food Waste in a University Canteen

RAZZAGHI, ALI
2025/2026

Abstract

Automated food waste monitoring in institutional catering environments such as university canteens requires robust visual recognition under severe class imbalance, high visual similarity, and partial consumption. This thesis investigates instance segmentation for food waste analysis using tray images collected after meal consumption in a university canteen. The dataset comprises 185 semantic classes, including prepared dishes, raw food items, non-edible objects, and an explicit background category representing unmatched detections. A YOLO-based instance segmentation model is trained and evaluated using standard object detection and segmentation metrics, including mean Average Precision (mAP), alongside class-level Precision, Recall, and F1-score derived from confusion matrices. To reflect different operational assumptions, performance is analyzed under three evaluation modes: strict closed-set, penalized closed-set, and end-to-end evaluation. In addition, a hierarchical evaluation framework is introduced to assess model behavior at multiple semantic resolutions without retraining. Results show strong performance for frequent classes and coarse semantic categories, while performance degrades for rare labels and in detection-aware settings that penalize missed and spurious detections. Hierarchical analysis reveals that many fine-grained errors collapse into correct predictions at higher abstraction levels, indicating that most failures stem from intrinsic visual ambiguity in tray-leftover imagery rather than model instability. These findings highlight both the potential and the limitations of vision-only food waste recognition and motivate frequency-aware and hierarchy-based evaluation for realistic system assessment.
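The abstract reports class-level Precision, Recall, and F1-score derived from confusion matrices. As a minimal sketch of how such per-class metrics are obtained from a confusion matrix (the two-class matrix below is illustrative only, not the thesis's 185-class data):

```python
# Hedged sketch: per-class Precision, Recall, and F1 from a confusion matrix.
# Convention assumed here: cm[i][j] = count of instances whose true class is i
# and whose predicted class is j (rows = ground truth, columns = predictions).

def per_class_metrics(cm):
    """Return {class_index: (precision, recall, f1)} for a square matrix."""
    n = len(cm)
    metrics = {}
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        metrics[c] = (prec, rec, f1)
    return metrics

# Illustrative 2-class example: 5 of 6 class-0 instances correct, 4 of 6 class-1.
cm = [[5, 1],
      [2, 4]]
m = per_class_metrics(cm)  # m[0] ≈ (0.714, 0.833, 0.769)
```

An explicit background column for unmatched detections, as used in the thesis's penalized and end-to-end modes, would add false positives and false negatives to these counts without contributing any true positives.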
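The hierarchical evaluation framework described in the abstract scores predictions at multiple semantic resolutions without retraining, by mapping fine-grained labels to coarser parents before comparison. A minimal sketch of that idea follows; the label names and the parent mapping are hypothetical, not the thesis's actual taxonomy:

```python
# Hedged sketch: collapsing fine-grained predictions to a coarser level before
# scoring. PARENT is a hypothetical fine-to-coarse mapping, not the real taxonomy.

PARENT = {
    "penne_pesto": "pasta",
    "spaghetti_ragu": "pasta",
    "grilled_chicken": "meat",
    "roast_beef": "meat",
}

def collapse(label, parent=PARENT):
    """Map a fine-grained label to its parent category (identity if unmapped)."""
    return parent.get(label, label)

def accuracy(pairs, level_map=None):
    """Fraction of (true, predicted) pairs that agree, optionally after
    collapsing both labels with level_map."""
    if level_map:
        pairs = [(level_map(t), level_map(p)) for t, p in pairs]
    return sum(t == p for t, p in pairs) / len(pairs)

pairs = [("penne_pesto", "spaghetti_ragu"),     # wrong at the fine level
         ("grilled_chicken", "grilled_chicken")]
fine = accuracy(pairs)             # 0.5
coarse = accuracy(pairs, collapse) # 1.0: the pasta-vs-pasta confusion collapses
```

This mirrors the reported finding that many fine-grained errors become correct predictions at higher abstraction levels: confusions between visually similar siblings vanish once both labels share a parent.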
Keywords

Food waste
Computer vision
Post-consumer waste
Image-based analysis
YOLO
Files in this item:

File: RAZZAGHI_ALI.pdf (under embargo until 19/09/2027)
Size: 3.3 MB
Format: Adobe PDF

The text of this website © Università degli Studi di Padova. Full texts are published under a non-exclusive license; metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/105213