Comparative Evaluation of Object Detection Models in Colonoscopy
PRENDIN, LAURA
2024/2025
Abstract
Colonoscopy is one of the most widely used and effective methods for the prevention and early detection of colorectal cancer, as it allows precancerous polyps to be identified and removed. To support endoscopists during the procedure, Computer-Aided Detection (CADe) systems based on deep learning have been increasingly adopted; they usually rely on object detection models that locate and indicate the position of polyps within the image in real time. From a computer vision perspective, the literature focuses on object detectors from the YOLO family, which are designed to operate in real time and are therefore suitable for intraoperative use. In the post-procedure phase, however, the recorded video can be analyzed with more powerful and complex systems, and the objective shifts from speed to accuracy. This thesis proposes a benchmark for evaluating object detectors that prioritizes accuracy over inference speed. Detectors from the YOLO family (YOLOv7, YOLOv11) and transformer-based models (RT-DETR) are compared through a unified and reproducible pipeline. Overall, the thesis provides a comprehensive and clinically relevant assessment of modern object detection models for polyp detection, offering insights into their strengths, their limitations, and the factors that influence their robustness in realistic colonoscopy scenarios.

| File | Size | Format |
|---|---|---|
| Prendin_Laura.pdf (open access) | 3.24 MB | Adobe PDF |
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/102131