Multi-Task Automated Classification of Imaging Acquisition Labels from DICOM Metadata using Large Language Models
VIAN, BEATRICE
2025/2026
Abstract
Medical images are stored as DICOM files whose header metadata describe acquisition characteristics and scanner-specific parameters. While these metadata are fundamental for downstream data analysis in the clinical field, especially in radiology, they are often incomplete and inconsistently populated, partly because of differing vendor conventions. This variability can hinder the interpretation of imaging semantics. This project investigated the feasibility of predicting and standardizing MRI and PET acquisition labels directly from heterogeneous textual DICOM header metadata using Large Language Models (LLMs). It focused on the automatic classification of attributes related to image formation: sequence and contrast type, acquisition plane, modality, PET tracer, manufacturer, device model, and the anatomical region imaged. LLMs are well suited to this task because they can robustly parse varied textual inputs, handling both partially structured and free-text DICOM attributes, and map them to a standardized, controlled label space. The study comprised a series of experiments testing how well both general-purpose GPT-based models and the open-source LLaMA 3 model could classify these metadata. First, an out-of-the-box (OOB) approach established a baseline. Fine-tuning was then performed for both the GPT and LLaMA models to improve results by supervising the learning process with domain-specific insights. While GPT models demonstrated strong zero-shot and fine-tuned accuracy via the OpenAI API platform, LLaMA achieved better results in supervised classification settings. The results highlighted the impact of manufacturer-specific nomenclature and the importance of both structured and free-text DICOM fields for the models' ability to extract the correct information.
Finally, the work explored an extension of the framework using the multimodal MedGemma model in a supervised learning setting, integrating imaging pixel data with textual metadata. Overall, this thesis demonstrated that LLMs can efficiently learn relevant imaging semantics from DICOM metadata alone, preserving privacy and providing a scalable approach to metadata enrichment in radiology workflows.
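To make the classification task concrete, the following is a minimal illustrative sketch, not the thesis's actual pipeline: it shows how heterogeneous free-text DICOM header fields (e.g. `SeriesDescription`, `ProtocolName`) might be mapped to a small controlled label space with hand-written rules, and how the same fields could be formatted into a prompt for an LLM classifier. All field values, keyword rules, and label names here are hypothetical examples.

```python
# Illustrative sketch (assumed, not the thesis's actual method): a rule-based
# baseline mapping vendor-specific DICOM header text to a controlled sequence
# label, of the kind an LLM classifier is meant to generalize beyond.

CONTROLLED_SEQUENCES = {"T1", "T2", "FLAIR", "DWI", "UNKNOWN"}

# Hypothetical vendor-specific series-description tokens -> controlled label.
SEQUENCE_RULES = [
    (("mprage", "bravo", "tfl3d"), "T1"),
    (("flair",), "FLAIR"),            # check FLAIR before T2: names often contain both
    (("t2", "tse"), "T2"),
    (("dwi", "diffusion", "ep2d_diff"), "DWI"),
]

def classify_sequence(header: dict) -> str:
    """Return a controlled sequence label from free-text DICOM header fields."""
    text = " ".join(
        str(header.get(tag, "")) for tag in ("SeriesDescription", "ProtocolName")
    ).lower()
    for keywords, label in SEQUENCE_RULES:
        if any(k in text for k in keywords):
            return label
    return "UNKNOWN"

def build_prompt(header: dict) -> str:
    """Format selected header fields into a prompt for an LLM classifier."""
    fields = ("Modality", "Manufacturer", "SeriesDescription", "ProtocolName")
    lines = [f"{t}: {header.get(t, '<missing>')}" for t in fields]
    return (
        f"Classify the acquisition sequence as one of {sorted(CONTROLLED_SEQUENCES)}.\n"
        + "\n".join(lines)
    )
```

A rule table like this breaks down precisely where the thesis situates the problem: vendor conventions multiply faster than keywords can be enumerated, which motivates delegating the mapping to a model that parses the free text directly.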
| File | Access | Size | Format |
|---|---|---|---|
| Vian_Beatrice.pdf | open access | 6.31 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/107655