Data Efficient AI for Reduced Order Models: Meta-Learning the Optimal Training Dataset

Giaroli, Alex
2024/2025

Abstract

Machine learning (ML) models often lack physical intuition and depend on large datasets, yet no systematic, scalable framework exists for assembling more informative training data. This thesis addresses that gap by proposing a closed-loop Artificial Intelligence (AI) approach that couples learning with optimization on top of existing AI modules to build reduced-order models (ROMs) through targeted data selection. The pipeline optimizes the performance of ML algorithms while reducing the time needed for data collection, thereby improving data efficiency. This improvement translates directly into lower experimental costs, shorter development timelines, and reduced wear on physical test systems. To strengthen interpretability and trust, the framework integrates explainability studies that leverage a large language model. In a practical application with real-world vehicle data, this enabled the identification of eight critical clusters of maneuvers, reducing the required testing time from four hours to twenty minutes. Evaluated across challenging real-world and synthetic domains, including vehicle and electric-motor dynamics, the closed-loop optimizer consistently outperformed unguided random-sampling baselines: over guided iterations it converged to a smaller training dataset that produced a more accurate model than the average of thousands of unguided random attempts. This efficient search for informative data enables high-fidelity surrogate models to be built with over 90% less training data. These results establish the integrated approach as a robust, scalable solution for developing data-efficient, trustworthy ROMs for complex engineering systems, and one that extends to non-ROM applications.
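The closed-loop data-selection idea can be made concrete with a short sketch. The loop below is a minimal illustration under stated assumptions, not the thesis implementation: the candidate pool, the random-forest surrogate, and the ensemble-disagreement criterion used to rank candidates are all choices introduced here for clarity.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Hypothetical pool of candidate experiments (e.g., maneuvers):
    # inputs X, measured outputs y. Real data would replace this.
    rng = np.random.default_rng(0)
    X_pool = rng.uniform(-1.0, 1.0, size=(2000, 4))
    y_pool = np.sin(3 * X_pool[:, 0]) + X_pool[:, 1] ** 2 + 0.1 * rng.normal(size=2000)

    # Start from a small random seed set, then grow it in a guided loop.
    selected = list(rng.choice(len(X_pool), size=20, replace=False))

    for step in range(10):
        surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
        surrogate.fit(X_pool[selected], y_pool[selected])

        # Rank unselected candidates by ensemble disagreement, a simple
        # stand-in for the optimizer's informativeness criterion.
        remaining = np.setdiff1d(np.arange(len(X_pool)), selected)
        per_tree = np.stack([t.predict(X_pool[remaining]) for t in surrogate.estimators_])
        uncertainty = per_tree.std(axis=0)

        # Closed loop: add the most informative candidates and retrain.
        selected.extend(remaining[np.argsort(uncertainty)[-20:]])

    print(f"selected {len(selected)} of {len(X_pool)} candidate experiments")

Each iteration retrains the surrogate on the data chosen so far and asks it where it is least certain, which is one simple way to realize the "guided iterations" the abstract contrasts with unguided random sampling. The maneuver-clustering step can be sketched in the same spirit; only the number of clusters (eight) comes from the abstract, while the features and data below are placeholders.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Placeholder per-maneuver feature vectors (e.g., summary statistics
    # of logged signals); real features would come from the vehicle data.
    rng = np.random.default_rng(1)
    features = rng.normal(size=(500, 6))

    scaled = StandardScaler().fit_transform(features)
    km = KMeans(n_clusters=8, n_init=10, random_state=0)
    labels = km.fit_predict(scaled)

    # Keep one representative maneuver per cluster (the member closest
    # to its centroid), yielding a short test schedule that still spans
    # the observed operating conditions.
    representatives = []
    for c in range(8):
        members = np.flatnonzero(labels == c)
        dists = np.linalg.norm(scaled[members] - km.cluster_centers_[c], axis=1)
        representatives.append(int(members[np.argmin(dists)]))
    print("representative maneuvers:", representatives)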
Keywords: Data Efficiency; Explainability; Reduced Order Models; Machine Learning; AI
File in this product: Giaroli_Alex.pdf (Adobe PDF, 770.96 kB, restricted access)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/98954