Data Efficient AI for Reduced Order Models: Meta-Learning the Optimal Training Dataset

Giaroli, Alex
2024/2025

Abstract

Machine learning (ML) models often lack physical intuition and depend on large datasets, yet no systematic, scalable framework exists for assembling more informative training data. This thesis addresses that gap by proposing a closed-loop Artificial Intelligence (AI) approach that couples learning with optimization on top of existing AI modules to build reduced-order models (ROMs) through targeted data selection. The pipeline optimizes the performance of ML algorithms while reducing the time needed for data collection, thereby improving data efficiency. This improvement translates directly into lower experimental costs, shorter development timelines, and reduced wear on physical test systems. To strengthen interpretability and trust, the framework integrates explainability studies that leverage a large language model. In a practical application with real-world vehicle data, this enabled the identification of eight critical clusters of maneuvers, reducing the required testing time from four hours to twenty minutes. Evaluated across challenging real-world and synthetic domains, including vehicle and electric-motor dynamics, the closed-loop optimizer consistently outperformed unguided random-sampling baselines: over guided iterations it converged to a smaller training dataset that produced a more accurate model than the average of thousands of unguided random attempts. This efficient search for informative data enables high-fidelity surrogate models to be built with over 90% less training data. These results establish the integrated approach as a robust, scalable solution for developing data-efficient, trustworthy ROMs for complex engineering systems, and one that extends to non-ROM applications.
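The closed-loop data-selection idea can be made concrete with a short sketch. The loop below is a minimal illustration under stated assumptions, not the thesis implementation: the candidate pool, the random-forest surrogate, and the ensemble-disagreement criterion used to rank candidates are all choices introduced here for clarity.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Hypothetical pool of candidate experiments (e.g., maneuvers):
    # inputs X, measured outputs y. Real data would replace this.
    rng = np.random.default_rng(0)
    X_pool = rng.uniform(-1.0, 1.0, size=(2000, 4))
    y_pool = np.sin(3 * X_pool[:, 0]) + X_pool[:, 1] ** 2 + 0.1 * rng.normal(size=2000)

    # Start from a small random seed set, then grow it in a guided loop.
    selected = list(rng.choice(len(X_pool), size=20, replace=False))

    for step in range(10):
        surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
        surrogate.fit(X_pool[selected], y_pool[selected])

        # Rank unselected candidates by ensemble disagreement, a simple
        # stand-in for the optimizer's informativeness criterion.
        remaining = np.setdiff1d(np.arange(len(X_pool)), selected)
        per_tree = np.stack([t.predict(X_pool[remaining]) for t in surrogate.estimators_])
        uncertainty = per_tree.std(axis=0)

        # Closed loop: add the most informative candidates and retrain.
        selected.extend(remaining[np.argsort(uncertainty)[-20:]])

    print(f"selected {len(selected)} of {len(X_pool)} candidate experiments")

Each iteration retrains the surrogate on the data chosen so far and asks it where it is least certain, which is one simple way to realize the "guided iterations" the abstract contrasts with unguided random sampling. The maneuver-clustering step can be sketched in the same spirit; only the number of clusters (eight) comes from the abstract, while the features and data below are placeholders.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Placeholder per-maneuver feature vectors (e.g., summary statistics
    # of logged signals); real features would come from the vehicle data.
    rng = np.random.default_rng(1)
    features = rng.normal(size=(500, 6))

    scaled = StandardScaler().fit_transform(features)
    km = KMeans(n_clusters=8, n_init=10, random_state=0)
    labels = km.fit_predict(scaled)

    # Keep one representative maneuver per cluster (the member closest
    # to its centroid), yielding a short test schedule that still spans
    # the observed operating conditions.
    representatives = []
    for c in range(8):
        members = np.flatnonzero(labels == c)
        dists = np.linalg.norm(scaled[members] - km.cluster_centers_[c], axis=1)
        representatives.append(int(members[np.argmin(dists)]))
    print("representative maneuvers:", representatives)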
Keywords: Data Efficiency; Explainability; Reduced Order Models; Machine Learning; AI
File in this product: Giaroli_Alex.pdf (Adobe PDF, 770.96 kB, restricted access)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/98954