Data Spaces: Building Trust in AI with Blockchain


DAL MAS, GIOVANNI

2024/2025

Abstract

Artificial intelligence is increasingly used in domains requiring trustworthy data, transparency, and auditability. The European Union promotes federated sector-specific Data Spaces to enable sovereign and interoperable data sharing, but integrating artificial intelligence workflows in such environments raises challenges related to data integrity, provenance, and accountability. Blockchain technology has been proposed as a mechanism to strengthen trust in data-driven systems due to its immutability and tamper-evident properties, yet empirical studies evaluating its computational feasibility within practical machine-learning pipelines remain limited. This thesis develops a conceptual framework connecting Data Space trust requirements with blockchain’s provenance capabilities, and implements a reproducible experiment in Google Colab. A synthetic but domain-plausible regenerative-agriculture dataset is combined with several regression models and a lightweight blockchain module that logs hashed artefacts from each step of the pipeline. The results show that blockchain provenance introduces negligible computational overhead, preserves predictive performance, and reliably detects tampering of raw data, preprocessed features, model artefacts, and evaluation metrics. The findings suggest that blockchain-backed provenance can enhance trust and accountability in artificial intelligence workflows deployed in federated data ecosystems such as regenerative-agriculture Data Spaces. The thesis concludes with a discussion of limitations, including the problem of ensuring data truthfulness at the moment of ingestion, and outlines directions for integrating trusted hardware, multi-source validation, and distributed ledger technologies within real-world Data Space deployments.
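The provenance idea described in the abstract (logging a hash of each pipeline artefact in a chained ledger, then checking artefacts against the log) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the class `ProvenanceChain`, its methods, the step names, and the sample artefacts are all hypothetical and chosen only for illustration, and the sketch assumes SHA-256 hashing and a single in-memory chain with no consensus or distribution.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


class ProvenanceChain:
    """Hypothetical append-only log: each block stores an artefact hash
    and the hash of the previous block, so edits break the chain."""

    GENESIS = "0" * 64  # placeholder predecessor hash for the first block

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _block_hash(index, step, artefact_hash, prev_hash):
        # Serialize the block fields deterministically before hashing.
        payload = json.dumps(
            {"index": index, "step": step,
             "artefact_hash": artefact_hash, "prev_hash": prev_hash},
            sort_keys=True,
        ).encode("utf-8")
        return sha256_hex(payload)

    def log(self, step: str, artefact: bytes) -> dict:
        """Append a block recording the hash of one pipeline artefact."""
        index = len(self.blocks)
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        artefact_hash = sha256_hex(artefact)
        block = {
            "index": index,
            "step": step,
            "artefact_hash": artefact_hash,
            "prev_hash": prev_hash,
            "hash": self._block_hash(index, step, artefact_hash, prev_hash),
        }
        self.blocks.append(block)
        return block

    def chain_is_valid(self) -> bool:
        """Recompute every block hash and check the links between blocks."""
        prev_hash = self.GENESIS
        for i, b in enumerate(self.blocks):
            if b["index"] != i or b["prev_hash"] != prev_hash:
                return False
            expected = self._block_hash(
                b["index"], b["step"], b["artefact_hash"], b["prev_hash"])
            if b["hash"] != expected:
                return False
            prev_hash = b["hash"]
        return True

    def artefact_matches(self, index: int, artefact: bytes) -> bool:
        """Check whether an artefact still matches the hash logged for it."""
        return self.blocks[index]["artefact_hash"] == sha256_hex(artefact)


# Made-up placeholder artefacts (not data or results from the thesis).
raw_data = b"plot_id,soil_carbon\n1,2.4\n2,3.1\n"
metrics = json.dumps({"rmse": 0.42}, sort_keys=True).encode("utf-8")

chain = ProvenanceChain()
chain.log("raw_data", raw_data)
chain.log("metrics", metrics)

# Simulate tampering with the raw data after it was logged.
tampered = raw_data.replace(b"2.4", b"9.9")

print(chain.chain_is_valid())
print(chain.artefact_matches(0, raw_data))
print(chain.artefact_matches(0, tampered))
```

Two checks are kept separate in this sketch: `artefact_matches` detects a changed artefact, while `chain_is_valid` detects edits to the log itself. As the abstract notes, neither check can establish that the data were truthful when first ingested.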


	Faculty/Department
	
				Dipartimento di Matematica "Tullio Levi-Civita" - DM
			
	Degree programme
	
				DATA SCIENCE, Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Data Spaces: Building Trust in AI with Blockchain
			
	Keywords
	
				Data Spaces, Blockchain, Trust, Decentralization, Web3
			
	Supervisor
	
				ERSEGHE, TOMASO
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
DalMas_Giovanni.pdf (open access)	939.42 kB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/102104