AI Task Allocation and Inference Offloading in Edge–Cloud Systems

AI applications increasingly rely on edge–cloud infrastructures to meet strict latency and freshness requirements. This thesis presents a survey of distributed AI task allocation and inference offloading strategies in edge–cloud networks under timeliness constraints. The analysis is organized by categorizing existing works according to server selection, task allocation, inference awareness, freshness metrics, and centralized versus distributed decision models. The survey highlights common trade-offs between latency, freshness, and system efficiency, and shows that distributed approaches often achieve near-optimal performance under realistic conditions.

GOEL, JAYESH

2025/2026

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree program
	
				INGEGNERIA DELL'INFORMAZIONE, first-level (bachelor's) degree (D.M. 270/2004)
			
	Academic year
	
				2025
			
	English title
	
				AI Task Allocation and Inference Offloading in Edge–Cloud Systems
			
	Keywords
	
				Edge Cloud Computing
				AI Task Allocation
				Age of Information
				Game Theory
				Latency Constraints
			
	Supervisor
	
				BADIA, LEONARDO
			
	Appears in collections:
	
				Bachelor's theses

Files in this item:

File	Size	Format
Goel_Jayesh.pdf (open access)	861.28 kB	Adobe PDF

Website text © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/104325