Accelerating Deep Learning Workloads: a Comparative Study of GPUs, TPUs, FPGAs, and ASICs

MAGHSOUD, AMIRKHASHAYAR
2024/2025

Abstract

Deep learning has revolutionized fields such as computer vision, natural language processing, and autonomous systems. These powerful models, however, especially convolutional neural networks, demand substantial computational resources to perform effectively. This thesis explores the role of hardware accelerators, namely GPUs, TPUs, FPGAs, and ASICs, in enhancing the performance of deep learning models. GPUs offer excellent parallel computing power and are widely used across many applications, but they suffer from high power consumption and cost. FPGAs strike a balance between performance and power consumption, and their reconfigurability makes them suitable for many deep learning applications. TPUs, developed specifically for tensor operations, deliver significant performance improvements on deep learning workloads but lack the flexibility that many applications require. ASICs, designed for a single task, provide unrivaled performance and energy efficiency but are the most limited in flexibility. After surveying these accelerators, with a focus on their architectural features, benefits, and performance metrics across different applications, the manuscript presents a critical analysis of the various solutions.
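The comparison above rests on performance metrics such as achieved arithmetic throughput. As a minimal, hardware-agnostic sketch of how such a metric can be measured (using NumPy on the CPU for illustration; a GPU or TPU baseline would use the corresponding vendor library), one can time a dense matrix multiply and convert the runtime into GFLOP/s. The function name and matrix size here are illustrative choices, not part of the thesis methodology.

```python
import time
import numpy as np

def matmul_gflops(n: int = 1024, repeats: int = 5) -> float:
    """Measure achieved throughput (GFLOP/s) of an n x n matmul.

    A dense matrix multiply performs roughly 2 * n**3 floating-point
    operations, so timing it yields a simple, device-agnostic metric
    that can be compared across CPUs, GPUs, TPUs, FPGAs, and ASICs.
    """
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    a @ b  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    elapsed = (time.perf_counter() - start) / repeats
    return (2 * n**3) / elapsed / 1e9

print(f"{matmul_gflops():.1f} GFLOP/s")
```

The same measurement, repeated per device, gives a first-order throughput comparison; energy efficiency (FLOP/s per watt) additionally requires power instrumentation.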
Keywords: machine learning, hardware accelerator, deep learning
Files in this item: Maghsoud_Amirkhashayar.pdf (restricted access), 3.77 MB, Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/82742