Instruction set optimizations for Edge AI computing on a low-area microprocessor

PAFFI, LEONARDO
Academic year 2021/2022

Abstract

In recent years, the field of AI has exploded, leading many institutions, companies, and researchers to invest vast amounts of money in models that predict human behavior, imitate it, or help the user extract information from a dataset that would otherwise remain hidden. Today, the computation that powers AI algorithms is shifting from huge mainframes to tiny embedded devices, “on the edge” with respect to data centers, raising problems in terms of power consumption and speed. This thesis reports and implements several hardware (and software) optimizations, both adapted from the literature and original, with the aim of enabling a proprietary core, designed to operate close to sensors and therefore constrained to a small form factor, to run neural networks efficiently and to reduce the inference time of the most common layers. The result of this research is a set of optimized hardware accelerators that use quantized values to shrink the amount of data and to parallelize computation as much as possible, together with firmware implementations of approximated versions of some well-known functions that avoid heavy computations during inference.
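
The abstract names these techniques only at a summary level. As a purely illustrative sketch (not code from the thesis: the function names, the Q8.8 fixed-point format, and the constants are assumptions made here), the following C fragment shows what two of them can look like in firmware: an int8-quantized dot product, the core operation that parallel MAC hardware accelerates, and a hard-sigmoid-style piecewise-linear approximation that stands in for the exact sigmoid to avoid heavy floating-point computation during inference.

/* Illustrative sketch only; not code from the thesis. */
#include <stdint.h>
#include <stdio.h>

/* int8-quantized dot product with a 32-bit accumulator. In an
 * accelerator, this loop is what gets unrolled across parallel
 * multiply-accumulate (MAC) units. */
static int32_t dot_q8(const int8_t *a, const int8_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}

/* Hard-sigmoid-style approximation clip(x/4 + 0.5, 0, 1) in Q8.8
 * fixed point (real value = raw value / 256). The slope 1/4 is
 * chosen here because dividing by 4 is a single shift; no floating
 * point is needed. */
static int16_t hard_sigmoid_q8_8(int16_t x)
{
    int32_t y = (int32_t)x / 4 + 128;  /* x/4 + 0.5, scaled by 256 */
    if (y < 0)   return 0;             /* saturate to 0.0 */
    if (y > 256) return 256;           /* saturate to 1.0 */
    return (int16_t)y;
}

int main(void)
{
    const int8_t w[4] = { 12, -3, 50, 7 };   /* quantized weights     */
    const int8_t x[4] = { 90, 21, -4, 33 };  /* quantized activations */

    printf("accumulator = %ld\n", (long)dot_q8(w, x, 4));

    /* sigmoid(1.0) is about 0.731; this approximation yields 0.75 */
    printf("approx sigmoid(1.0) = %.2f\n", hard_sigmoid_q8_8(256) / 256.0);
    return 0;
}

Shrinking operands to 8 bits reduces both memory traffic and datapath width, so a 32-bit core or accelerator can process several operands per cycle; this is where the parallelism mentioned in the abstract comes from.
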
Keywords: AI computing, digital electronics, instruction set, AI optimizations, microprocessor

Files in this item:
File: Paffi_Leonardo.pdf (under embargo until 05/06/2024)
Size: 3.52 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/39697