Design, Implementation, and Comparative Assessment of Progressive Quality Decoding Methods for JPEG AI Bitstreams

The number of photos captured worldwide is rapidly increasing, driven by widespread smartphone adoption and the growing role of images in digital communication. To address the challenges of managing this explosion of visual data, next-generation AI-based image coding solutions are emerging. JPEG AI, developed by the Joint Photographic Experts Group (JPEG), aims to deliver a compact, single-bitstream compressed-domain representation optimized for both human viewing and machine-based tasks such as image classification, object detection, and semantic segmentation.

JPEG AI supports progressive decoding, which allows an image to be reconstructed incrementally from a single bitstream. This capability is relevant in contexts demanding reduced latency, timely content availability, or efficient use of computational resources.

This thesis aims to develop, implement, and assess progressive decoding methods that operate on a single, non-scalable JPEG AI bitstream. The work investigates three main strategies:
1. Progressive decoding by sequential channel truncation, where the default JPEG AI channel order is used.
2. Progressive decoding by adaptive channel selection, where channels are selected according to an energy metric (entropy or L2 norm).
3. Progressive decoding by optimal channel selection, where channels are chosen to optimize a specific quality metric, providing a theoretical upper bound for progressive decoding performance under that metric.

The proposed methods are tested on JPEG AI Models 2 and 3, using the recommended JPEG AI image test set. Rate–distortion (RD) performance is evaluated through BD-Rate analysis across multiple objective quality metrics, notably MS-SSIM, IW-SSIM, VMAF, VIF, PSNR-HVS, NLPD, and FSIM.

Experimental results show that Model 2 consistently outperforms Model 3 in RD performance under the same progressive decoding method. Furthermore, adaptive selection methods yield better RD performance than simple sequential channel truncation, and L2 norm-based selection provides higher compression gains than entropy-based selection on both models.

These findings suggest that for resource-constrained scenarios requiring a single-model solution, Model 2 combined with adaptive channel selection, particularly L2 norm-based selection, is an effective method for achieving progressive quality. The findings were presented to the JPEG committee and used to select the model for the JPEG AI single-model level (Model 2, rather than Model 3 as might have been expected).
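The difference between sequential truncation and adaptive channel selection can be illustrated with a minimal sketch. It is not taken from the thesis or from the JPEG AI reference software. It assumes a toy latent given as a list of channels, each a flat list of quantized integers. The function names (`l2_norm`, `empirical_entropy`, `channel_order`, `keep_first_channels`) are hypothetical. Two further assumptions: entropy is estimated from each channel's symbol histogram, and channels not yet decoded are filled with zeros. The actual method may rely on the codec's entropy model and a different fill value.

```python
import math
from collections import Counter


def l2_norm(channel):
    """L2 norm of one latent channel (flat list of quantized values)."""
    return math.sqrt(sum(v * v for v in channel))


def empirical_entropy(channel):
    """Histogram-based entropy in bits per symbol (an assumption, see lead-in)."""
    counts = Counter(channel)
    n = len(channel)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def channel_order(latent, metric=None):
    """Return channel indices in decoding order.

    metric=None keeps the default (sequential) order; otherwise channels
    are sorted by decreasing metric value, ties keeping their index order.
    """
    indices = list(range(len(latent)))
    if metric is None:
        return indices
    scores = [metric(ch) for ch in latent]
    return sorted(indices, key=lambda i: scores[i], reverse=True)


def keep_first_channels(latent, order, k):
    """Keep the first k channels of `order`; zero-fill the others (assumption)."""
    kept = set(order[:k])
    return [list(ch) if i in kept else [0] * len(ch)
            for i, ch in enumerate(latent)]


# Toy latent with 4 channels of 4 symbols each (made-up values).
latent = [
    [0, 0, 0, 0],
    [3, -4, 0, 0],
    [1, 1, 1, 1],
    [1, 2, 3, 4],
]

print(channel_order(latent))                     # [0, 1, 2, 3]
print(channel_order(latent, l2_norm))            # [3, 1, 2, 0]
print(channel_order(latent, empirical_entropy))  # [3, 1, 0, 2]

# Progressive step: reconstruct from the 2 highest-L2 channels only.
partial = keep_first_channels(latent, channel_order(latent, l2_norm), 2)
print(partial)
```

In this toy case the two metrics disagree on the constant channel `[1, 1, 1, 1]`: it has non-zero L2 norm but zero empirical entropy. This shows how the choice of metric can change the decoding order.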

FRASSETTO, PIERO

2024/2025

Record

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				ICT FOR INTERNET AND MULTIMEDIA - INGEGNERIA PER LE COMUNICAZIONI MULTIMEDIALI E INTERNET, Laurea Magistrale (master's degree, D.M. 270/2004)
			
	Academic year
	
				2024
			
	Keywords
	
				Progressive Decoding
				JPEG AI
				Bitstreams
				Learned Compression
			
	Supervisor
	
				MILANI, SIMONE
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
Frassetto_Piero.pdf (restricted access)	8.81 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/95826