Design, Implementation, and Comparative Assessment of Progressive Quality Decoding Methods for JPEG AI Bitstreams

FRASSETTO, PIERO
2024/2025

Abstract

The number of photos captured worldwide is increasing rapidly, driven by widespread smartphone adoption and the growing role of images in digital communication. To manage this explosion of visual data, next-generation AI-based image coding solutions are emerging. JPEG AI, developed by the Joint Photographic Experts Group (JPEG), aims to deliver a compact, single-bitstream compressed-domain representation optimized for both human viewing and machine-based tasks such as image classification, object detection, and semantic segmentation. JPEG AI supports progressive decoding, which allows an image to be reconstructed incrementally from a single bitstream. This capability is relevant in contexts that demand reduced latency, timely content availability, or efficient use of computational resources. In this context, this thesis aims to develop, implement, and assess progressive decoding methods that operate on a single, non-scalable JPEG AI bitstream. The work investigates three main strategies:
1. Progressive decoding by sequential channel truncation, where the default JPEG AI channel order is used.
2. Progressive decoding by adaptive channel selection, where channels are selected according to an energy metric (entropy or L2 norm).
3. Progressive decoding by optimal channel selection, where channels are selected to optimize a specific quality metric, providing a theoretical upper bound on progressive decoding performance under that metric.
The proposed methods are tested on JPEG AI Models 2 and 3, using the recommended JPEG AI image test set. Rate–distortion (RD) performance is evaluated through BD-Rate analysis across multiple objective quality metrics: MS-SSIM, IW-SSIM, VMAF, VIF, PSNR-HVS, NLPD, and FSIM. Experimental results show that Model 2 consistently outperforms Model 3 in RD performance under the same progressive decoding method. Furthermore, adaptive selection methods yield better RD performance than simple sequential channel truncation, and L2 norm-based selection provides higher compression gains than entropy-based selection for both models. These findings suggest that, for resource-constrained scenarios requiring a single-model solution, Model 2 combined with adaptive channel selection, particularly L2 norm-based selection, is a very effective way to achieve progressive quality. The findings were presented to the JPEG committee and used to select the model for the JPEG AI single-model level: Model 2, rather than Model 3 as might have been expected.
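The first two strategies above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the latent is available as a NumPy array of shape (channels, height, width), that channels left undecoded are filled with zeros, and that entropy is estimated empirically from the quantized values of each channel. The function names `empirical_entropy`, `channel_order`, and `keep_first_channels` are hypothetical.

```python
import numpy as np


def empirical_entropy(values):
    """Empirical entropy (bits per symbol) of the values in one channel."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def channel_order(latent, metric="sequential"):
    """Return the channel indices in the order they would be decoded.

    latent: array of shape (C, H, W) holding quantized latent channels.
    metric: "sequential" keeps the default order; "l2" and "entropy"
            rank channels from highest to lowest score (an assumption
            about how the energy metric is used).
    """
    num_channels = latent.shape[0]
    flat = latent.reshape(num_channels, -1)
    if metric == "sequential":
        return np.arange(num_channels)
    if metric == "l2":
        scores = np.linalg.norm(flat, axis=1)
    elif metric == "entropy":
        scores = np.array([empirical_entropy(ch) for ch in flat])
    else:
        raise ValueError(f"unknown metric: {metric}")
    # Stable sort so that ties keep the default channel order.
    return np.argsort(-scores, kind="stable")


def keep_first_channels(latent, order, k):
    """Keep the first k channels of `order`; zero-fill all the others."""
    truncated = np.zeros_like(latent)
    kept = order[:k]
    truncated[kept] = latent[kept]
    return truncated
```

Calling `keep_first_channels` with increasing `k` yields a sequence of partial latents, each of which would be passed to the synthesis network to obtain one progressive quality step. The third strategy would replace the score-based ranking with a search that evaluates the chosen quality metric on the reconstructed image, which this sketch does not attempt.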
2024
Progressive Decoding
JPEG AI
Bitstreams
Learned Compression
Files in this item:
File: Frassetto_Piero.pdf (restricted access)
Size: 8.81 MB
Format: Adobe PDF

The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/95826