A perceptual analysis of GenAI compressed images
DAL, MUSTAFA
2024/2025
Abstract
Generative Artificial Intelligence (GenAI) has grown significantly in recent years. However, AI-generated images often fail to stay consistent with their input images, which leads to quality issues in both the semantic and the visual domain. These challenges have driven the development of new solutions and new metrics for assessing AI-generated images. On the visual side, such assessments include measuring pixel-level similarity to a ground truth and comparing luminance, contrast, and structure. On the semantic side, they include techniques such as comparing segmented body parts against anatomical norms and using pose estimation models (OpenPose, MediaPipe) to flag deviations. While the approaches above require a “reference” image, others do not; this is called No-Reference Image Quality Assessment (NR-IQA), and well-known metrics include BRISQUE and NIQE. This thesis examines the feasibility of a pipeline that comments on the quality of GenAI-compressed images and compares the chosen model (ZeroFake) with several NR-IQA metrics used during the project, discussing their advantages in relation to the specific content and context of the images.

| File | Access | Size | Format |
|---|---|---|---|
| Dal_Mustafa.pdf | open access | 4.2 MB | Adobe PDF |
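The full-reference measures named in the abstract (pixel-level similarity to a ground truth, and the comparison of luminance, contrast, and structure) can be illustrated with a minimal sketch. This is not the thesis's pipeline: the function names `psnr` and `global_ssim` are hypothetical, the images are assumed to be grayscale arrays in the range 0 to 255, and `global_ssim` computes a simplified single-window variant over the whole image rather than the locally windowed SSIM that libraries normally implement.

```python
import numpy as np


def psnr(reference, test, data_range=255.0):
    """Peak signal-to-noise ratio in dB (pixel-level similarity)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


def global_ssim(reference, test, data_range=255.0):
    """Simplified SSIM using whole-image statistics (an assumption made
    for brevity; standard SSIM averages the same formula over local windows)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = ref.mean(), tst.mean()          # luminance terms
    var_x, var_y = ref.var(), tst.var()          # contrast terms
    cov = np.mean((ref - mu_x) * (tst - mu_y))   # structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic stand-ins for a reference image and a degraded output.
    rng = np.random.default_rng(0)
    reference = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    degraded = np.clip(reference + rng.normal(0, 10, reference.shape), 0, 255)
    print("PSNR (dB):", psnr(reference, degraded))
    print("Global SSIM:", global_ssim(reference, degraded))
```

Both functions need the reference image as an input, which is exactly the requirement that no-reference metrics such as BRISQUE and NIQE avoid.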
The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/92490