Semantic Aware Image Search with Scene Knowledge Graphs

LOREGGIA, GIACOMO
2021/2022

Abstract

Scene Graphs (SGs) are Knowledge Graphs that represent the contents of an image in terms of its elements (e.g., people, objects, and attributes) and the relationships among them. They therefore capture the scene’s structural and semantic organization and express it in a machine-readable model. Thanks to this expressive power, SGs have been applied to important Image Processing tasks such as Image Captioning, Visual Question Answering, and Image Search. In this work, we focus on Image Search: given a textual query, the task is to retrieve the images that best match it. Determining the best matches becomes increasingly challenging as the level of detail and the complexity of the query grow. While several approaches tackle Image Search by adopting pre-trained language models, the combined use of a Scene Graph and a language model has not yet been studied. We employ an SG representation together with a pre-trained language model to improve Image Search performance on complex textual queries.
Keywords: Scene Graphs; Image Search; Language Models
Files in this item:
File: Loreggia_Giacomo.pdf (open access)
Size: 6.19 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/36544