Evaluation of Markovian Models for Conversational Search

DAL CHECCO, LAURA
2022/2023

Abstract

Conversational Information Retrieval (IR) has gained significant attention in recent years, driven by the increasing prevalence of voice assistants and chatbots. In addition, the remarkable progress of Natural Language Processing (NLP) and Artificial Intelligence (AI) technologies has made it possible to build more sophisticated and effective Conversational Search (CS) systems. However, while significant effort has been directed towards improving these systems, until recently relatively limited success has been obtained in evaluating their quality. Although it has been shown that applying traditional evaluation metrics to conversational systems is not optimal, most of the proposed solutions are traditional IR metrics adapted to search sessions, and their need for access to actual user sessions is quite restrictive. Given this context, we start our discussion from the offline evaluation framework introduced by Lipani et al. in their paper "How Am I Doing?: Evaluating Conversational Search Systems Offline", which overcomes the above constraint, and we offer the following contributions:
• A browsing model that includes the one outlined by Lipani et al. and extends its applicability to situations involving stochastic relevance.
• Two new evaluation metrics.
• A comparison of how well some of these measures represent user satisfaction after a search.
• A further analysis of the dataset and the hyperparameters considered in the paper.
In this project randomness plays a crucial role: it is involved in the development of the browsing model and in the definition of the relevance retrieved by the user. Indeed, our intention is to show that treating relevance judgments as deterministic does not entirely capture their inherent nature.
2022
Keywords: Markovian Models, IR, Conversational Search
Files in this item:
File: Dal Checco_Laura.pdf (open access)
Size: 573.89 kB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/60683