Information Retrieval: a study on axiomatic models of preferences
MARENGON, ALESSIO
2024/2025
Abstract
The evaluation of Information Retrieval (IR) systems has traditionally relied on empirical methodologies such as the Cranfield paradigm, pooling, and large-scale evaluation campaigns. While effective in practice, these approaches lack a comprehensive theoretical foundation. This thesis addresses this gap by exploring evaluation through the lenses of measurement theory, Axiomatic Models of Preferences (AMPs), and stochastic models of relevance. First, the Representational Theory of Measurement (RTM) is applied to IR, clarifying the scale properties of evaluation measures and the admissibility of statistical operations. Building on this, AMPs are developed as order structures generated by axioms such as replacement and swap. These AMPs can be represented as distributive lattices, allowing IR measures to be interpreted as valuations and thereby clarifying their numeric, metric, and scale properties. The thesis then extends evaluation to stochastic relevance, treating relevance not as a deterministic quantity but as a random variable. This perspective leads to probabilistic evaluation measures, such as Random Average Precision (RAP) and Random Rank-Biased Precision (RRBP), which explicitly model uncertainty in user behavior. The central contribution is the introduction of stochastic dominance as a bridge between AMPs and stochastic relevance. Stochastic dominance generalizes expectation-based comparisons by requiring superiority across the entire distribution of outcomes, while AMPs emerge as a special deterministic case. This unification provides a framework that integrates axiomatic and probabilistic approaches, offering both theoretical rigor and practical insight into IR evaluation. It also opens new avenues for designing evaluation methodologies that better reflect the complexity of user behavior and judgment uncertainty.

| File | Size | Format |
|---|---|---|
| Marengon_Alessio.pdf (restricted access) | 1.96 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/95504