Multimodal Fusion for Deception Detection on a Dataset of Videotaped Interviews in Italian
Caeiro Neves Silveira Jesus, Inês
2024/2025
Abstract
This thesis investigates the task of automated multimodal deception detection on an Italian video dataset that integrates visual, audio, and textual information. Existing deception detection research has largely concentrated on unimodal signals or English-language corpora, leaving a gap in cross-linguistic and multimodal analysis. To address this, a comprehensive framework that explores both traditional machine learning baselines and advanced deep learning architectures for individual modalities is designed and evaluated. Beyond unimodal analysis, multiple fusion strategies are applied, including feature-level integration and decision-level ensemble methods, in order to capture complementary cues across modalities. The experimental evaluation demonstrates that multimodal fusion outperforms unimodal models. In addition, the relative contribution of each modality is analyzed and the features that prove most discriminative for the detection of deceptive behavior are highlighted. Taken together, these findings underscore the importance of leveraging multimodal data for deception detection.

| File | Size | Format |
|---|---|---|
| Thesis_InesJesus.pdf (restricted access) | 1.05 MB | Adobe PDF |
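The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the feature matrices, dimensions, and the use of logistic regression here are placeholder assumptions, chosen only to contrast feature-level (early) fusion, which concatenates modality features before training a single classifier, with decision-level (late) fusion, which trains one classifier per modality and combines their predictions by majority vote.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 60
# Hypothetical per-modality feature matrices (visual, audio, textual)
X_vis = rng.normal(size=(n, 8))
X_aud = rng.normal(size=(n, 5))
X_txt = rng.normal(size=(n, 10))
y = rng.integers(0, 2, size=n)  # 1 = deceptive, 0 = truthful (synthetic labels)

# Feature-level (early) fusion: concatenate all modality features,
# then train one classifier on the joint representation.
X_early = np.hstack([X_vis, X_aud, X_txt])
early_clf = LogisticRegression(max_iter=1000).fit(X_early, y)

# Decision-level (late) fusion: train one classifier per modality,
# then combine their binary predictions by majority vote.
clfs = [LogisticRegression(max_iter=1000).fit(X, y)
        for X in (X_vis, X_aud, X_txt)]

def late_predict(mats):
    """Majority vote over the per-modality classifiers."""
    votes = np.stack([c.predict(X) for c, X in zip(clfs, mats)])
    return (votes.mean(axis=0) >= 0.5).astype(int)

early_pred = early_clf.predict(X_early)
late_pred = late_predict((X_vis, X_aud, X_txt))
```

Early fusion lets the model learn cross-modal feature interactions directly, while late fusion keeps modalities independent until the final decision, which is more robust when one modality is noisy or missing.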
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/102100