Character Identification with a Morally Controversial Protagonist: A Computational Analysis of "Lolita" Goodreads Reviews

SALVIA, GIULIA
2025/2026

Abstract

This thesis investigates character identification in a narrative told from the perspective of a morally controversial and unreliable character, adopting a corpus-based, computational approach to reader responses. Character identification is a psychological process in which the reader temporarily loses self-awareness while adopting the identity and perspective of a character (Cohen, 2001). Focusing on Vladimir Nabokov’s "Lolita" (1955), the study examines readers’ identification with the novel’s narrator, Humbert Humbert, in English-language reviews collected from the social reading platform Goodreads. The analysis draws on empirical literary and media studies and is inspired by recent work on the computational detection of reader absorption in social book reviews (Kuijpers et al., 2024). A novel corpus of user-generated reviews was manually annotated for textual evidence of the presence or rejection of character identification; the annotation was formulated as a binary classification task and validated through inter-annotator agreement. The annotation guidelines were adapted from questionnaires developed in empirical reading research on identification. Linguistic features of the annotated reviews were then extracted and analysed with the web-based tool Profiling-UD to investigate whether character identification, or its absence, was associated with specific linguistic features. Sentiment analysis with VADER provided further insight into the reviews’ lexical and evaluative features. Finally, the study assessed whether character identification correlates with the reader’s appreciation of the text, as expressed through the rating.
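As a rough illustration of two quantitative steps named in the abstract, the sketch below computes agreement between two annotators on a binary identification label and a correlation between that label and a review rating. The abstract does not say which statistics the thesis uses: Cohen's kappa and the point-biserial correlation are assumptions chosen here for illustration, the function names are hypothetical, and the toy data are invented rather than taken from the corpus.

```python
from collections import Counter
from math import sqrt
from statistics import mean, pstdev


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the thesis only says 'inter-annotator agreement';
    kappa is one common choice, not necessarily the one used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 label and a numeric score.

    Assumption: an illustrative statistic for 'identification vs. rating'.
    """
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    if not group1 or not group0:
        raise ValueError("both classes must be present")
    sd = pstdev(values)  # population standard deviation of all scores
    if sd == 0:
        raise ValueError("values have zero variance")
    p = len(group1) / len(values)
    return (mean(group1) - mean(group0)) / sd * sqrt(p * (1 - p))


# Invented toy data (not from the thesis): 1 = identification present,
# 0 = identification rejected; ratings on a 1-5 scale.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 0, 0]
ratings = [5, 4, 2, 1, 5, 3, 4, 2]

print("kappa:", cohen_kappa(annotator_1, annotator_2))
print("r_pb:", point_biserial(annotator_1, ratings))
```

Both functions rely only on the Python standard library, so the sketch runs without the tools named in the abstract; it does not reproduce the Profiling-UD or VADER steps.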
2025
Identification
Social reading
Lolita
Supervised learning
Corpus-based
Files in this item:
File: Salvia_Giulia.pdf
Access: Restricted
Size: 2.23 MB
Format: Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/108769