Evaluating the Use of ChatGPT for Clinical Assessment in Pulmonology: A Pilot Study on Patient Acceptance and Perceived Usefulness
CAVASIN, GIULIA
2024/2025
Abstract
Background: Large language models such as ChatGPT are emerging as innovative tools for supporting patient communication and clinical practice. While previous research has demonstrated their potential, little is known about how patients perceive their use through real-time interactions during hospitalization.

Aim: To evaluate the feasibility and patient acceptability of ChatGPT-4o, configured through iterative prompt engineering, as a virtual nurse conducting a basic clinical assessment in an inpatient pulmonology setting.

Methods: A cross-sectional observational pilot study was conducted in a pulmonology ward in Northern Italy. Twenty-three inpatients each engaged in one to three voice-based structured interviews with ChatGPT-4o. Patient acceptance was assessed using the validated Italian AIDUA questionnaire and two additional Likert items on perceived empathy and usefulness. A panel of two physicians and two nurses independently rated each AI-patient conversation for empathy, usefulness, correctness, and potential harm. Non-parametric analyses (Spearman correlations, Mann–Whitney U, and Kruskal–Wallis tests) were performed to explore associations between acceptance outcomes and participant characteristics.

Results: Patients completed 50 structured clinical interviews with the AI-based virtual nurse during hospitalization. They reported high perceived empathy and usefulness of the interactions (median = 5). Emotions and Hedonic Motivation were the highest-rated AIDUA dimensions, whereas Performance Expectancy and Anthropomorphism were rated lower. Objection exceeded Intention to Use, indicating a cautious stance toward adoption. Clinical experts rated most AI-generated responses as empathetic, useful, and clinically correct; however, 16% (8/50) were judged potentially unsafe, primarily because of reduced clinical adequacy and incomplete assessment.

Conclusion: ChatGPT-4o, used as a supervised virtual nurse, was technically feasible and generally perceived as empathetic and useful, yet patients expressed more objection than intention to use, and clinicians flagged a subset of responses as potentially unsafe. LLM-based virtual nurses may assist symptom elicitation and patient engagement, but given these safety limitations, their use must remain under continuous clinical oversight.
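For readers interested in reproducing the analytic approach named in the Methods, the following is a minimal Python sketch of the three non-parametric tests using SciPy. The file name, column names, and grouping variables (`age`, `sex`, `education`, `intention_to_use`) are hypothetical illustrations, not the thesis's actual variable names, since the underlying dataset is under restricted access.

```python
# Minimal sketch of the non-parametric analyses described in the Methods.
# All data file and column names below are assumed for illustration only.
import pandas as pd
from scipy.stats import spearmanr, mannwhitneyu, kruskal

# Hypothetical data frame: one row per participant.
df = pd.read_csv("acceptance_data.csv")  # assumed file name

# Spearman correlation: e.g., age vs. AIDUA Intention-to-Use score.
rho, p_rho = spearmanr(df["age"], df["intention_to_use"])

# Mann-Whitney U: compare acceptance between two groups (e.g., by sex).
male = df.loc[df["sex"] == "M", "intention_to_use"]
female = df.loc[df["sex"] == "F", "intention_to_use"]
u_stat, p_u = mannwhitneyu(male, female, alternative="two-sided")

# Kruskal-Wallis: compare acceptance across 3+ groups (e.g., education level).
groups = [g["intention_to_use"].values for _, g in df.groupby("education")]
h_stat, p_h = kruskal(*groups)

print(f"Spearman rho={rho:.2f} (p={p_rho:.3f})")
print(f"Mann-Whitney U={u_stat:.1f} (p={p_u:.3f})")
print(f"Kruskal-Wallis H={h_stat:.2f} (p={p_h:.3f})")
```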
| File | Access | Size | Format |
|---|---|---|---|
| Cavasin_Giulia.pdf | Restricted access | 930.49 kB | Adobe PDF |
https://hdl.handle.net/20.500.12608/99151