Evaluating the Use of ChatGPT for Clinical Assessment in Pulmonology: A Pilot Study on Patient Acceptance and Perceived Usefulness

Cavasin, Giulia
Academic year 2024/2025

Abstract

Background: Large language models (LLMs) such as ChatGPT are emerging as innovative tools for supporting patient communication and clinical practice. While previous research has demonstrated their potential, little is known about patients' perceptions of their use during hospitalization through real-time interactions.

Aim: To evaluate the feasibility and patient acceptability of ChatGPT-4o, configured through iterative prompt engineering to act as a virtual nurse conducting a basic clinical assessment in an inpatient pulmonology setting.

Methods: A cross-sectional observational pilot study was conducted in a pulmonology ward in Northern Italy. Twenty-three inpatients each engaged in one to three voice-based structured interview interactions with ChatGPT-4o. Patient acceptance was assessed using the validated Italian version of the AIDUA questionnaire and two additional Likert items on perceived empathy and usefulness. A panel of two physicians and two nurses independently rated each AI-patient conversation for empathy, usefulness, correctness, and potential harm. Non-parametric analyses (Spearman correlations, Mann–Whitney U, and Kruskal–Wallis tests) were performed to explore associations between acceptance outcomes and participant characteristics.

Results: Patients completed 50 structured clinical interview interactions with the AI-based virtual nurse during hospitalization. They reported high perceived empathy and usefulness of the interactions (median = 5). Emotions and Hedonic Motivation were the highest-scoring AIDUA dimensions, whereas Performance Expectancy and Anthropomorphism scored lower. Objection exceeded Intention to Use, indicating a cautious stance toward adoption. Clinical experts rated most AI-generated responses as empathetic, useful, and clinically correct; however, 16% (8/50) were judged potentially unsafe, primarily due to reduced clinical adequacy and incomplete assessment.

Conclusion: ChatGPT-4o, used as a supervised virtual nurse, was technically feasible and generally perceived as empathetic and useful, yet patients showed higher objection than intention to use, and clinicians flagged a subset of responses as potentially unsafe. LLM-based virtual nurses may assist symptom elicitation and patient engagement, but their use must remain under continuous clinical oversight due to safety limitations.
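
For illustration, the kind of prompt-engineered configuration described in the Aim, where the model is instructed to behave as a virtual nurse, might look roughly like the following minimal sketch using the OpenAI Python client. The system prompt wording, model settings, and question flow are hypothetical and are not taken from the study; the study used voice-based interactions, while a text exchange is shown here only to keep the sketch short.

# Minimal sketch of a prompt-engineered "virtual nurse" session.
# The system prompt and interview flow are illustrative assumptions;
# the study's actual configuration is not reproduced here.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a virtual nurse on a pulmonology ward. Conduct a brief, "
    "structured clinical interview: ask about dyspnea, cough, sputum, "
    "chest pain, and overall wellbeing, one question at a time. "
    "Be empathetic, use plain language, never diagnose or change therapy, "
    "and remind the patient that a human clinician reviews every answer."
)

def nurse_reply(history: list[dict]) -> str:
    """Send the conversation so far and return the model's next turn."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": SYSTEM_PROMPT}] + history,
        temperature=0.3,  # keep the structured interview focused
    )
    return response.choices[0].message.content

# Example turn: the patient reports a symptom, the model responds.
history = [{"role": "user", "content": "I have had a cough for three days."}]
print(nurse_reply(history))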
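
Similarly, the non-parametric analyses named in the Methods can be outlined with scipy.stats. The variable names, groupings, and toy data below are invented for illustration and do not reproduce the study's dataset.

# Sketch of the reported non-parametric tests on toy data (n = 23).
# All values and groupings are invented; they are not study results.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
intention = rng.integers(1, 6, size=23)   # e.g., AIDUA Intention to Use (1-5)
age = rng.integers(40, 90, size=23)       # participant age
sex = rng.integers(0, 2, size=23)         # 0 = female, 1 = male
group = rng.integers(0, 3, size=23)       # three hypothetical subgroups

# Spearman correlation: acceptance score vs. a continuous characteristic.
rho, p_rho = stats.spearmanr(intention, age)

# Mann-Whitney U: acceptance score compared across two groups.
u, p_u = stats.mannwhitneyu(intention[sex == 0], intention[sex == 1])

# Kruskal-Wallis: acceptance score compared across three or more groups.
h, p_h = stats.kruskal(*(intention[group == g] for g in range(3)))

print(f"Spearman rho={rho:.2f} (p={p_rho:.3f}), "
      f"Mann-Whitney U={u:.1f} (p={p_u:.3f}), "
      f"Kruskal-Wallis H={h:.2f} (p={p_h:.3f})")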
Keywords: ChatGPT; patient acceptance; Large Language Model; digital health; clinical assessment

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/99151