AI as a Second Opinion: Evaluating Workflow Timing in Clinical Psychology
(2026) PSYK12 20252, Department of Psychology
- Abstract (translated from Swedish)
- This study evaluated how different ways of integrating AI support into the clinical
workflow affect diagnostic accuracy and time efficiency. In total, 51 mental health
professionals carried out diagnostic assessments of simulated patient cases on the
TalkToAlba platform. Participants were randomized into one of three groups:
proactive support (AI before and after the patient interview), retroactive support
(AI after the patient interview), or a control group in which AI support was
introduced only after an initial diagnosis had been made. The results showed that
final diagnostic accuracy was higher in the retroactive than in the proactive
condition (72.2% vs. 38.5%), although the difference did not reach statistical
significance (p = .079). Clinicians' accuracy was strongly correlated with whether
the AI system's suggestion was correct (p = .003), indicating a vulnerability to
incorrect AI output. Total time spent did not differ between the groups, but
AI-generated journal notes reduced documentation time (p = .048). Agreement between
clinicians and AI was highest when clinicians worked without AI support (κ = .62).
Clinicians expressed a strong willingness to use AI in their work, as well as a
significant preference for receiving support after the interview rather than before
(p = .029). Overall, the study points to the importance of a human-first workflow in
which AI is used as decision support after the patient interview rather than before.
By using AI to provide a second opinion after the clinician's own assessment,
efficiency gains can be realized without compromising clinical autonomy or risking
cognitive biases such as anchoring effects and automation bias.
- Abstract
- This study investigated how the timing of AI support affects clinicians' diagnostic accuracy and efficiency during simulated psychiatric assessment using TalkToAlba, a clinical conversational AI platform. Professionals with experience in mental health (N = 51) were randomized into three conditions: control/delayed (no initial AI), retroactive (AI post-interview), or proactive (AI pre- and post-interview). Participants interviewed interactive AI-simulated patients in a chat interface, made a diagnostic assessment, and completed documentation. Final diagnostic accuracy was higher in the retroactive than in the proactive condition (72.2% vs. 38.5%); however, this difference did not reach statistical significance (Fisher's exact test, p = .079) and should be interpreted cautiously given the uneven distribution of patients across conditions. Clinician accuracy was strongly associated with whether the AI system's suggested diagnosis was correct (p = .003), indicating vulnerability when AI output was inaccurate. Total task duration did not differ across conditions, but AI-assisted journal note generation reduced documentation time (p = .048). Contrary to expectations, clinician-AI agreement was highest when clinicians worked independently (κ = .62). Clinicians reported a strong willingness to adopt AI in clinical practice, as well as a significant preference for receiving support after the interview rather than before (p = .029). The findings suggest that a human-first approach, in which the AI acts as a second opinion rather than as an investigative lead, has the potential to mitigate cognitive risks such as anchoring and automation bias.
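The headline accuracy comparison can be reproduced with a small sketch. The cell counts below are assumptions inferred from the reported percentages (72.2% ≈ 13/18 correct in the retroactive condition, 38.5% ≈ 5/13 in the proactive condition) and are not stated in the record; a pure-Python two-sided Fisher's exact test on those assumed counts recovers a p-value close to the reported .079:

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test on a 2x2 table.

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are as probable as or less probable than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1, n = a + c, a + b + c + d
    total = comb(n, col1)

    def p_of(x):
        # Probability of x successes in row 1 under fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(col1, row1)
    p_obs = p_of(a)
    tol = 1e-12  # tolerance for floating-point ties
    return sum(p_of(x) for x in range(lo, hi + 1) if p_of(x) <= p_obs + tol)

# Assumed counts (correct, incorrect) final diagnoses per condition,
# inferred from 72.2% and 38.5% -- hypothetical reconstruction:
retroactive = (13, 5)
proactive = (5, 8)
p = fisher_exact_two_sided((retroactive, proactive))
print(f"p = {p:.3f}")  # → p = 0.079, matching the abstract
```

That the reconstructed counts reproduce the reported p = .079 supports the inference, but the actual group sizes remain as stated only in the full thesis.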
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9219857
- author
- Roxhage, André LU and Henriksson, Erik LU
- supervisor
- organization
- course
- PSYK12 20252
- year
- 2026
- type
- M2 - Bachelor Degree
- subject
- keywords
- Generative AI, clinical decision support, diagnostic accuracy, workflow efficiency, generativ AI, kliniskt beslutsstöd, diagnostisk träffsäkerhet, arbetseffektivitet
- language
- English
- id
- 9219857
- date added to LUP
- 2026-01-21 08:30:28
- date last changed
- 2026-01-21 08:30:28
@misc{9219857,
abstract = {{This study investigated how the timing of AI support affects clinicians’ diagnostic accuracy and efficiency during simulated psychiatric assessment using TalkToAlba, a clinical conversational AI platform. Professionals with experience in mental health (N = 51) were randomized into three conditions: control/delayed (no initial AI), retroactive (AI post-interview), or proactive (AI pre- and post-interview). Participants interviewed interactive AI-simulated patients in a chat interface, made a diagnostic assessment, and completed documentation. Final diagnostic accuracy was higher in the retroactive than in the proactive condition (72.2% vs. 38.5%); however, this difference did not reach statistical significance (Fisher’s exact test, p = .079) and should be interpreted cautiously given the uneven distribution of patients across conditions. Clinician accuracy was strongly associated with whether the AI system’s suggested diagnosis was correct (p = .003), indicating vulnerability when AI output was inaccurate. Total task duration did not differ across conditions, but AI-assisted journal note generation reduced documentation time (p = .048). Contrary to expectations, clinician-AI agreement was highest when clinicians worked independently (κ = .62). Clinicians reported a strong willingness to adopt AI in clinical practice, as well as a significant preference for receiving support after the interview rather than before (p = .029). The findings suggest that a human-first approach, in which the AI acts as a second opinion rather than as an investigative lead, has the potential to mitigate cognitive risks such as anchoring and automation bias.}},
author = {{Roxhage, André and Henriksson, Erik}},
language = {{eng}},
note = {{Student Paper}},
title = {{AI as a Second Opinion: Evaluating Workflow Timing in Clinical Psychology}},
year = {{2026}},
}