Clinical Decision-Making of Artificial Intelligence vs Medical Professionals in Patients With Syncope
(2025) In JACC: Advances 5(1). p.102426-102426
Abstract
BACKGROUND: Artificial intelligence may improve diagnostic yield and accuracy in syncope.
OBJECTIVES: The purpose of this study was to compare Generative Pretrained Transformer 4-Omni (GPT-4o) with medical professionals (MPs) in establishing syncope diagnoses and recommending interventions based on general practitioners' referral letters to a syncope unit.
METHODS: This three-phase study evaluated 55 anonymized referral letters. In Phase 1, GPT-4o and MPs (12 physicians, 6 allied professionals) provided differential diagnoses. In Phase 2, all patients underwent 1.5 years of follow-up for recurrences and additional investigations. In Phase 3, a multidisciplinary committee established final diagnoses by adjudication. Diagnostic performance was assessed using a custom Diagnostic Precision Score (DPS), which penalizes incorrect differential diagnoses from Phase 1. GPT-4o was tested in a privacy-safe environment and instructed with European Society of Cardiology guidelines.
RESULTS: Fifty-five letters were independently analyzed once by each of the eighteen MPs and by GPT-4o, yielding 1,045 assessments. Diagnostic yield, defined as any suggestion of a diagnosis, was 81.9% for physicians, 84.5% for allied professionals, and 100% for GPT-4o. Diagnostic performance, defined as the presence of the final diagnosis in the initial differential diagnosis, was 75.9% for GPT-4o, 48.6% for physicians, and 36.7% for allied professionals. DPS was 22.9% for physicians (148.75/648), 12.6% for allied professionals (40.75/324), and -6.9% for GPT-4o (-4.00/54). GPT-4o incorrectly labeled 3 of 4 cardiac diagnoses as reflex syncope. GPT-4o, but not MPs, suggested additional lifestyle measures such as counterpressure maneuvers (29/55; 52.7%) and increased fluid intake (28/55; 50.9%).
CONCLUSIONS: GPT-4o proposed a diagnosis in all cases, but its DPS was low, and it is not yet suitable for unsupervised clinical use in interpreting referral letters.
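The counts printed in the RESULTS paragraph can be rechecked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the study: all variable and function names are illustrative, and only the numbers are taken from the abstract. The abstract does not say how the DPS percentages were rounded or aggregated, so the sketch reports the gap between each printed quotient and the reported percentage rather than interpreting it.

```python
# Sketch: recompute figures quoted in the abstract from the counts it prints.
# All names here are illustrative; only the numbers come from the abstract.

LETTERS = 55
RATERS = 12 + 6 + 1  # physicians + allied professionals + GPT-4o


def pct(numerator: float, denominator: float) -> float:
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


total_assessments = LETTERS * RATERS

# (numerator, denominator, percentage reported in the abstract)
dps_reported = {
    "physicians": (148.75, 648, 22.9),
    "allied professionals": (40.75, 324, 12.6),
    "GPT-4o": (-4.00, 54, -6.9),
}

# Gap in percentage points between the printed quotient and the reported value.
dps_gap = {
    group: pct(num, den) - reported
    for group, (num, den, reported) in dps_reported.items()
}

# Share of the 55 letters for which GPT-4o suggested each lifestyle measure.
lifestyle = {
    "counterpressure maneuvers": round(pct(29, LETTERS), 1),
    "increased fluid intake": round(pct(28, LETTERS), 1),
}

if __name__ == "__main__":
    print("assessments:", total_assessments)
    for group, gap in dps_gap.items():
        print(f"DPS gap for {group}: {gap:+.2f} percentage points")
    print(lifestyle)
```

Run as a script, this prints the assessment count, the per-group DPS gaps, and the two lifestyle percentages, which makes it easy to see where the printed ratios and the reported percentages agree to one decimal place and where they do not.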
- author: van Zanten, Steven; Boel, Thomas T; de Jong, Jelle Sy; Bais, Babette; Fedorowski, Artur; Sutton, Richard; Selder, Jasper L; Giele, Freek; Geertsma, Christiaan; Scheffer, Mike G; de Groot, Joris R; de Lange, Frederik J
- publishing date: 2025-12-19
- type: Contribution to journal
- publication status: epub
- in: JACC: Advances
- volume: 5
- issue: 1
- pages: 102426 - 102426
- publisher: American College of Cardiology
- external identifiers: pmid:41421015
- ISSN: 2772-963X
- DOI: 10.1016/j.jacadv.2025.102426
- language: English
- LU publication?: yes
- additional info: Copyright © 2026 The Authors. Published by Elsevier Inc. All rights reserved.
- id: 4f911d5e-5aea-411c-a91f-ed9776596b27
- date added to LUP: 2025-12-22 15:57:55
- date last changed: 2025-12-23 09:07:26
@article{4f911d5e-5aea-411c-a91f-ed9776596b27,
abstract = {{BACKGROUND: Artificial intelligence may improve diagnostic yield and accuracy in syncope. OBJECTIVES: The purpose of this study was to compare Generative Pretrained Transformer 4-Omni (GPT-4o) with medical professionals (MPs) in establishing syncope diagnoses and recommending interventions based on general practitioners' referral letters to a syncope unit. METHODS: This three-phase study evaluated 55 anonymized referral letters. In Phase 1, GPT-4o and MPs (12 physicians, 6 allied professionals) provided differential diagnoses. In Phase 2, all patients underwent 1.5 years of follow-up for recurrences and additional investigations. In Phase 3, a multidisciplinary committee established final diagnoses by adjudication. Diagnostic performance was assessed using a custom Diagnostic Precision Score (DPS), which penalizes incorrect differential diagnoses from Phase 1. GPT-4o was tested in a privacy-safe environment and instructed with European Society of Cardiology guidelines. RESULTS: Fifty-five letters were independently analyzed once by each of the eighteen MPs and by GPT-4o, yielding 1,045 assessments. Diagnostic yield, defined as any suggestion of a diagnosis, was 81.9% for physicians, 84.5% for allied professionals, and 100% for GPT-4o. Diagnostic performance, defined as the presence of the final diagnosis in the initial differential diagnosis, was 75.9% for GPT-4o, 48.6% for physicians, and 36.7% for allied professionals. DPS was 22.9% for physicians (148.75/648), 12.6% for allied professionals (40.75/324), and -6.9% for GPT-4o (-4.00/54). GPT-4o incorrectly labeled 3 of 4 cardiac diagnoses as reflex syncope. GPT-4o, but not MPs, suggested additional lifestyle measures such as counterpressure maneuvers (29/55; 52.7%) and increased fluid intake (28/55; 50.9%). CONCLUSIONS: GPT-4o proposed a diagnosis in all cases, but its DPS was low, and it is not yet suitable for unsupervised clinical use in interpreting referral letters.}},
author = {{van Zanten, Steven and Boel, Thomas T and de Jong, Jelle Sy and Bais, Babette and Fedorowski, Artur and Sutton, Richard and Selder, Jasper L and Giele, Freek and Geertsma, Christiaan and Scheffer, Mike G and de Groot, Joris R and de Lange, Frederik J}},
issn = {{2772-963X}},
language = {{eng}},
month = {{12}},
number = {{1}},
pages = {{102426--102426}},
publisher = {{American College of Cardiology}},
journal = {{JACC: Advances}},
title = {{Clinical Decision-Making of Artificial Intelligence vs Medical Professionals in Patients With Syncope}},
url = {{http://dx.doi.org/10.1016/j.jacadv.2025.102426}},
doi = {{10.1016/j.jacadv.2025.102426}},
volume = {{5}},
year = {{2025}},
}
