A total of 154 anonymized emergency internal medicine patient cases from a single hospital in early 2025 were used to evaluate the diagnostic performance of 5 large language models against 15 junior emergency department physicians. The study, authored by Jintao Wei and shared under a CC-BY-4.0 license, found that models such as DeepSeek-V3 achieved 90.0% main diagnostic accuracy, outperforming the physicians. The results were published on figshare in May 2026.
The primary data file is a DOCX document, which may contain formatted text and tables rather than a machine-readable data table.
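Because the data ships as DOCX rather than CSV, any tables in it would need to be extracted before analysis. The following is a minimal sketch of one way to do that with only the Python standard library, relying on the fact that a DOCX file is a ZIP archive whose main body is stored in `word/document.xml`. The function names `parse_tables` and `extract_docx_tables` and the file name `cases.docx` are hypothetical, and the actual layout of this dataset's file has not been checked, so treat the result as a starting point rather than a verified loader.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def parse_tables(document_xml):
    """Return every table in the XML as a list of rows of cell strings.

    Limitation of this sketch: a nested table is returned as its own
    entry, and its text also appears inside the enclosing cell.
    """
    root = ET.fromstring(document_xml)
    tables = []
    for tbl in root.iter(W_NS + "tbl"):
        rows = []
        for tr in tbl.findall(W_NS + "tr"):
            row = []
            for tc in tr.findall(W_NS + "tc"):
                # A cell holds one or more paragraphs; each paragraph's
                # text is split across <w:t> runs.
                paragraphs = [
                    "".join(t.text or "" for t in p.iter(W_NS + "t"))
                    for p in tc.iter(W_NS + "p")
                ]
                row.append("\n".join(paragraphs).strip())
            rows.append(row)
        tables.append(rows)
    return tables


def extract_docx_tables(path):
    """Open a DOCX file (a ZIP archive) and parse its main document part."""
    with zipfile.ZipFile(path) as archive:
        return parse_tables(archive.read("word/document.xml"))


if __name__ == "__main__":
    # "cases.docx" is a placeholder name, not the dataset's real file name.
    for index, table in enumerate(extract_docx_tables("cases.docx")):
        print(f"Table {index}: {len(table)} rows")
```

If the document turns out to hold free text instead of tables, the returned list will simply be empty, and the case descriptions would have to be read from the paragraphs instead.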