Available on 1 platform
This dataset contains the results of 32,950 model queries used to evaluate nine large language models on structured electronic health record tasks. Authored by Eyal Klang and last updated in May 2026, it comes from a study that sampled 50,000 emergency department visits to test prompting strategies such as direct prompting, chain-of-thought, and tool-based code generation.
The dataset is licensed under CC-BY-4.0, which requires attribution. The files are in XLS format, so opening them requires software that can read XLS spreadsheets.