Available on 1 platform
A research dataset by Jia-Hui Luo, last updated in May 2026, evaluating systematic error patterns of large language models in clinical reasoning for low back pain. It contains 103 multiple-choice questions and 30 clinical scenario questions used to evaluate five LLMs across six dimensions, including safety and completeness. The study includes results from targeted prompt engineering interventions designed to remediate high-risk errors.
Licensed under CC-BY-4.0. The data is distributed as a ZIP archive and must be extracted before use.
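Because the data ships as a ZIP archive, a first step is usually to unpack it and see what it contains. The following is a minimal sketch using only Python's standard library. The function name `extract_dataset` and the paths in the usage comment are hypothetical; the listing does not state the archive's file name or internal layout.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract every member of the archive into dest_dir.

    Returns the list of member names so the caller can inspect
    what the archive contained (its layout is not documented here).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    return names


# Hypothetical usage; replace the placeholder paths with the real ones:
# members = extract_dataset("downloaded_dataset.zip", "extracted")
# for name in members:
#     print(name)
```

Returning the member names rather than assuming specific files keeps the sketch independent of how the questions and model results are actually organized inside the archive.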