MedDialBench is a controlled factorial benchmark for evaluating the diagnostic robustness of large language models under parametric adversarial patient behaviors. It is an anonymous submission to the NeurIPS 2026 Datasets and Benchmarks Track, and its companion paper is under double-blind review. The dataset was last updated on May 6, 2026.
The dataset is currently hosted on Hugging Face in anonymized form; a permanent, de-anonymized repository is planned after publication.