MedHall-Bench is a field-grounded hallucination detection benchmark for medical AI assistants. It decomposes clinical responses into verifiable structured fields and evaluates AI outputs in two ways: per-field programmatic matching and a sentence-level LLM-as-Judge. The dataset is designed for use with the HolyEval framework and was created by healthmemoryarena; its last recorded update was in April 2026.
The license is unknown. The dataset is intended for research use only, not for clinical application.
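To make the per-field programmatic matching idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the benchmark's actual schema or the HolyEval API: the field names (`medication`, `dose`, `frequency`), the `normalize` rule, and the helpers `match_fields` and `field_accuracy` are all hypothetical. The sentence-level LLM-as-Judge step is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FieldResult:
    field: str
    expected: str
    predicted: Optional[str]
    matched: bool


def normalize(value: str) -> str:
    # Hypothetical normalization: lowercase and collapse whitespace so that
    # trivial formatting differences are not counted as mismatches.
    return " ".join(value.lower().split())


def match_fields(reference: Dict[str, str],
                 predicted: Dict[str, str]) -> List[FieldResult]:
    # Compare each reference field against the model's value for that field.
    # A missing field counts as a mismatch.
    results = []
    for name, expected in reference.items():
        got = predicted.get(name)
        ok = got is not None and normalize(got) == normalize(expected)
        results.append(FieldResult(name, expected, got, ok))
    return results


def field_accuracy(results: List[FieldResult]) -> float:
    # Fraction of reference fields that the prediction matched.
    if not results:
        return 0.0
    return sum(r.matched for r in results) / len(results)


# Illustrative records only; these values are made up for the example.
reference = {"medication": "Metformin", "dose": "500 mg", "frequency": "twice daily"}
predicted = {"medication": "metformin", "dose": "850 mg", "frequency": "Twice  daily"}

results = match_fields(reference, predicted)
mismatched = [r.field for r in results if not r.matched]
```

In this sketch, `mismatched` holds the fields where the assistant's output diverges from the reference, which is the kind of per-field signal a field-grounded benchmark can report alongside a sentence-level judgment. Exact matching after normalization is the simplest possible rule; a real evaluator would likely need unit-aware or synonym-aware comparison for clinical values.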