Available on 1 platform
JudgmentBench contains 30 real-world legal tasks with model-generated outputs at three quality levels. The dataset includes rubric scores and pairwise judgments from practicing lawyers, as well as annotations from GPT-5.4 and GPT-5.4-mini autograders. It was created by judgmentbench and last updated on May 7, 2026.
The license is unknown; verify any usage restrictions before using the dataset.