The dataset contains 20,000+ test cases across 20 distinct legal tasks, grouped into three cognitive levels: legal knowledge, legal reasoning, and legal application. It evaluates LLMs on Chinese legal professional exams, case analysis, and document generation.
Use Cases
- Evaluate the legal reasoning capabilities of LLMs using the 'Legal Reasoning' task category
- Train models for automated legal judgment prediction using the labels from the 'Criminal Charge Prediction' and 'Prison Term Prediction' tasks
- Benchmark model performance on professional legal standards using the 'National Unified Legal Professional Qualification Examination' subset
- Develop legal document summarization tools using the 'Court View Generation' task data
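For the evaluation and benchmarking use cases above, a per-level score is often the first thing to compute. The following is a minimal sketch, not the dataset's official scoring code: the record fields `level` and `answer`, the sample records, and exact-match scoring are all assumptions made for illustration, and generation tasks such as 'Court View Generation' would need a different metric.

```python
from collections import defaultdict


def accuracy_by_level(records, predictions):
    """Return exact-match accuracy for each cognitive level.

    `records` is a list of dicts with hypothetical keys "level" and "answer";
    `predictions` is a list of model output strings in the same order.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        level = record["level"]
        total[level] += 1
        # Ignore surrounding whitespace when comparing answers.
        if predicted.strip() == record["answer"].strip():
            correct[level] += 1
    return {level: correct[level] / total[level] for level in total}


# Made-up records for illustration only; they are not taken from the dataset.
records = [
    {"level": "knowledge", "answer": "B"},
    {"level": "reasoning", "answer": "A"},
    {"level": "reasoning", "answer": "C"},
    {"level": "application", "answer": "theft"},
]
predictions = ["B", "A", "D", "theft "]

print(accuracy_by_level(records, predictions))
# → {'knowledge': 1.0, 'reasoning': 0.5, 'application': 1.0}
```

Reporting one number per level, rather than a single overall score, keeps weak reasoning performance from being hidden by strong recall on knowledge questions.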
Strengths
- Includes 20 sub-tasks such as Legal Provision Prediction, Criminal Charge Prediction, and Court View Generation
- Organized into three cognitive levels: Knowledge (memorization), Reasoning (logic), and Application (practical use)
- Contains data derived from the National Unified Legal Professional Qualification Examination and official court documents
- Supports evaluation across zero-shot and few-shot prompting scenarios
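The difference between the zero-shot and few-shot scenarios can be sketched as a single prompt builder that optionally prepends solved examples. This is an illustrative sketch only: the function name `build_prompt`, the "Question:/Answer:" template, and the sample texts are assumptions, not the dataset's actual prompt format.

```python
def build_prompt(instruction, question, examples=()):
    """Build a prompt string for one test case.

    With no `examples` the result is a zero-shot prompt; passing
    (question, answer) pairs turns it into a few-shot prompt.
    The template here is hypothetical.
    """
    parts = [instruction]
    for example_question, example_answer in examples:
        parts.append(f"Question: {example_question}\nAnswer: {example_answer}")
    # The final question is left unanswered for the model to complete.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


instruction = "Name the criminal charge that fits the facts."
question = "The defendant secretly took a wallet from a bus passenger."

zero_shot = build_prompt(instruction, question)
one_shot = build_prompt(
    instruction,
    question,
    examples=[("The defendant set fire to a warehouse.", "arson")],
)

print(zero_shot)
print("---")
print(one_shot)
```

Keeping both modes in one function means the instruction and question formatting stay identical across scenarios, so any score difference can be attributed to the added examples.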