This dataset contains 2,676,075 Chinese criminal law documents labeled with law articles, charges, and sentencing terms. It pairs textual case descriptions with structured metadata for legal judgment prediction tasks.
Use Cases
- Train a multi-label classification model to predict criminal charges using the 'fact' text and 'accusation' labels
- Develop a regression model to estimate the length of a prison term using the 'term_of_imprisonment' field
- Build an information retrieval system to identify 'relevant_articles' based on natural language case descriptions
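As a rough illustration of how the first two use cases might be set up, the sketch below turns 'accusation' labels into multi-hot target vectors and pulls 'term_of_imprisonment' out as a regression target. The field names come from the description above; the toy records, the flat record shape (a list of charge strings and a single numeric term), the unit of the term, and the helper names `build_label_index` and `multi_hot` are all assumptions made for illustration and may not match the actual schema.

```python
from typing import Dict, Iterable, List, Sequence

# Toy records. The values and the flat record shape are invented for
# illustration; the real dataset may nest or encode these fields differently.
records = [
    {
        "fact": "Defendant A took a phone from a shop.",
        "accusation": ["theft"],
        "relevant_articles": [264],
        "term_of_imprisonment": 6,  # assumed to be a single number
    },
    {
        "fact": "Defendant B stole a car and injured the owner.",
        "accusation": ["theft", "intentional injury"],
        "relevant_articles": [264, 234],
        "term_of_imprisonment": 36,
    },
]


def build_label_index(label_lists: Iterable[Sequence[str]]) -> Dict[str, int]:
    """Map every distinct label to a column position, in sorted order."""
    labels = sorted({label for labels in label_lists for label in labels})
    return {label: position for position, label in enumerate(labels)}


def multi_hot(labels: Sequence[str], index: Dict[str, int]) -> List[int]:
    """Encode one record's labels as a 0/1 vector over the label index."""
    vector = [0] * len(index)
    for label in labels:
        vector[index[label]] = 1
    return vector


# Inputs and targets for a multi-label charge classifier.
charge_index = build_label_index(r["accusation"] for r in records)
texts = [r["fact"] for r in records]
charge_targets = [multi_hot(r["accusation"], charge_index) for r in records]

# Targets for a sentencing regressor.
term_targets = [float(r["term_of_imprisonment"]) for r in records]
```

The same `build_label_index` and `multi_hot` helpers could be applied to 'relevant_articles' (after converting article numbers to strings) to produce targets for article prediction. A multi-hot encoding is used because a single case may carry more than one charge.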
Strengths
- 2,676,075 criminal law documents sourced from the Supreme People's Court of China
- Labels for 183 distinct law articles and 202 different criminal charges
- Includes a 'fact' column containing detailed textual descriptions of case circumstances
- Provides 'term_of_imprisonment' data for regression-based sentencing prediction