This dataset contains 1,000 annotated German consumer contract clauses, each labeled for legal validity under the German Civil Code (§§ 305-310 BGB). Each entry pairs a text segment from standard terms and conditions (Allgemeine Geschäftsbedingungen, AGB) with an expert legal assessment of its compliance with consumer protection standards.
Use Cases
- Train a binary classification model to detect legally invalid clauses using the clause text and validity labels
- Develop a multi-class classifier to assign contract segments to clause categories (such as liability, payment terms, and termination) using the clause_type feature
- Fine-tune German-language models for legal domain adaptation using the specialized terminology found in standard terms and conditions
- Evaluate the zero-shot reasoning capabilities of large language models on German consumer protection law
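
Whichever model is used for the validity task, it helps to fix a held-out split and an evaluation metric first. The following is a minimal sketch using only the Python standard library. It assumes each record is a dict with a `validity` field whose values are the strings `"valid"` and `"invalid"`; those names are illustrative assumptions, not the documented schema, so adjust them to the actual column names.

```python
import random
from collections import defaultdict


def stratified_split(records, label_key="validity", test_fraction=0.2, seed=0):
    """Split records into train/test lists, sampling each label group separately.

    `label_key` is an assumed field name; change it to match the real schema.
    """
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def precision_recall_f1(gold, predicted, positive="invalid"):
    """Precision, recall, and F1 for the positive class (assumed label "invalid")."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Scoring the invalid class specifically, rather than overall accuracy, keeps the evaluation focused on the clauses a detector is meant to flag. The same `precision_recall_f1` helper can score a fine-tuned classifier or the parsed answers of a zero-shot language model.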
Strengths
- Contains 1,000 individual clauses extracted from real-world German consumer contracts
- Annotated for legal validity based on sections 305 to 310 of the German Civil Code (BGB)
- Includes labels for specific clause categories such as liability, payment terms, and termination
- Provides expert-verified ground truth suitable for benchmarking German-language legal NLP systems
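
Because each clause carries both a category and a validity label, one quick sanity check before benchmarking is to look at how invalid clauses are distributed across categories. The sketch below is a hedged illustration: `clause_type` is the feature named above, while the `validity` field name and the `"invalid"` label value are assumptions to be replaced with the dataset's real names.

```python
from collections import Counter, defaultdict


def invalid_rate_by_type(records, type_key="clause_type",
                         label_key="validity", invalid_value="invalid"):
    """Return {clause category: share of clauses labeled invalid}.

    `label_key` and `invalid_value` are assumed names, not the documented schema.
    """
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[type_key]][rec[label_key]] += 1
    # Counter returns 0 for labels that never occur in a category.
    return {
        clause_type: label_counts[invalid_value] / sum(label_counts.values())
        for clause_type, label_counts in counts.items()
    }
```

A strongly skewed result would suggest reporting per-category scores alongside the aggregate, since a model could otherwise appear to perform well by learning the category rather than the legal defect.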