Description
This dataset contains 171,640 high-quality instruction–response pairs derived from Indian legal texts, focusing on statutory interpretation and structured legal explanations. Created by kaushik-harsh-99, it is designed for instruction tuning of language models and is a substantial scale-up from version 1, which contained 33,077 samples. The dataset page was last updated on 2026-05-01.
Use Cases
Instruction tuning of language models on legal reasoning patterns
Benchmarking model performance on structured legal explanation tasks in an instruction–response format
Training models for statutory interpretation of Indian legal texts
Improving the clarity and structure of model-generated text
Strengths
Contains 171,640 samples, roughly 5.2 times the 33,077 samples in version 1
Designed with an emphasis on clarity, structure, and legal reasoning patterns for instruction tuning
Derived from Indian legal texts, providing domain-specific content
Limitations
Column-level documentation is absent; field semantics must be inferred after download
The row count of 171,640 is taken from the description and has not been independently verified; confirm it after download
Description metadata is limited; actual data quality requires manual inspection after download
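Because the column names are undocumented, a first inspection pass after download could look like the sketch below. It assumes the data has been exported as JSON Lines (one JSON object per line), which the dataset page does not confirm. The function name summarize_fields and the sample field names ("instruction", "response", "source") are hypothetical and used only for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_fields(lines: Iterable[str]) -> tuple[int, Counter]:
    """Count records and how often each top-level field appears."""
    row_count = 0
    field_counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # assumes each line is a JSON object
        row_count += 1
        field_counts.update(record.keys())
    return row_count, field_counts


# Hypothetical two-record sample; the real field names are not documented.
sample = [
    '{"instruction": "Explain the provision.", "response": "..."}',
    '{"instruction": "Summarise the section.", "response": "...", "source": "..."}',
]
rows, fields = summarize_fields(sample)
print(rows, dict(fields))
```

On a real export, the same function can be passed an open file handle instead of the sample list. The resulting row count can then be compared against the 171,640 figure in the description, and any field that appears in fewer records than the total points to optional or missing values.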
Provenance
Source
Derived from Indian legal texts, as stated in the description.
Collection Method
Likely involves processing and structuring legal texts into instruction–response pairs, though the exact method is not detailed.
Freshness
Last updated 2026-05-01 17:52:40 (time zone not stated); check the source page for later revisions.
Geography
India, based on the description's focus on Indian legal texts.
License is unknown; users should verify terms of use before downloading.