Name: Indian Legal Data V2: 171,640 Instruction-Response Pairs for Legal Language Models
Creator: kaushik-harsh-99
Published: 2026-05-01T15:42:33
Keywords: Legal Text, Text, Indian Law, Natural Language Processing, Statutory Interpretation

Description

171,640 instruction–response pairs derived from Indian legal texts, focusing on statutory interpretation and structured legal explanations. The dataset is designed for instruction tuning of language models and is roughly 5.2 times the size of version 1, which contained 33,077 samples. The dataset page was last updated on 2026-05-01.

Use Cases

Instruction tuning of language models on the legal reasoning patterns the dataset emphasizes
Benchmarking model performance on structured legal explanation tasks, using the instruction–response format
Training models for statutory interpretation of Indian legal texts
Improving the clarity and structure of model-generated text, in line with the dataset's design emphasis
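
For the instruction-tuning use case, each pair is typically rendered into a single training string before tokenization. The sketch below shows one common way to do that. The field names `instruction` and `response`, the prompt template, and the example record are all assumptions made for illustration; the dataset's actual column names are undocumented and the example text is a placeholder, not a row from the dataset.

```python
def format_pair(record, instruction_key="instruction", response_key="response"):
    """Render one record as a single prompt/answer training string.

    The key names are hypothetical defaults; pass the real column names
    once they have been confirmed from the downloaded files.
    """
    instruction = str(record[instruction_key]).strip()
    response = str(record[response_key]).strip()
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"


# Placeholder record written for this example, not taken from the dataset.
example = {
    "instruction": "Explain the purpose of a statutory definition clause.",
    "response": "A definition clause fixes the meaning of terms used in the statute.",
}

text = format_pair(example)
print(text)
```

Keeping the key names as parameters means the same helper still works if the real columns turn out to be named differently.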

Strengths

Contains 171,640 samples, roughly 5.2 times the 33,077 samples in version 1 (138,563 more)
Designed with an emphasis on clarity, structure, and legal reasoning patterns for instruction tuning
Derived from Indian legal texts, providing domain-specific content

Limitations

Column-level documentation is absent; field semantics must be inferred after download
The stated count of 171,640 pairs comes from the title and description and has not been independently verified; confirm it against the downloaded files
Description metadata is limited; actual data quality requires manual inspection after download
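
Because the columns are undocumented, a first pass over the downloaded data usually means listing the fields that actually appear, their value types, and the number of records. The sketch below does this for JSON Lines input. The file format, the `summarize_fields` helper, and the sample records are assumptions for illustration; the dataset may ship in a different format such as CSV or Parquet, in which case the reading step would change.

```python
import io
import json
from collections import Counter, defaultdict


def summarize_fields(lines, limit=1000):
    """Count field names and value types across up to `limit` JSON Lines records."""
    field_counts = Counter()
    type_names = defaultdict(set)
    n = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if n >= limit:
            break
        record = json.loads(line)
        n += 1
        for key, value in record.items():
            field_counts[key] += 1
            type_names[key].add(type(value).__name__)
    return n, dict(field_counts), {k: sorted(v) for k, v in type_names.items()}


# Placeholder records standing in for a downloaded file; not real dataset rows.
sample = io.StringIO(
    '{"instruction": "placeholder question", "response": "placeholder answer"}\n'
    '{"instruction": "another placeholder", "response": "another answer"}\n'
)

rows, counts, types = summarize_fields(sample)
print(rows, counts, types)
```

With a real file, pass an open file handle instead of `sample` and raise `limit` to cover every record, which also lets the stated sample count be checked.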

Provenance

Source: Derived from Indian legal texts, as stated in the description.
Collection Method: Likely involves processing and structuring legal texts into instruction–response pairs, though the exact method is not detailed.
Freshness: Last updated 2026-05-01 17:52:40; check the dataset page for later revisions.
Geography: India, based on the description's focus on Indian legal texts.

License is unknown; users should verify terms of use before downloading.
