Vidhaan is a dataset of 20,690 question-answer pairs derived from 113 Central Acts of India, compiled by SharathReddy. It was built specifically to address the 'context-splitting' problem in legal retrieval-augmented generation (RAG) applications. It was last updated on March 24, 2026.
Use Cases
- Fine-tuning language models for Indian legal question answering using the QA pairs.
- Benchmarking retrieval-augmented generation systems on legal documents, particularly their handling of the context-splitting problem.
- Training models to answer precise queries about Indian Central Acts using the instruction-context structure of the records.
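For the fine-tuning use case, each record would typically be flattened into a prompt and a target answer. The following is a minimal sketch, not the dataset's own format: the field names `instruction`, `context`, and `response` are assumptions (the columns are undocumented), the prompt template is arbitrary, and the sample record is placeholder text rather than real dataset content.

```python
def build_prompt(record, instruction_key="instruction",
                 context_key="context", answer_key="response"):
    """Format one QA record as a prompt/completion pair for supervised fine-tuning.

    The three key names are hypothetical defaults; pass the real column
    names once they have been confirmed from the downloaded data.
    """
    instruction = record[instruction_key].strip()
    # The context may be missing or empty; treat both cases the same way.
    context = (record.get(context_key) or "").strip()

    parts = []
    if context:
        parts.append(f"### Context:\n{context}")
    parts.append(f"### Question:\n{instruction}")
    prompt = "\n\n".join(parts) + "\n\n### Answer:\n"

    return {"prompt": prompt, "completion": record[answer_key].strip()}


# Placeholder record, not taken from the dataset.
example = {
    "instruction": "What does the quoted provision require?",
    "context": "(placeholder text of a provision from a Central Act)",
    "response": "(placeholder answer)",
}
pair = build_prompt(example)
```

Keeping the key names as parameters means the same helper still works if the actual columns turn out to be named differently.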
Strengths
- Contains 20,690 QA pairs, providing a substantial volume of training examples.
- Derived from 113 Central Acts of India, offering a focused legal domain.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The stated figure of 20,690 QA pairs has not been checked against the actual row count; confirm it after download.
- The dataset was last updated on March 24, 2026; verify that it is still current for the Acts of interest before relying on it.
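Because the columns are undocumented, a first pass over the downloaded records can list the fields and count the rows. A minimal sketch, assuming the records have already been loaded as a list of dictionaries; the loading step is left out, and the field names in the sample are hypothetical.

```python
from collections import Counter


def summarize_records(records):
    """Return the row count, how often each field appears,
    and how often each field holds a non-empty value."""
    field_counts = Counter()
    non_empty_counts = Counter()
    total = 0
    for record in records:
        total += 1
        for key, value in record.items():
            field_counts[key] += 1
            # Count a field as populated unless it is None or an empty string.
            if value is not None and value != "":
                non_empty_counts[key] += 1
    return {
        "rows": total,
        "fields": dict(field_counts),
        "non_empty": dict(non_empty_counts),
    }


# Placeholder records with hypothetical field names.
sample = [
    {"instruction": "q1", "context": "c1", "response": "a1"},
    {"instruction": "q2", "context": "", "response": "a2"},
]
summary = summarize_records(sample)
```

The `rows` value can then be compared against the stated 20,690 pairs, and fields that are often empty flag columns whose meaning needs a closer look.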
Provenance
- Source: SharathReddy
- Collection Method: Derived from 113 Central Acts of India.
- Geography: India