A multi-task instruction-tuning dataset of Pakistani legal documents, designed for OCR correction, legal translation, and summarization. The corpus is divided into six configurations, each targeting a specific task and language or language pair, such as repairing broken English or translating between English, Urdu, and Sindhi. It was created by amjadali070 and last updated on Hugging Face in January 2026.
Each of the six configurations (subsets) must be loaded individually; the full structure and join keys are not specified.
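Since each configuration has to be loaded on its own, a small helper can loop over them. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed. The helper name `load_all_configs` and its two injectable parameters are illustrative, not part of the dataset, and the repository id in the usage comment is a placeholder because the dataset's exact Hub path is not given here.

```python
from typing import Callable, Dict, Iterable, Optional


def load_all_configs(
    repo_id: str,
    list_configs: Optional[Callable[[str], Iterable[str]]] = None,
    load_one: Optional[Callable[[str, str], object]] = None,
) -> Dict[str, object]:
    """Load every configuration of a Hub dataset, keyed by configuration name.

    `list_configs` and `load_one` are hypothetical hooks added for this sketch;
    when omitted they fall back to the `datasets` library functions.
    """
    if list_configs is None or load_one is None:
        # Imported lazily so the helper can be exercised without the library.
        from datasets import get_dataset_config_names, load_dataset

        list_configs = list_configs or get_dataset_config_names
        load_one = load_one or load_dataset
    # One load call per configuration; nothing is merged or joined here.
    return {name: load_one(repo_id, name) for name in list_configs(repo_id)}


# Usage (placeholder repository id, replace with the dataset's actual Hub path):
# subsets = load_all_configs("amjadali070/<dataset-name>")
# for name, subset in subsets.items():
#     print(name, subset)
```

Because the join keys are not specified, the sketch keeps the configurations separate in a dictionary rather than attempting to combine them.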