Name: Legal Hallucinations Subset: 5,444 Rows for 6 Legal Reasoning Tasks
Creator: nguha
Published: 2025-12-24T02:30:28
Keywords: Hallucination Detection, NLP Evaluation, Text, Legal Reasoning, Legal AI

Description

A curated subset of 5,444 rows from the reglab/legal_hallucinations dataset, containing up to 1,000 randomly sampled rows for each of six legal reasoning tasks. The original dataset was created for the paper 'Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models' by Dahl et al., forthcoming in the Journal of Legal Analysis. The subset is published on Hugging Face by nguha and was last updated on January 5, 2026.

Use Cases

Benchmarking LLM performance on each of the six legal reasoning tasks.
Analyzing patterns of legal hallucinations in model outputs using the sampled examples.
Training or fine-tuning models to detect or reduce legal hallucinations.
Studying the intersection of AI and legal analysis through task-level legal reasoning data.
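As a rough illustration of the benchmarking use case, the sketch below computes accuracy per task from a list of scored rows. The field names `task` and `correct` are assumptions made for this example; the dataset's actual column names are not documented here and should be checked after download.

```python
from collections import defaultdict


def per_task_accuracy(records, task_key="task", correct_key="correct"):
    """Return {task: fraction of rows marked correct}.

    `task_key` and `correct_key` are hypothetical field names; replace
    them with the real column names once the data has been inspected.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in records:
        task = row[task_key]
        totals[task] += 1
        hits[task] += 1 if row[correct_key] else 0
    return {task: hits[task] / totals[task] for task in totals}


# Toy rows standing in for model outputs that have already been scored.
example = [
    {"task": "task_a", "correct": True},
    {"task": "task_a", "correct": False},
    {"task": "task_b", "correct": True},
]
scores = per_task_accuracy(example)
```

Reporting one score per task, rather than a single pooled number, keeps a task with fewer sampled rows from being hidden by the larger ones.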

Strengths

Contains 5,444 total rows, providing a substantial sample for analysis.
Includes up to 1,000 rows for each of six distinct legal reasoning tasks, enabling task-specific evaluation.
Derived from a dataset created for a peer-reviewed, forthcoming academic paper, suggesting a research-grade foundation.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The row count of the full original dataset is not stated, so the sampling fraction per task cannot be determined from this listing alone.
The description metadata is limited; actual data quality and task definitions require manual inspection after download.
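Because field semantics have to be inferred, a first pass over the downloaded rows can at least list which fields exist and what value types they hold. The following is a minimal sketch that assumes the rows have already been loaded as a list of dictionaries; the sample field names are invented for illustration and are not taken from the dataset.

```python
from collections import Counter, defaultdict


def summarize_fields(rows):
    """Map each field name to a count of the Python types seen in it."""
    type_counts = defaultdict(Counter)
    for row in rows:
        for field, value in row.items():
            type_counts[field][type(value).__name__] += 1
    return {field: dict(counts) for field, counts in type_counts.items()}


# Invented rows; the real columns must be read from the downloaded file.
sample_rows = [
    {"task": "task_a", "query": "example question", "score": 1},
    {"task": "task_b", "query": None, "score": 0},
]
summary = summarize_fields(sample_rows)
```

A field that shows more than one type, such as `str` mixed with `NoneType`, is a hint that it has missing values worth checking before any evaluation.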

Provenance

Source: Subset of the reglab/legal_hallucinations dataset, originally created for the paper by Dahl et al. (2024, forthcoming).
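Collection Method: Curated subset containing up to 1,000 randomly sampled rows for each of 6 legal reasoning tasks. Because the total of 5,444 is below the 6,000-row maximum, at least one task contributes fewer than 1,000 rows.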
Freshness: Last updated 2026-01-05 20:15:13; freshness should be verified.
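The collection method described above, a capped random sample per task, can be sketched as follows. This is an illustration of the general technique and not the creator's actual script; the `task` field name, the seed, and the function name are assumptions.

```python
import random
from collections import defaultdict


def sample_per_task(rows, task_key="task", cap=1000, seed=0):
    """Keep at most `cap` randomly chosen rows for each task.

    Tasks with `cap` rows or fewer are kept whole. `task_key` is a
    hypothetical column name used only for this sketch.
    """
    by_task = defaultdict(list)
    for row in rows:
        by_task[row[task_key]].append(row)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    subset = []
    for task in sorted(by_task):
        group = by_task[task]
        if len(group) > cap:
            group = rng.sample(group, cap)  # sampling without replacement
        subset.extend(group)
    return subset


# Toy input: one task above the cap and one below it.
toy_rows = [{"task": "big", "id": i} for i in range(1500)]
toy_rows += [{"task": "small", "id": i} for i in range(444)]
subset = sample_per_task(toy_rows)
```

In the toy input, the larger task is cut to the cap while the smaller one is left intact, which is how a capped sample can end up with fewer rows than the number of tasks times the cap.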

License is unknown; users should verify terms of use before downloading.
