Description
A 2024 dataset from RegLab containing the queries, raw LLM outputs, and correct responses analyzed in Dahl et al., 'Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models'. It appears intended to support research into the accuracy and reliability of large language models in legal contexts. This is the public subset; additional queries are held in a separate reserve file.
Use Cases
Benchmarking LLM accuracy in legal domains based on query-response pairs.
Analyzing patterns of factual hallucination in AI-generated legal text.
Developing methods to detect and mitigate incorrect information in LLM outputs.
Training or evaluating models for legal question-answering tasks.
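As a rough illustration of the benchmarking use case, the sketch below scores LLM outputs against reference answers by normalized exact match. The field names (`query`, `llm_output`, `correct_answer`) are assumptions, since the dataset's columns are undocumented, and the toy rows are invented. Exact match is a deliberate simplification and is not necessarily the scoring method used by Dahl et al.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names; the real column names are undocumented.
    query: str
    llm_output: str
    correct_answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def hallucination_rate(records: list[Record]) -> float:
    """Fraction of records whose normalized LLM output differs from the reference answer."""
    if not records:
        raise ValueError("no records to score")
    wrong = sum(
        normalize(r.llm_output) != normalize(r.correct_answer) for r in records
    )
    return wrong / len(records)


# Invented toy rows for illustration only; not taken from the dataset.
sample = [
    Record("Which court decided case A?", "Ninth Circuit", "ninth circuit"),
    Record("In what year was case B decided?", "1987", "1992"),
]
print(hallucination_rate(sample))
```

Free-text legal answers often need a more lenient comparison (for example, matching a citation or a yes/no label), so the `normalize` step is the natural place to adapt this sketch once the actual fields are known.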
Strengths
Dataset is directly linked to a forthcoming 2024 academic publication in the Journal of Legal Analysis.
Each record contains a query, an LLM response, and a verified correct response for comparison.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
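Because the row count and column semantics are not documented, a first inspection pass after download might look like the following minimal sketch. It assumes the file is a CSV with a header row, which the listing does not confirm; the in-memory example and its column names are hypothetical stand-ins for the real file.

```python
import csv
import io


def summarize_csv(handle) -> tuple[list[str], int]:
    """Return (column names, number of data rows) for an open CSV file handle."""
    reader = csv.DictReader(handle)
    # Iterating consumes the data rows; the header is read on the first step.
    row_count = sum(1 for _ in reader)
    return list(reader.fieldnames or []), row_count


# Hypothetical in-memory stand-in for the downloaded file.
example = io.StringIO(
    "query,llm_output,correct_answer\n"
    "q1,a1,a1\n"
    "q2,a2,b2\n"
)
columns, rows = summarize_csv(example)
print(columns, rows)
```

For a real file, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the download was saved.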
Provenance
Source
RegLab (author listed as 'reglab').
Collection Method
Queries and LLM responses were collected for the academic analysis reported in Dahl et al.; further collection details are not documented.
Freshness
Last updated 2024-06-04 20:50:48; freshness should be verified.
License
License is unknown; users should verify terms before use.