Description
ReexpressAI created OpenVerification1, which it describes as the first large-scale, open-source dataset for research on LLM output verification and uncertainty quantification. The dataset, last updated on 2026-04-25, is designed for binary classification: deciding whether a model's response correctly addresses a given prompt or question.
Use Cases
Train binary classifiers for LLM output verification based on prompt-response pairs.
Research uncertainty quantification methods for model responses.
Benchmark the reliability of instruction-following in language models.
Develop methods to automatically detect incorrect or hallucinated model answers.
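As a rough illustration of the verification and uncertainty-quantification use cases above, the sketch below scores a hypothetical verifier against binary correctness labels (0/1). It assumes the verifier outputs a probability that each response is correct. The function names, the 0.5 threshold, the bin count, and the toy numbers are illustrative assumptions, not part of the dataset or its tooling.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of examples where the thresholded probability matches the 0/1 label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(pred == y) for pred, y in zip(preds, labels)) / len(labels)


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probability and the 0/1 label (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def expected_calibration_error(labels: Sequence[int], probs: Sequence[float], n_bins: int = 10) -> float:
    """Bin predictions by P(correct) and average |observed rate - mean probability| per bin,
    weighted by bin size. This is the positive-class (reliability-diagram) variant."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    total = len(labels)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        observed = sum(y for y, _ in bucket) / len(bucket)
        mean_prob = sum(p for _, p in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(observed - mean_prob)
    return ece


# Toy values for illustration only; these are NOT drawn from OpenVerification1.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7]

print("accuracy:", accuracy(toy_labels, toy_probs))
print("brier:", brier_score(toy_labels, toy_probs))
print("ece:", expected_calibration_error(toy_labels, toy_probs))
```

Accuracy alone hides overconfident mistakes, so a calibration-sensitive score such as the Brier score or a binned calibration error is a common complement when the goal is uncertainty quantification.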
Strengths
Described as the first large-scale, open-source dataset for this specific research topic.
Provides binary labels (0/1) for classifying the correctness of model responses.
Limitations
Row count and file formats are unknown, which makes it hard to assess suitability before download.
Column-level documentation is absent; field semantics must be inferred after download.
The description provided here is incomplete and refers to a fuller description on an external page.
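Because column-level documentation is absent, a first step after download is usually to inspect which fields exist and what types they hold. The sketch below does this for rows already loaded as Python dictionaries. The field names `prompt`, `response`, and `label` in the sample rows are hypothetical placeholders, since the actual columns are not documented here, and `summarize_schema` is an illustrative helper rather than part of any library.

```python
from typing import Any, Dict, Iterable, List


def summarize_schema(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report, for every field seen in any row, the Python types observed
    and how many rows are missing the field (absent or None)."""
    rows = list(rows)
    all_keys = set()
    for row in rows:
        all_keys.update(row)

    summary: Dict[str, Dict[str, Any]] = {}
    for key in sorted(all_keys):
        types = set()
        missing = 0
        for row in rows:
            value = row.get(key)
            if value is None:
                missing += 1
            else:
                types.add(type(value).__name__)
        summary[key] = {"types": sorted(types), "missing": missing}
    return summary


# Hypothetical sample rows; the real field names must be checked after download.
sample_rows: List[Dict[str, Any]] = [
    {"prompt": "What is 2 + 2?", "response": "4", "label": 1},
    {"prompt": "Capital of France?", "response": "Berlin", "label": 0},
    {"prompt": "Name a prime number.", "response": None, "label": 0},
]

for field, info in summarize_schema(sample_rows).items():
    print(field, info)
```

A field that shows mixed types or many missing values is a candidate for closer manual review before it is used as a training input or label.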
Provenance
Source
ReexpressAI via Hugging Face.
Collection Method
Method of data gathering is not specified in the provided description.
Time Range
Temporal coverage is not specified.
Freshness
Last updated 2026-04-25 17:42:54.
Geography
Spatial coverage is not specified.
License
Unknown; users must verify licensing terms before use.