BC Finance LLM Golden Set: Regression and Hallucination Test Cases

Name: BC Finance LLM Golden Set: Regression and Hallucination Test Cases
Creator: BCCard
Published: 2026-06-29T07:56:49
Keywords: Regression Testing, Financial QA, Hallucination Detection, LLM Evaluation, Tabular, Quality Assurance

Description

A collection of 550 test cases for evaluating large language models in financial contexts. It was produced by BC Card and Yonsei University DSL as part of the S2026 LLMOps project. The dataset includes 300 regression test cases, 200 financial edge cases for hallucination detection, and 50 hard negative QA samples.

Use Cases

Conduct regression testing for LLM updates based on the 300-case regression golden set.
Detect hallucinations in financial LLM outputs using the 200 edge cases.
Evaluate model robustness with hard negative samples from the 50-case QA set.
Implement a deployment gate by setting score thresholds against the provided test suites.
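The deployment gate mentioned above could be sketched roughly as follows. This is a minimal illustration, not a procedure published with the dataset: only the subset sizes (300, 200, 50) come from the description, while the subset keys, the threshold values, and the `deployment_gate` helper are assumptions made for the example.

```python
from dataclasses import dataclass

# Subset sizes are taken from the dataset description.
SUBSET_SIZES = {"regression": 300, "hallucination_edge": 200, "hard_negative_qa": 50}

# ASSUMPTION: these thresholds are illustrative only; the dataset page
# does not publish recommended pass rates.
THRESHOLDS = {"regression": 0.95, "hallucination_edge": 0.90, "hard_negative_qa": 0.80}


@dataclass
class GateResult:
    passed: bool
    scores: dict
    failures: list


def deployment_gate(pass_counts, sizes=SUBSET_SIZES, thresholds=THRESHOLDS):
    """Compare the per-subset pass rate against its threshold.

    pass_counts maps a subset name to the number of test cases the
    candidate model passed; a missing subset counts as zero passes.
    """
    scores = {}
    failures = []
    for name, total in sizes.items():
        score = pass_counts.get(name, 0) / total
        scores[name] = score
        if score < thresholds[name]:
            failures.append(name)
    return GateResult(passed=not failures, scores=scores, failures=failures)


# Hypothetical pass counts for a candidate model.
result = deployment_gate(
    {"regression": 291, "hallucination_edge": 176, "hard_negative_qa": 44}
)
print(result.passed, result.failures)
```

Keeping one threshold per subset means a strong regression score cannot mask a weak hallucination score; a single pooled score over all 550 cases would be dominated by the 300 regression cases.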

Strengths

Contains 550 total curated test cases across three specific subsets.
Includes 200 financial edge cases explicitly designed for hallucination detection.
Designed for a concrete industrial-academic collaboration (BC Card-Yonsei University S2026 LLMOps project).

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Case counts are given for each subset, but overall data size and feature details are not documented.
Description metadata is limited; actual data quality requires manual inspection after download.
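Because the columns are undocumented, a first pass after download might simply profile each field. The sketch below assumes the files can be read as CSV, which the page does not confirm; the column names and rows in the sample are made-up placeholders, and `summarize_columns` is a hypothetical helper, not part of the dataset.

```python
import csv
import io
from collections import defaultdict


def summarize_columns(rows, max_examples=3):
    """For each column, report how many cells are filled, how many
    distinct values occur, and a few example values."""
    seen = defaultdict(set)
    filled = defaultdict(int)
    examples = defaultdict(list)
    for row in rows:
        for column, value in row.items():
            # Skip missing, blank, or malformed (non-string) cells.
            if not isinstance(value, str) or not value.strip():
                continue
            filled[column] += 1
            if value not in seen[column]:
                seen[column].add(value)
                if len(examples[column]) < max_examples:
                    examples[column].append(value)
    return {
        column: {
            "filled": filled[column],
            "distinct": len(seen[column]),
            "examples": examples[column],
        }
        for column in filled
    }


# PLACEHOLDER data: the real column names are unknown until download.
sample = io.StringIO(
    "case_id,prompt,expected,subset\n"
    "r-001,placeholder prompt A,placeholder answer A,regression\n"
    "r-002,placeholder prompt B,placeholder answer B,regression\n"
    "h-001,placeholder prompt C,,edge\n"
)
summary = summarize_columns(csv.DictReader(sample))
for column, stats in summary.items():
    print(column, stats)
```

Low distinct counts usually point to label or category fields, while columns that are filled and unique on every row are candidate identifiers; either way the inferred meaning still needs manual confirmation.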

Provenance

Source: BC Card and Yonsei University DSL (industry-academia collaboration)
Collection Method: Created as an output of the S2026 LLMOps project.
Freshness: Last updated 2026-06-29 08:06:05; freshness should be verified.

Licensed under Apache 2.0.

Quality Score

Overall: D33
Description: 27
Source: 39
Reputation: 35
Access: 26

Community

1 like
0 views

Dataset Info

Author: BCCard
Created: Jun 29, 2026
Updated: Jun 29, 2026
Last synced: Jul 5, 2026
