This benchmark for grounded synthesis comprises 45 courses and more than 200 source documents. It includes line-level citation ground truth annotated by professional educators and programmatic video output expressed as React code. Pairwise expert preference votes on output quality provide a signal for reinforcement learning.
Use Cases
- Benchmarking multi-stage grounded synthesis pipelines using the source documents and structured educational content.
- Evaluating citation grounding and attribution models against the line-level source attribution.
- Training or evaluating video generation models on the programmatic React code output.
- Studying human preference signals for reinforcement learning using the pairwise expert votes.
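As an illustration of the citation-grounding use case, the sketch below scores predicted citations against gold citations with set-based precision, recall, and F1. The `(document_id, line_number)` shape and the function name `citation_prf` are assumptions made for this example; the dataset's actual citation format is undocumented and should be checked after download.

```python
from typing import Iterable, Set, Tuple

# Hypothetical citation shape: (document_id, line_number).
# This is an assumption, not the dataset's documented schema.
Citation = Tuple[str, int]


def citation_prf(
    predicted: Iterable[Citation], gold: Iterable[Citation]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for exact-match line-level citations."""
    pred_set: Set[Citation] = set(predicted)
    gold_set: Set[Citation] = set(gold)
    hits = len(pred_set & gold_set)
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    # F1 is defined as 0 when there are no correct citations.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


predicted = [("doc1", 3), ("doc1", 4), ("doc2", 10)]
gold = [("doc1", 3), ("doc2", 10), ("doc2", 11), ("doc3", 1)]
precision, recall, f1 = citation_prf(predicted, gold)
```

Exact matching is the strictest choice; a real evaluation might also credit citations that land within a few lines of the gold span.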
Strengths
- 45 courses provide a structured educational content base.
- Line-level citation ground truth is provided by professional educators.
- Pairwise human preferences from experts offer a direct RLHF signal.
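One simple way to turn pairwise votes into a per-output quality signal is a win rate. The sketch below assumes each vote can be reduced to a `(winner_id, loser_id)` pair; that layout and the name `win_rates` are hypothetical, since the dataset's vote fields are not documented.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def win_rates(votes: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute wins / comparisons for every output that appears in a vote.

    Each vote is a hypothetical (winner_id, loser_id) pair.
    """
    wins: Counter = Counter()
    comparisons: Counter = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        comparisons[winner] += 1
        comparisons[loser] += 1
    return {item: wins[item] / comparisons[item] for item in comparisons}


votes = [("a", "b"), ("a", "c"), ("b", "c")]
rates = win_rates(votes)
```

Win rates ignore the strength of the opponent in each comparison; a Bradley-Terry style model is a common alternative when outputs face uneven competition.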
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, which makes it harder to assess suitability before download.
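Because column documentation and row count are missing, a first pass after download might simply tally which fields appear, what types they hold, and how often they are populated. The sketch below assumes the records have already been loaded as a list of dictionaries; the field names in the sample rows are invented for illustration only.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set, Tuple


def summarize_fields(
    rows: Iterable[Dict[str, Any]],
) -> Tuple[int, Dict[str, Dict[str, Any]]]:
    """Return (row_count, per-field summary) for a stream of records."""
    types: Dict[str, Set[str]] = defaultdict(set)
    non_null: Dict[str, int] = defaultdict(int)
    row_count = 0
    for row in rows:
        row_count += 1
        for key, value in row.items():
            types[key].add(type(value).__name__)
            if value is not None:
                non_null[key] += 1
    fields = {
        key: {"types": sorted(types[key]), "non_null": non_null[key]}
        for key in types
    }
    return row_count, fields


# Invented sample rows; real field names must be read from the data.
sample = [
    {"course_id": "c1", "citations": [1, 2]},
    {"course_id": "c2", "citations": None},
]
row_count, fields = summarize_fields(sample)
```

A summary like this gives the row count directly and flags fields with mixed types or frequent nulls, which are good candidates for manual inspection.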
Provenance
- Source: metaphilabs
- Collection Method: Likely gathered from professional educators and synthesized into a benchmark.
- Freshness: Last updated 2026-03-14 11:04:32; freshness should be verified.