Genesis AI Code 10K is a dataset of 10,000 programming examples developed by Within Us AI. It emphasizes tests-as-truth supervision, agentic loops, and evaluation thinking. The dataset is split into 9,800 training and 200 validation examples.
Use Cases
- Training AI agents for code generation using tests-as-truth supervision patterns
- Developing agentic loops (plan→edit→test→reflect) for automated programming workflows
- Evaluating AI code generation models using the provided test patterns
- Studying tool-call trace supervision for AI agent behavior analysis
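
The plan→edit→test→reflect loop named above can be sketched in a few lines. This is a minimal illustration only, not the dataset's format or the authors' tooling: `Attempt`, `run_tests`, `agentic_loop`, and the toy proposer are all hypothetical names, and the candidate code is run with a bare `exec`, which a real workflow would replace with a sandbox.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One pass through the loop: the candidate code and what the tests said."""
    source: str
    passed: bool
    feedback: str


def run_tests(source: str, tests: str) -> Tuple[bool, str]:
    """Tests-as-truth: the candidate is accepted only if the tests run cleanly.

    NOTE: exec() is used for brevity; untrusted code needs a real sandbox.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # load the candidate implementation
        exec(tests, namespace)   # run assertions against it
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "all tests passed"


def agentic_loop(
    propose: Callable[[Optional[str]], str],
    tests: str,
    max_steps: int = 3,
) -> List[Attempt]:
    """Repeat plan/edit -> test -> reflect until the tests pass or steps run out."""
    history: List[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_steps):
        source = propose(feedback)                         # plan + edit
        passed, feedback = run_tests(source, tests)        # test
        history.append(Attempt(source, passed, feedback))  # reflect: keep the trace
        if passed:
            break
    return history


# Hypothetical stand-in for a model: a buggy draft first, then a fixed one.
_candidates = iter([
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a + b\n",
])


def toy_propose(feedback: Optional[str]) -> str:
    return next(_candidates)


TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
history = agentic_loop(toy_propose, TESTS)
for step, attempt in enumerate(history, start=1):
    print(step, attempt.passed, attempt.feedback)
```

The returned `history` doubles as a simple trace: each entry records what was tried and why it was rejected or accepted, which is the kind of signal the tool-call trace use case relies on.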
Strengths
- Dataset contains 10,000 examples, providing a substantial corpus for training
- Clear split of 9,800 training and 200 validation examples
- Focus on tests-as-truth supervision and agentic loops gives the examples a consistent, test-driven framing for evaluation
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- The stated row count (10,000; 9,800 training and 200 validation) comes from the description and should be confirmed after download
- Description metadata is limited; actual data quality requires manual inspection after download
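
Since field semantics have to be inferred after download, a first inspection pass might look like the sketch below. It assumes the records have already been loaded as a list of dictionaries. The field names in `sample_rows` (`prompt`, `tests`, `trace`) are invented for illustration and are not the dataset's actual columns.

```python
from collections import defaultdict
from typing import Any, Dict, List


def summarize_fields(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report, per field, the value types seen and how many rows lack a value."""
    keys = set()
    for row in rows:
        keys.update(row)

    types = defaultdict(set)
    missing = defaultdict(int)
    for row in rows:
        for key in keys:
            value = row.get(key)
            if value is None:
                missing[key] += 1  # absent key or explicit null
            else:
                types[key].add(type(value).__name__)

    return {
        key: {"types": sorted(types[key]), "missing": missing[key]}
        for key in sorted(keys)
    }


# Hypothetical records; replace with rows loaded from the downloaded files.
sample_rows = [
    {"prompt": "Write add(a, b)", "tests": "assert add(2, 3) == 5"},
    {"prompt": "Write mul(a, b)", "tests": None, "trace": []},
]

summary = summarize_fields(sample_rows)
for name, info in summary.items():
    print(name, info)
```

Running the same summary separately on the training and validation files is also a quick way to confirm the stated 9,800 / 200 split.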
Provenance
- Source: WithinUsAI
- Freshness: last updated 2026-01-02 04:01:34; freshness should be verified