DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

KodCode V1 Sft R1: Synthetic Programming Problems with Verifiable Solutions | DataSalon

Computer Graphics & Simulation

Name: KodCode V1 Sft R1: Synthetic Programming Problems with Verifiable Solutions
Creator: KodCode
Published: 2025-03-01T05:04:31
Keywords: Text, Algorithmic Problems, Code Generation, Software Testing, Synthetic Data, Programming Challenges

Available on 1 platform

Description

KodCode is a fully synthetic, open-source dataset for coding tasks, created by KodCode and last updated on March 17, 2025. It contains 12 distinct subsets that span domains from algorithmic problems to package-specific knowledge, and difficulty levels from basic exercises to competitive programming. The dataset is designed for supervised fine-tuning (SFT) and reinforcement learning (RL) tuning.

Use Cases

Supervised fine-tuning of code generation models (a stated purpose of the dataset).
RL tuning of programming agents (a stated purpose of the dataset).
Benchmarking model performance on algorithmic challenges across the described difficulty levels.
Training models on package-specific coding knowledge, drawing on the described domain coverage.

Strengths

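The dataset contains 12 distinct subsets.
It covers domains from algorithmic problems to package-specific knowledge, at difficulty levels from basic exercises to competitive programming.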
Solutions and tests are described as verifiable.
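
To illustrate what "verifiable" could mean in practice, here is a minimal Python sketch that runs a candidate solution against its tests. It is not KodCode's own verification pipeline. The field names `solution` and `test`, the sample record, and the helper `passes_tests` are all hypothetical, since the listing does not document the dataset's columns.

```python
def passes_tests(solution_src: str, test_src: str) -> bool:
    """Return True if every test_* function in test_src passes against solution_src.

    Note: exec runs arbitrary code. For real dataset rows, run this inside a sandbox.
    """
    namespace: dict = {}
    try:
        exec(solution_src, namespace)  # define the candidate solution
        exec(test_src, namespace)      # define the test functions
        for name, obj in list(namespace.items()):
            if name.startswith("test_") and callable(obj):
                obj()                  # a failing assert raises AssertionError
    except Exception:
        return False
    return True


# Hypothetical record; the real column names are an assumption.
record = {
    "solution": "def add(a, b):\n    return a + b\n",
    "test": "def test_add():\n    assert add(2, 3) == 5\n",
}

print(passes_tests(record["solution"], record["test"]))
```

A check like this can also serve as a reward signal for RL tuning, under the assumption that each row pairs a solution with executable tests.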

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it harder to assess suitability before download.
The dataset was last updated on 2025-03-17 07:57:30; check the source for newer revisions before use.
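
Because column documentation and row counts are missing, a first step after download is to profile the records directly. The Python sketch below assumes the rows have already been loaded as a list of dictionaries. The helper `summarize_fields` and the sample field names (`question`, `solution`, `test`) are hypothetical and only stand in for whatever columns the dataset actually has.

```python
from collections import defaultdict


def summarize_fields(rows):
    """Collect observed types, non-null counts, and one example value per field."""
    summary = defaultdict(lambda: {"types": set(), "non_null": 0, "example": None})
    for row in rows:
        for key, value in row.items():
            info = summary[key]
            if value is not None:
                info["types"].add(type(value).__name__)
                info["non_null"] += 1
                if info["example"] is None:
                    info["example"] = value
    return dict(summary)


# Hypothetical sample rows; real column names must be checked after download.
rows = [
    {"question": "Add two numbers.", "solution": "def add(a, b): return a + b", "test": None},
    {"question": "Reverse a string.", "solution": "def rev(s): return s[::-1]", "test": "assert rev('ab') == 'ba'"},
]

profile = summarize_fields(rows)
print(f"rows: {len(rows)}")
for field, info in profile.items():
    print(field, sorted(info["types"]), f"non-null: {info['non_null']}")
```

Counting rows and non-null values this way addresses the unknown row count, and the example values give a starting point for inferring what each field means.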

Provenance

Source: KodCode
Collection Method: Synthetic generation.
Freshness: Last updated 2025-03-17 07:57:30.

Related Datasets

Quality Score

D38

Access

Community

1.2K downloads

38 likes

0 views

Dataset Info

Author: KodCode
Created: Mar 1, 2025
Updated: Mar 17, 2025
Last synced: Jun 4, 2026
