Kimi-K2.5-Reasoning-1M-Cleaned: Unified Schema for Supervised Fine-Tuning

Name: Kimi-K2.5-Reasoning-1M-Cleaned: Unified Schema for Supervised Fine-Tuning
Creator: Jackrong
Published: 2026-04-17T05:44:24
Keywords: Text Generation, Text, Reasoning, Synthetic Data

by JackrongUpdated 2mo ago

Available on 1 platform

Sign in to view source links and access this dataset

Description

Jackrong created this cleaned derivative of the ianncity/KIMI-K2.5-1000000x dataset, last updated on April 17, 2026. It preserves the original four-config layout and rewrites each record into a unified reasoning-SFT schema with fields like conversations, input, output, domain, and meta. The dataset is intended for supervised fine-tuning, with the teacher model KIMI-K2.5 recorded in the metadata.

Use Cases

Supervised fine-tuning of language models based on the unified reasoning-SFT schema.
Training models on structured conversational reasoning based on the 'conversations' field.
Analyzing reasoning patterns across different domains based on the 'domain' field.
Benchmarking model performance against a known teacher model based on the 'meta.teacher_model' metadata.

Strengths

Derived from a source dataset containing 1,000,000 records.
Provides a cleaned and unified schema with fields like id, conversations, input, output, domain, and meta.
Records the specific teacher model (KIMI-K2.5) used in the metadata.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count for the cleaned version is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.

Provenance

Source: ianncity/KIMI-K2.5-1000000x
Collection Method: Cleaned derivative preserving original layout and rewritten into a unified schema.
Freshness: Last updated 2026-04-17 16:27:02; freshness should be verified.

Text Text Generation Reasoning Synthetic Data

Related Datasets

Quality Score

D37

Description

42

Source

36

Reputation

38

Access

26

Community

3 likes

0 views

Dataset Info

Author: Jackrong
Created: Apr 17, 2026
Updated: Apr 17, 2026
Last synced: Jun 28, 2026

Access

26

Community

3 likes

0 views

Dataset Info

Author: Jackrong
Created: Apr 17, 2026
Updated: Apr 17, 2026
Last synced: Jun 28, 2026

Kimi-K2.5-Reasoning-1M-Cleaned: Unified Schema for Supervised Fine-Tuning

Description

Use Cases

Strengths

Limitations

Provenance

Related Topics

Related Datasets

Quality Score

Community

Dataset Info

Community

Dataset Info