DeepSeek-V4-Pro Reasoning Traces for Model Distillation | DataSalon

Category: Media & Communication

Name: DeepSeek-V4-Pro Reasoning Traces for Model Distillation
Creator: beyoru
Published: 2026-04-24T03:51:49
Keywords: Reasoning Traces, Text, Ai Training, Model Distillation, Prompt Samples, Synthetic

Available on 1 platform

Description

500 reasoning traces and final answers generated by DeepSeek-V4-Pro (reasoning_effort=max, thinking.enabled=true). The prompts are the first 500 rows of the train split of the Jackrong/GLM-5.1-Reasoning-1M-Cleaned dataset. The dataset was created by beyoru and last updated on Hugging Face in April 2026.
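
The prompt selection described above (the first 500 rows of a split, in order) can be sketched as follows. This is a minimal illustration rather than the creator's actual script: the helper name `first_n_rows` and the toy rows are hypothetical stand-ins, and the real rows would come from the train split of Jackrong/GLM-5.1-Reasoning-1M-Cleaned.

```python
from itertools import islice
from typing import Iterable, Iterator


def first_n_rows(rows: Iterable[dict], n: int = 500) -> Iterator[dict]:
    """Return an iterator over the first n rows, in their original order."""
    return islice(rows, n)


# Toy stand-in for the source train split (hypothetical rows, not real data).
toy_split = ({"prompt": f"question {i}"} for i in range(1200))

sampled = list(first_n_rows(toy_split))
print(len(sampled), sampled[0]["prompt"], sampled[-1]["prompt"])
```

Note that this selection is a deterministic head of the split, not a random sample. With the Hugging Face `datasets` library, the same selection can be requested by slicing the split name, for example `split="train[:500]"` in `load_dataset`.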

Use Cases

Training smaller reasoning models based on the detailed reasoning traces.
Evaluating the quality and consistency of a large model's reasoning process.
Analyzing the relationship between prompt complexity and generated reasoning steps.
Creating benchmark datasets for model distillation techniques.
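
For the distillation use case, one plausible preprocessing step is to fold each trace and its final answer into a single training target. The sketch below assumes three string fields named `prompt`, `reasoning`, and `answer`, and wraps the trace in `<think>` tags. Both the field names and the tag convention are assumptions for illustration, since the listing does not document the columns.

```python
def to_sft_example(record: dict,
                   prompt_key: str = "prompt",
                   reasoning_key: str = "reasoning",
                   answer_key: str = "answer") -> dict:
    """Turn one trace record into a prompt/completion pair for fine-tuning.

    The default key names are assumptions; check the real column names
    after download and pass them in explicitly.
    """
    missing = [k for k in (prompt_key, reasoning_key, answer_key) if k not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    # Reasoning trace first, then the final answer, as one completion string.
    completion = (
        f"<think>\n{record[reasoning_key].strip()}\n</think>\n"
        f"{record[answer_key].strip()}"
    )
    return {"prompt": record[prompt_key].strip(), "completion": completion}


# Hypothetical record, not taken from the dataset.
example = to_sft_example({
    "prompt": "What is 12 * 12?",
    "reasoning": "12 * 12 = 12 * 10 + 12 * 2 = 120 + 24 = 144.",
    "answer": "144",
})
print(example["completion"])
```

Keeping the key names as parameters means the function only needs different arguments, not a rewrite, once the actual schema is known.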

Strengths

Contains 500 samples, providing a substantial preview set.
Traces are generated by a state-of-the-art model (DeepSeek-V4-Pro) with maximum reasoning effort enabled.
Prompts are sourced from a known, cleaned reasoning dataset (GLM-5.1-Reasoning-1M-Cleaned).

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is limited to 500 samples, which may not be sufficient for full-scale training.
Description metadata is limited; actual data quality requires manual inspection after download.
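
Because column documentation is absent, a first inspection pass after download might simply tabulate which fields exist and what they hold. The sketch below works on any list of dict records; the function name `summarize_fields` and the sample rows are hypothetical, and the real field names may differ.

```python
from collections import defaultdict


def summarize_fields(records: list[dict]) -> dict[str, dict]:
    """Per field: how many records contain it, value types seen, mean string length."""
    stats = defaultdict(
        lambda: {"count": 0, "types": set(), "total_len": 0, "str_count": 0}
    )
    for rec in records:
        for key, value in rec.items():
            s = stats[key]
            s["count"] += 1
            s["types"].add(type(value).__name__)
            if isinstance(value, str):
                s["total_len"] += len(value)
                s["str_count"] += 1
    return {
        key: {
            "present_in": s["count"],
            "types": sorted(s["types"]),
            # Mean length over string values only; None if the field has no strings.
            "mean_str_len": s["total_len"] / s["str_count"] if s["str_count"] else None,
        }
        for key, s in stats.items()
    }


# Hypothetical rows standing in for downloaded records.
rows = [
    {"prompt": "abcd", "reasoning": "step one, step two", "answer": "42"},
    {"prompt": "ab", "reasoning": "short", "answer": None},
]
summary = summarize_fields(rows)
for field, info in summary.items():
    print(field, info)
```

Fields with mixed types or a `present_in` count below the row count are natural candidates for the manual inspection mentioned above.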

Provenance

Source: Hugging Face dataset created by beyoru.
Collection Method: Generated by DeepSeek-V4-Pro model processing prompts from another dataset.
Freshness: Last updated 2026-04-24 06:50:40; freshness should be verified.

Quality Score

C42

Community

6 downloads

2 likes

0 views

Dataset Info

Author: beyoru
Created: Apr 24, 2026
Updated: Apr 24, 2026
Last synced: Apr 27, 2026
