Description
Hunter-Alpha-SFT-300000x is a dataset of 308,000 reasoning traces distilled from the Hunter Alpha model via OpenRouter. The dataset contains 1.2 billion tokens and is distributed across mathematics (30%), coding (30%), science (15%), computer science (15%), and creative writing (10%). The dataset was created by ansulev and last updated on March 26, 2026.
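The stated totals and percentages imply approximate per-domain trace counts and an average trace length. The following is a minimal sketch of that arithmetic, assuming the percentages apply to trace counts rather than token counts (the description does not say which); the function and variable names are illustrative only.

```python
# Figures taken from the dataset description.
TOTAL_TRACES = 308_000
TOTAL_TOKENS = 1_200_000_000

# Domain shares in whole percent, as stated in the description.
DOMAIN_SHARES = {
    "mathematics": 30,
    "coding": 30,
    "science": 15,
    "computer science": 15,
    "creative writing": 10,
}


def estimate_domain_counts(total, shares_percent):
    """Split a total into per-domain counts using integer percentages.

    Assumption: shares refer to trace counts, not token counts.
    """
    if sum(shares_percent.values()) != 100:
        raise ValueError("domain shares must sum to 100 percent")
    return {domain: total * pct // 100 for domain, pct in shares_percent.items()}


domain_counts = estimate_domain_counts(TOTAL_TRACES, DOMAIN_SHARES)
avg_tokens_per_trace = TOTAL_TOKENS / TOTAL_TRACES

for domain, count in domain_counts.items():
    print(f"{domain}: about {count:,} traces")
print(f"average length: about {avg_tokens_per_trace:,.0f} tokens per trace")
```

Under that assumption, mathematics and coding each account for roughly 92,400 traces, and the average trace runs to roughly 3,900 tokens. These are estimates derived from the description, not measured values.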
Use Cases
Fine-tuning language models for mathematical problem-solving based on the algebra, calculus, and probability traces.
Training models on code generation and understanding based on traces covering web development, C++, Java, and other languages.
Improving model reasoning in scientific domains using traces from physics, chemistry, and biology.
Enhancing creative writing capabilities in language models based on the dedicated creative writing portion.
Strengths
Contains 308,000 individual reasoning traces.
Comprises 1.2 billion tokens of training data.
Covers a documented topic distribution: mathematics (30%), coding (30%), science (15%), computer science (15%), and creative writing (10%).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported in the dataset metadata; the 308,000-trace figure comes from the description and should be verified after download.
Description metadata is limited; actual data quality requires manual inspection after download.
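Because the columns are undocumented, a first inspection step after download is to summarize which fields appear and what types they hold. The following is a minimal sketch, assuming the rows have already been loaded as a list of dictionaries. The field names in the sample (`prompt`, `reasoning`, `answer`) are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import defaultdict


def infer_fields(records):
    """Summarize the fields seen across records.

    Returns a dict mapping each field name to the sorted list of Python
    type names observed for it and the number of records containing it.
    """
    field_types = defaultdict(set)
    field_counts = defaultdict(int)
    for record in records:
        for key, value in record.items():
            field_types[key].add(type(value).__name__)
            field_counts[key] += 1
    return {
        key: {"types": sorted(field_types[key]), "present_in": field_counts[key]}
        for key in field_types
    }


# Hypothetical sample rows; the real column names are not documented.
sample = [
    {"prompt": "What is 2 + 2?", "reasoning": "Add the two numbers.", "answer": "4"},
    {"prompt": "Reverse a string.", "reasoning": "Use slicing.", "answer": None},
]

summary = infer_fields(sample)
for name, info in summary.items():
    print(name, info)
```

Running a summary like this on a few thousand rows shows which fields are always present and which contain mixed or missing values, which helps when deciding how to map the data onto a fine-tuning format.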
Provenance
Source
Distilled from the Hunter Alpha model on OpenRouter.
Collection Method
Not documented in detail; the traces are likely model-generated outputs collected for supervised fine-tuning.
Freshness
Last updated 2026-03-26 01:12:28; freshness should be verified.