Description
Ascend-CoT-v3-json is a supervised fine-tuning (SFT) dataset for training models to develop custom operators on the Ascend C and CANN platforms. It contains cleaned Chain-of-Thought-style samples covering kernel implementation, tiling logic, API usage, debugging, and operator-development reasoning. The dataset, created by AscendKernelGen, is organized into two final SFT subsets within a single repository.
Use Cases
Fine-tuning language models for Ascend C kernel implementation based on CoT-style samples.
Training models to generate or reason about tiling logic for custom operators.
Developing AI assistants for debugging Ascend C and CANN API code.
Enhancing models' understanding of operator-development workflows and reasoning.
Strengths
Dataset targets supervised fine-tuning specifically, so its intended training use is clear.
Data is described as 'cleaned', suggesting some level of quality control.
Release is organized into two distinct SFT subsets, providing a clear structure.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it hard to judge whether the dataset is large enough for a given fine-tuning run.
Description metadata is limited; actual data quality requires manual inspection after download.
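Since the row count and field semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes the files are either a JSON array or JSON Lines of objects, which the "-json" suffix suggests but the listing does not confirm. The field names in SAMPLE (instruction, output, meta) are hypothetical placeholders, not the dataset's actual schema.

```python
import json
from collections import Counter


def load_records(text):
    """Parse either a JSON array or JSON Lines text into a list of dicts."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        data = json.loads(stripped)
    else:
        data = [json.loads(line) for line in stripped.splitlines() if line.strip()]
    # Keep only object rows; anything else cannot carry named fields.
    return [row for row in data if isinstance(row, dict)]


def summarize(records):
    """Return the row count plus, per field, how many rows carry it and its value types."""
    presence = Counter()
    types = {}
    for row in records:
        for key, value in row.items():
            presence[key] += 1
            types.setdefault(key, set()).add(type(value).__name__)
    return {
        "rows": len(records),
        "fields": {
            key: {"present_in": presence[key], "types": sorted(types[key])}
            for key in sorted(presence)
        },
    }


# Hypothetical two-row sample; the real field names are not documented.
SAMPLE = (
    '{"instruction": "Write a tiling function", "output": "..."}\n'
    '{"instruction": "Fix this kernel", "output": "...", "meta": {"topic": "debug"}}\n'
)

if __name__ == "__main__":
    print(json.dumps(summarize(load_records(SAMPLE)), indent=2))
```

To apply this to a downloaded file, read its contents as text and pass them to load_records. Fields that appear in only some rows, or with mixed value types, are the ones most worth inspecting by hand.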
Provenance
Source
AscendKernelGen
Collection Method
Likely created as part of the research for the paper 'AscendKernelGen: A Systematic Study of LLM-Based Kernel Generation for Neural…'.
Freshness
Last updated 2026-06-03 14:01:36; freshness should be verified.
License is unknown; users should verify permissions before use.