Description
June 2026 submissions from 8 frontier coding models, including Claude Opus 4.8 and GPT-5.5, autonomously writing CUDA/Triton GPU kernels. Each model had one unlimited-time run per problem to write the fastest kernel for an NVIDIA RTX PRO 6000 Blackwell GPU, graded as peak_fraction of the hardware roofline. The dataset was created by Infatoshi and hosted on Hugging Face.
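The description does not define how peak_fraction is computed. As a rough sketch only, assuming it follows the classic roofline model (attainable throughput is the smaller of the compute peak and arithmetic intensity times memory bandwidth), the metric could be computed as below. The function names and all numbers are illustrative placeholders, not values from the dataset or real hardware specifications.

```python
def roofline_limit(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Attainable FLOP/s for a kernel under the classic roofline model."""
    # Arithmetic intensity: useful FLOPs per byte of memory traffic.
    arithmetic_intensity = flops / bytes_moved
    # The kernel is capped by either the compute peak or the bandwidth peak.
    return min(peak_flops_per_s, arithmetic_intensity * peak_bytes_per_s)


def peak_fraction(flops, bytes_moved, runtime_s, peak_flops_per_s, peak_bytes_per_s):
    """Achieved FLOP/s divided by the roofline limit (assumed definition)."""
    achieved_flops_per_s = flops / runtime_s
    limit = roofline_limit(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s)
    return achieved_flops_per_s / limit


if __name__ == "__main__":
    # Placeholder workload and hardware peaks, chosen only for easy arithmetic.
    fraction = peak_fraction(
        flops=2e9,              # hypothetical FLOP count of the problem
        bytes_moved=1e9,        # hypothetical memory traffic in bytes
        runtime_s=2e-3,         # hypothetical measured kernel time
        peak_flops_per_s=100e12,
        peak_bytes_per_s=1e12,
    )
    print(f"peak_fraction = {fraction:.3f}")
```

Under this assumed definition a value near 1.0 means the kernel runs close to the hardware limit for its arithmetic intensity. The dataset's actual formula should be confirmed against its source before comparing numbers.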
Use Cases
Benchmarking AI models on low-level GPU kernel generation in an autonomous coding setting.
Analyzing how kernel optimization efficiency, as measured by peak_fraction, varies across models.
Comparing the code quality and speed of frontier models such as Claude Opus 4.8 and GPT-5.5 on identical problems and hardware.
Studying autonomous agent behavior on performance-critical programming tasks under the single-run, unlimited-time setup.
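A cross-model comparison of the kind listed above could start from a per-model average of the grading metric. The sketch below assumes each record exposes fields named `model`, `problem`, and `peak_fraction`; these names are hypothetical, since the dataset's columns are undocumented, and the sample rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_peak_fraction_by_model(rows):
    """Average peak_fraction per model over all problems in `rows`."""
    scores = defaultdict(list)
    for row in rows:
        # Field names are assumptions; adjust to the real schema.
        scores[row["model"]].append(row["peak_fraction"])
    return {model: mean(values) for model, values in scores.items()}


if __name__ == "__main__":
    # Made-up records with placeholder model and problem names.
    sample_rows = [
        {"model": "model-a", "problem": "p1", "peak_fraction": 0.25},
        {"model": "model-a", "problem": "p2", "peak_fraction": 0.75},
        {"model": "model-b", "problem": "p1", "peak_fraction": 0.5},
    ]
    for model, score in sorted(mean_peak_fraction_by_model(sample_rows).items()):
        print(model, score)
```

A plain mean treats every problem equally. With one run per model per problem, a median or a per-problem ranking may be more robust to a single outlier kernel.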
Strengths
Features submissions from 8 named frontier AI models, providing a direct comparison point.
Benchmarks performance on a specific, real hardware target (NVIDIA RTX PRO 6000 Blackwell SM120).
Uses a clear, quantitative grading metric (peak_fraction of hardware roofline).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it hard to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
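Because the row count and column semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes the records have already been loaded as a list of dictionaries (for example from JSON Lines); the field names in the sample are hypothetical.

```python
from collections import defaultdict


def summarize_columns(rows):
    """Return the row count and, per column, the sorted type names observed."""
    types_seen = defaultdict(set)
    for row in rows:
        for key, value in row.items():
            types_seen[key].add(type(value).__name__)
    return len(rows), {key: sorted(names) for key, names in types_seen.items()}


if __name__ == "__main__":
    # Made-up records standing in for the downloaded data.
    sample_rows = [
        {"model": "model-a", "peak_fraction": 0.5},
        {"model": "model-b", "peak_fraction": None},
    ]
    row_count, columns = summarize_columns(sample_rows)
    print("rows:", row_count)
    for name, type_names in columns.items():
        print(name, type_names)
```

Columns that report more than one type, such as a mix of float and NoneType, are a quick signal of missing or failed submissions worth checking by hand.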
Provenance
Source
Hugging Face user Infatoshi.
Collection Method
Autonomous generation by AI models on the KernelBench-Hard platform.
Time Range
Submissions generated in June 2026.
Freshness
Last updated 2026-06-15 22:15:16; freshness should be verified.
License
Unknown; terms of use must be verified before use.