Name: KernelBench-Hard: Frontier AI Model GPU Kernel Submissions
Creator: Infatoshi
Published: 2026-06-15T22:15:12
Keywords: Triton, CUDA, Performance Benchmark, AI Agents, Tabular, GPU Kernels

Description

June 2026 submissions from 8 frontier coding models, including Claude Opus 4.8 and GPT-5.5, each autonomously writing CUDA/Triton GPU kernels. Each model had one unlimited-time run per problem to write the fastest kernel for an NVIDIA RTX PRO 6000 Blackwell GPU, and each submission is graded as peak_fraction of the hardware roofline. The dataset was created by Infatoshi and is hosted on Hugging Face.

Use Cases

Benchmarking AI models on low-level GPU kernel generation in an autonomous coding task.
Analyzing the relationship between model architecture and kernel optimization efficiency, as measured by the peak_fraction metric.
Comparing the code quality and speed of kernels from different frontier models, such as Claude Opus 4.8 and GPT-5.5, on identical hardware problems.
Studying autonomous agent behavior in constrained, performance-critical programming environments, given the one-run, unlimited-time setup.

Strengths

Features submissions from 8 named frontier AI models, providing a direct comparison point.
Benchmarks performance on a specific, real hardware target (NVIDIA RTX PRO 6000 Blackwell SM120).
Uses a clear, quantitative grading metric (peak_fraction of hardware roofline).
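
The card does not document exactly how peak_fraction is computed. As a rough illustration only, the sketch below assumes the classic roofline model: attainable throughput is the lesser of peak compute and memory bandwidth times arithmetic intensity, and the score is achieved throughput divided by that ceiling. The function names and hardware figures are hypothetical placeholders, not the benchmark's code or the real specifications of the RTX PRO 6000.

```python
def roofline_ceiling(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Attainable FLOP/s under the classic roofline model (assumed here)."""
    intensity = flops / bytes_moved  # arithmetic intensity in FLOP per byte
    return min(peak_flops, peak_bandwidth * intensity)


def peak_fraction(flops, bytes_moved, runtime_s, peak_flops, peak_bandwidth):
    """Achieved FLOP/s as a fraction of the roofline ceiling (assumed definition)."""
    achieved = flops / runtime_s
    return achieved / roofline_ceiling(flops, bytes_moved, peak_flops, peak_bandwidth)


# Placeholder numbers chosen for easy arithmetic, not real hardware specs.
example = peak_fraction(
    flops=2e9,            # work done by the kernel
    bytes_moved=1e9,      # memory traffic, so intensity is 2 FLOP/byte
    runtime_s=2e-3,       # measured kernel time
    peak_flops=100e12,    # hypothetical compute peak
    peak_bandwidth=1e12,  # hypothetical memory bandwidth in bytes/s
)
print(round(example, 3))
```

In this memory-bound example the ceiling is 2e12 FLOP/s and the kernel achieves 1e12 FLOP/s, so the score is about 0.5. The dataset's actual metric may differ and should be checked against its own documentation.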

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
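
Because column semantics are undocumented, a first pass after download might simply list which fields exist and what value types they hold. The sketch below assumes the rows have already been loaded as a list of dictionaries; the column names in the sample are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def summarize_columns(rows):
    """Map each column name to the sorted list of Python type names seen in it."""
    seen = defaultdict(set)
    for row in rows:
        for key, value in row.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical sample rows; the real field names must be read from the data.
sample = [
    {"model": "model-a", "peak_fraction": 0.42},
    {"model": "model-b", "peak_fraction": None},
]
summary = summarize_columns(sample)
print(summary)
```

A column that reports more than one type, such as a float mixed with None, is a quick signal that missing values or failed runs need handling before analysis.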

Provenance

Source: Hugging Face user Infatoshi.
Collection Method: Autonomous generation by AI models on the KernelBench-Hard platform.
Time Range: Submissions generated in June 2026.
Freshness: Last updated 2026-06-15 22:15:16; freshness should be verified.

License is unknown; terms of use must be verified before application.
