Available on 1 platform
This dataset contains raw evaluation metrics, execution telemetry logs, and structural syntax outputs from running the Mostly Basic Python Problems (MBPP) benchmark against the StarCoder 15B base model. This partition documents the scaling limits of unaligned base-model weights when evaluated in conversational benchmarking loops. The dataset was authored by ShahzebKhoso and last updated on 2026-05-28.
The license is unknown; verify the terms of use before using the data.