This dataset contains raw evaluation metrics and execution telemetry logs from running the Mostly Basic Python Problems (MBPP) benchmark against the Qwen3 8B dense foundation model. It documents the model's zero-shot performance at synthesizing functionally correct Python programs under standard local execution bounds. It was authored by ShahzebKhoso and last updated on May 29, 2026.
The license is unknown; verify the terms of use before using the dataset.