Local Code Arena MBPP Qwen3 8B: Benchmark Telemetry for Code Generation

Name: Local Code Arena MBPP Qwen3 8B: Benchmark Telemetry for Code Generation
Creator: ShahzebKhoso
Published: 2026-05-23T17:50:57
Keywords: Benchmark Evaluation, LLM Telemetry, Benchmark, Tabular, Code Generation, Python Problems

Description

Raw evaluation metrics and execution telemetry logs from running the Mostly Basic Python Problems (MBPP) benchmark against the Qwen3 8B dense foundation model. The dataset documents zero-shot functional programming synthesis performance under standard local execution bounds. It was authored by ShahzebKhoso and last updated on May 29, 2026.

Use Cases

Analyze code generation model performance based on benchmark evaluation metrics
Study execution telemetry for model inference under local bounds
Compare zero-shot functional programming synthesis results across different models
Investigate structural syntax outputs from code generation tasks

Strengths

Captures raw evaluation metrics from a standardized benchmark (MBPP)
Includes execution telemetry logs for model inference analysis
Documents performance of a specific 8-billion parameter model (Qwen3 8B)

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which may limit suitability assessment
Description metadata is limited; actual data quality requires manual inspection after download
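
Because the columns are undocumented and the row count is unknown, a first pass after download could simply count rows and guess a type for each column. The following is a minimal sketch that assumes the telemetry is exported as CSV, which the listing does not confirm. The column names in `SAMPLE` (`task_id`, `passed`, `latency_s`, `error`) and the helpers `guess_type` and `summarize_columns` are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import Counter


def guess_type(value: str) -> str:
    """Classify one raw cell as 'empty', 'bool', 'int', 'float', or 'str'."""
    text = value.strip()
    if text == "":
        return "empty"
    if text.lower() in {"true", "false"}:
        return "bool"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "str"


def summarize_columns(handle):
    """Return (row_count, {column: Counter of guessed cell types}) for a CSV stream."""
    reader = csv.DictReader(handle)
    summary = {name: Counter() for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in summary:
            # Short rows yield None for missing cells; treat those as empty.
            summary[name][guess_type(row.get(name) or "")] += 1
    return rows, summary


# Hypothetical sample: the real file's columns are undocumented.
SAMPLE = (
    "task_id,passed,latency_s,error\n"
    "1,true,0.82,\n"
    "2,false,1.40,SyntaxError\n"
)

if __name__ == "__main__":
    rows, summary = summarize_columns(io.StringIO(SAMPLE))
    print(f"rows: {rows}")
    for name, counts in summary.items():
        print(name, dict(counts))
```

To run this against the real download, replace `io.StringIO(SAMPLE)` with an open file handle. A column whose cells split between `empty` and `str`, like the hypothetical `error` field here, is a hint that it may be populated only for failed runs, which is the kind of field semantics the listing leaves to the reader.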

Provenance

Source: Hugging Face
Collection Method: Likely contains telemetry from running the MBPP benchmark against the Qwen3 8B model.
Freshness: Last updated 2026-05-29 06:18:54

License is unknown; terms of use must be verified before application.

Quality Score

Overall: D39
Description: 46
Source: 36
Reputation: 41
Access: 26

Community

Downloads: 43
Likes: 1
Views: 0

Dataset Info

Author: ShahzebKhoso
Created: May 23, 2026
Updated: May 29, 2026
Last synced: Jun 6, 2026
