Local Code Arena Starcoder2 15B: MBPP Benchmark Telemetry

Creator: ShahzebKhoso
Published: 2026-05-28T14:25:25
Keywords: Benchmark Evaluation, LLM Telemetry, Benchmark, Tabular, Code Generation, Python Problems

Description

This dataset, published by ShahzebKhoso, contains raw evaluation metrics, execution telemetry logs, and structural syntax outputs from running the Mostly Basic Python Problems (MBPP) benchmark against the StarCoder2 15B base model. The telemetry was captured from conversational evaluation loops and is intended as a baseline for unaligned base-model weights. It was last updated on May 28, 2026.

Use Cases

Benchmarking code generation models based on MBPP problem-solving performance.
Analyzing execution telemetry logs to understand model behavior during evaluation.
Studying structural syntax outputs for error patterns in generated code.
Establishing baseline performance for unaligned foundational language model weights.
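
One common way to summarize MBPP results is pass@k, the estimated probability that at least one of k sampled completions for a task passes its tests. The sketch below assumes the telemetry can be reduced to one record per generated sample with a task identifier and a pass/fail flag. The field names `task_id` and `passed` are hypothetical, since this dataset's columns are undocumented, and the dataset may hold only one sample per task, in which case only k=1 is meaningful.

```python
from collections import defaultdict
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: n samples drawn, c of them passed."""
    if k > n:
        raise ValueError("k cannot exceed the number of samples n")
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(records, k=1):
    """Average pass@k over tasks.

    `records` is an iterable of dicts with the hypothetical keys
    "task_id" and "passed"; the real column names are not documented.
    """
    totals = defaultdict(lambda: [0, 0])  # task_id -> [samples, passes]
    for row in records:
        counts = totals[row["task_id"]]
        counts[0] += 1
        counts[1] += int(bool(row["passed"]))
    if not totals:
        raise ValueError("no records supplied")
    return sum(pass_at_k(n, c, k) for n, c in totals.values()) / len(totals)
```

With k=1 this reduces to the mean per-task pass rate, which is the figure most often quoted as a base-model baseline.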

Strengths

Focuses on a specific, widely-used benchmark (MBPP) for Python code generation.
Captures multiple telemetry data types: raw metrics, execution logs, and syntax outputs.
Provides a baseline for StarCoder2 15B, an openly released code foundation model.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
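
Because the schema and row count are undocumented, a first pass over the downloaded file can at least recover column names, observed value types, and missing-value counts. The following is a minimal sketch that assumes the rows have already been loaded as dictionaries (for example from a CSV or JSON Lines file); `summarize_columns` is an illustrative helper, not part of any library, and it makes no assumption about the dataset's real field names.

```python
from collections import Counter, defaultdict


def summarize_columns(rows):
    """Return (row_count, {column: {"types": Counter, "missing": int}}).

    A value counts as missing when it is None or an empty string.
    Columns absent from a given row are not counted for that row.
    """
    summary = defaultdict(lambda: {"types": Counter(), "missing": 0})
    row_count = 0
    for row in rows:
        row_count += 1
        for column, value in row.items():
            if value is None or value == "":
                summary[column]["missing"] += 1
            else:
                summary[column]["types"][type(value).__name__] += 1
    return row_count, dict(summary)
```

A column whose `types` counter holds more than one type name, or whose `missing` count is high, is a reasonable place to start the manual inspection.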

Provenance

Source: Author ShahzebKhoso on Hugging Face.
Collection Method: Telemetry captured from running the MBPP benchmark against the StarCoder2 15B model.
Freshness: Last updated 2026-05-28 14:27:28; freshness should be verified.

License: Unknown, which may restrict usage.

Quality Score

Overall: D (39)
Description: 46
Source: 36
Reputation: 40
Access: 26

Community

26 downloads

1 like

0 views

Dataset Info

Author: ShahzebKhoso
Created: May 28, 2026
Updated: May 28, 2026
Last synced: Jun 6, 2026
