Local Code Arena Starcoder2 7B: MBPP Benchmark Telemetry

Name: Local Code Arena Starcoder2 7B: MBPP Benchmark Telemetry
Creator: ShahzebKhoso
Published: 2026-05-28T13:33:59
Keywords: Benchmark Evaluation, LLM Telemetry, Benchmark, Tabular, Code Generation, Python Problems

Description

Raw evaluation metrics, execution telemetry logs, and structural syntax outputs from running the Mostly Basic Python Problems (MBPP) benchmark against the StarCoder2 7B base model. The dataset documents how a mid-tier foundation model behaves in automated, conversation-style evaluation workflows. It was authored by ShahzebKhoso and last updated on May 28, 2026.
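
For orientation, one MBPP-style execution check can be sketched as follows. This is a minimal illustration, not the harness used to produce this dataset: the function name `run_mbpp_check`, the outcome labels, and the record fields (`outcome`, `error`, `duration_s`) are all hypothetical. It relies only on the fact that MBPP problems ship with assert-style test cases.

```python
import time


def run_mbpp_check(candidate_code: str, test_asserts: list[str]) -> dict:
    """Run one candidate solution against assert-style tests.

    Returns a small telemetry record. The field names are illustrative
    assumptions, not the schema of this dataset.
    """
    namespace: dict = {}
    outcome, error = "passed", None
    start = time.perf_counter()
    try:
        # Define the candidate's functions, then run each test in the same namespace.
        # NOTE: exec on model output is unsafe; a real harness would sandbox this.
        exec(candidate_code, namespace)
        for test in test_asserts:
            exec(test, namespace)
    except SyntaxError as exc:
        # The candidate (or a test) does not parse as Python.
        outcome, error = "syntax_error", str(exc)
    except AssertionError:
        # The code ran, but a test expectation was not met.
        outcome, error = "failed", "assertion failed"
    except Exception as exc:
        # Any other exception raised while defining or calling the candidate.
        outcome, error = "runtime_error", f"{type(exc).__name__}: {exc}"
    duration_s = time.perf_counter() - start
    return {"outcome": outcome, "error": error, "duration_s": duration_s}


if __name__ == "__main__":
    record = run_mbpp_check(
        "def add(a, b):\n    return a + b",
        ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"],
    )
    print(record["outcome"])
```

Separating syntax errors from failed assertions and runtime errors is one plausible reading of "structural syntax outputs"; the actual categories have to be confirmed against the downloaded files.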

Use Cases

Analyze StarCoder2 7B performance on basic Python problems using the MBPP benchmark results.
Study the execution telemetry and behavioral dynamics of a base foundation model.
Compare structural syntax outputs across automated conversational evaluation workflows.
Benchmark StarCoder2 7B against other code generation models.
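
When comparing code generation models, results are often summarized as pass@k. The sketch below implements the widely used unbiased estimator, 1 - C(n - c, k) / C(n, k), for a single problem with n sampled solutions of which c pass. Whether this dataset records multiple samples per problem, or reports pass@k at all, is not stated in the listing, so treat this as an assumption about how the telemetry might be aggregated.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled solutions, c: number that passed, k: sample budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts: three problems, 10 samples each.
    print(mean_pass_at_k([(10, 5), (10, 0), (10, 10)], k=1))
```

With a single sample per problem (n = k = 1), the estimator reduces to the plain fraction of problems solved.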

Strengths

Captures raw evaluation metrics, execution telemetry logs, and structural syntax outputs.
Documents behavioral dynamics of a modern, mid-tier foundational model (StarCoder2 7B).
Focuses on a standardized benchmark (Mostly Basic Python Problems).

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
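
Because the row count and column semantics are undocumented, a first pass after download might simply profile each field. The sketch below assumes the records have already been loaded as a list of dictionaries (for example from CSV or JSON Lines; the actual file format is not stated). The function name `profile_columns` and the field names in the usage example are hypothetical.

```python
from collections import Counter


def profile_columns(rows: list[dict]) -> dict[str, dict]:
    """Summarize each column: missing count, value types, distinct values, one example."""
    columns = sorted({key for row in rows for key in row})
    profile: dict[str, dict] = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing.
        present = [v for v in values if v is not None and v != ""]
        profile[col] = {
            "missing": len(values) - len(present),
            "types": dict(Counter(type(v).__name__ for v in present)),
            "distinct": len({repr(v) for v in present}),
            "example": present[0] if present else None,
        }
    return profile


if __name__ == "__main__":
    # Hypothetical rows; the real field names must be read from the download.
    rows = [
        {"task_id": 1, "passed": True, "error": ""},
        {"task_id": 2, "passed": False, "error": "AssertionError"},
    ]
    print(f"rows: {len(rows)}")
    for name, stats in profile_columns(rows).items():
        print(name, stats)
```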

Provenance

Source: Hugging Face
Collection Method: Telemetry captured from running the MBPP benchmark against the StarCoder2 7B model.
Freshness: Last updated 2026-05-28 13:38:13; freshness should be verified.

License: unknown; terms of use must be verified before the dataset is used.

Related Topics

Tabular, Benchmark Evaluation, LLM Telemetry, Benchmark, Code Generation, Python Problems

Quality Score

Overall: D (39)
Description: 46
Source: 36
Reputation: 40
Access: 26

Community

Downloads: 24
Likes: 1
Views: 0

Dataset Info

Author: ShahzebKhoso
Created: May 28, 2026
Updated: May 28, 2026
Last synced: Jun 6, 2026