Local Code Arena MBPP: Qwen 2.5 Coder 3B Evaluation Telemetry | DataSalon

Transportation & Mobility

Name: Local Code Arena MBPP: Qwen 2.5 Coder 3B Evaluation Telemetry
Creator: ShahzebKhoso
Published: 2026-05-23T12:00:30
Keywords: Benchmark Evaluation, Benchmark, Tabular, Code Generation, Model Telemetry, Python Problems

Available on 1 platform

Description

ShahzebKhoso's repository hosts raw evaluation metrics and execution telemetry logs from running the Mostly Basic Python Problems (MBPP) benchmark against the 3B-parameter Qwen 2.5 Coder model. The evaluation is designed to chart the transition between hyper-lightweight edge models and larger desktop-class variants. It was last updated on 2026-05-23.

Use Cases

Analyzing model performance on the basic Python problems in the MBPP benchmark.
Comparing efficiency metrics between model size classes, around the edge-to-desktop transition point the evaluation targets.
Studying execution telemetry and structural syntax outputs from code generation models.
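
As a rough illustration of the first two use cases, the sketch below computes a pass rate and a mean latency from per-task results. The field names (`task_id`, `passed`, `latency_s`) are hypothetical: the dataset's columns are undocumented, so they would need to be mapped to whatever the downloaded files actually contain.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    # Hypothetical fields; the dataset's real schema is undocumented.
    task_id: int
    passed: bool
    latency_s: float


def summarize(results):
    """Return the pass rate and mean latency over a list of TaskResult."""
    if not results:
        raise ValueError("no results to summarize")
    # Booleans sum as 0/1, so this is the fraction of tasks that passed.
    pass_rate = sum(r.passed for r in results) / len(results)
    mean_latency = mean(r.latency_s for r in results)
    return {"pass_rate": pass_rate, "mean_latency_s": mean_latency}


# Synthetic records for illustration only, not taken from the dataset.
demo = [
    TaskResult(task_id=1, passed=True, latency_s=1.0),
    TaskResult(task_id=2, passed=False, latency_s=3.0),
]
print(summarize(demo))
```

Running the same summary over results from models of different sizes would give one simple basis for the size-class comparison described above.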

Strengths

Evaluation focuses on a specific model size (3B parameters) to isolate efficiency inflection points.
Data includes raw evaluation metrics, execution telemetry logs, and structural syntax outputs.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count and dataset scale are unknown, which may limit suitability assessment.
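
Both limitations can be partly worked around after download by profiling the files directly. The sketch below assumes the data is available as CSV, which the listing does not confirm (it only says "Tabular"); the sample columns are invented stand-ins, not the dataset's real fields.

```python
import csv
import io
from collections import Counter


def guess_type(value: str) -> str:
    """Classify one raw cell as 'empty', 'int', 'float', or 'str'."""
    text = value.strip()
    if text == "":
        return "empty"
    for caster, label in ((int, "int"), (float, "float")):
        try:
            caster(text)
            return label
        except ValueError:
            continue
    return "str"


def profile_csv(handle) -> dict:
    """Return the row count and per-column type tallies for a CSV stream."""
    reader = csv.DictReader(handle)
    columns = {name: Counter() for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            # Short rows yield None for missing cells; treat them as empty.
            columns[name][guess_type(row.get(name) or "")] += 1
    return {"rows": rows, "columns": {k: dict(v) for k, v in columns.items()}}


# Synthetic stand-in data; these column names are invented for illustration.
sample = io.StringIO("task_id,passed,latency_s\n11,1,0.42\n12,0,\n13,1,0.37\n")
report = profile_csv(sample)
print("rows:", report["rows"])
for name, kinds in report["columns"].items():
    print(name, kinds)
```

For a real file, the `io.StringIO` stand-in would be replaced with `open(path, newline="")`. The type tallies only hint at field semantics; what each column means still has to be inferred from its values.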

Provenance

Source: Hugging Face
Collection Method: Telemetry logs from running the MBPP benchmark on the Qwen 2.5 Coder 3B model.
Freshness: Last updated 2026-05-23 12:01:56.

License is unknown; terms of use must be verified before application.

Quality Score

D39

Community

36 downloads

1 like

0 views

Dataset Info

Author: ShahzebKhoso
Created: May 23, 2026
Updated: May 23, 2026
Last synced: Jun 6, 2026
