Local Code Arena Starcoder 15B: MBPP Benchmark Telemetry

Name: Local Code Arena Starcoder 15B: MBPP Benchmark Telemetry
Creator: ShahzebKhoso
Published: 2026-05-28T12:43:47
Keywords: Benchmark Evaluation, LLM Telemetry, Benchmark, Tabular, Code Generation, Large Scale, Python Problems

Description

Raw evaluation metrics, execution telemetry logs, and structural syntax outputs from running the Mostly Basic Python Problems (MBPP) benchmark against the StarCoder 15B base model. This partition documents the scaling limits of unaligned foundation-model weights when they are run in conversational benchmarking loops.

Use Cases

Benchmarking code generation models based on execution telemetry logs
Analyzing structural syntax outputs from foundational models
Establishing baselines for unaligned model weights in conversational loops
Studying scaling limits of heavyweight foundational models

Strengths

Contains raw evaluation metrics from a specific benchmark (MBPP)
Focuses on a heavyweight foundational model (StarCoder 15B)
Captures execution telemetry logs and structural syntax outputs

Limitations

Description metadata is limited; actual data quality requires manual inspection after download
Row count is unknown, which may limit suitability assessment
Column-level documentation is absent; field semantics must be inferred after download
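
Because the row count and column semantics are undocumented, a first pass after download could simply count rows and tally the kinds of values each column holds. The following is a minimal sketch of that idea, assuming the data can be read as CSV; the function names `infer_type` and `profile_csv`, and every column name in `SAMPLE`, are hypothetical placeholders and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def infer_type(value):
    """Classify one raw CSV cell as 'empty', 'int', 'float', or 'str'."""
    if value is None or value.strip() == "":
        return "empty"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            pass
    return "str"


def profile_csv(handle):
    """Return (row_count, {column: Counter of inferred cell types})."""
    reader = csv.DictReader(handle)
    profile = {name: Counter() for name in (reader.fieldnames or [])}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in profile:
            profile[name][infer_type(row.get(name))] += 1
    return row_count, profile


# Hypothetical three-row sample; the real column names are undocumented.
SAMPLE = """task_id,passed,latency_s,error
1,1,0.52,
2,0,1.75,SyntaxError
3,1,,
"""

rows, columns = profile_csv(io.StringIO(SAMPLE))
print("rows:", rows)
for name, kinds in columns.items():
    print(name, dict(kinds))
```

To profile a downloaded file instead of the sample, pass an open file handle, for example `profile_csv(open(path, newline=""))`, where `path` points at the local copy. A column that mixes `str` and `empty` cells may be an optional error field, but that kind of reading remains a guess until the author documents the schema.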

Provenance

Source: Hugging Face
Collection Method: Benchmark execution telemetry captured from running the MBPP benchmark
Freshness: Last updated 2026-05-28 12:44:43; confirm against the source before use

License is unknown; terms of use must be verified.

Quality Score

D39

Community

23 downloads

1 like

0 views

Dataset Info

Author: ShahzebKhoso
Created: May 28, 2026
Updated: May 28, 2026
Last synced: Jun 6, 2026
