Qwen36 Mtp Turbo Kv Analysis: Runtime Benchmarks for Quantized Models

Name: Qwen36 Mtp Turbo Kv Analysis: Runtime Benchmarks for Quantized Models
Creator: sjakek
Published: 2026-05-15T01:28:06
Keywords: Llm Benchmark, Benchmark, Local Inference, Tabular, Qwen Model, Inference Runtime, Quantization

Description

A curated analysis artifact comparing inference runtimes for the Qwen3.6-35B-A3B model. It contains results from experiments on Windows CUDA, comparing clean MTP llama.cpp, QuinsZouls llama-next TurboQuant, and Atomic TurboQuant runs under a fixed 64k context. The dataset was created by sjakek and last updated on 2026-05-15.

Use Cases

Compare inference speed across the three runtime variants covered: clean MTP llama.cpp, QuinsZouls llama-next TurboQuant, and Atomic TurboQuant.
Benchmark local LLM inference performance on Windows CUDA hardware.
Analyze runtime behavior under the fixed 64k context and MoE CPU offload settings used in the experiments.
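
A comparison of this kind usually reduces to averaging a speed metric per runtime. The sketch below is a minimal illustration, not the dataset's own tooling: the column names `runtime` and `tokens_per_second` are hypothetical, since the actual field names are undocumented, and the example rows are placeholders rather than values from the dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(rows, group_key, value_key):
    """Average a numeric field per group, skipping rows where it is missing."""
    grouped = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value in (None, ""):
            continue
        grouped[row[group_key]].append(float(value))
    return {group: mean(values) for group, values in grouped.items()}


# Placeholder rows for illustration only; these are NOT values from the dataset,
# and the column names are assumptions.
example_rows = [
    {"runtime": "runtime_a", "tokens_per_second": "10.0"},
    {"runtime": "runtime_a", "tokens_per_second": "14.0"},
    {"runtime": "runtime_b", "tokens_per_second": "20.0"},
    {"runtime": "runtime_b", "tokens_per_second": ""},
]
averages = mean_by_group(example_rows, "runtime", "tokens_per_second")
```

After download, the two key arguments would be replaced with whichever columns actually identify the runtime and the measured speed.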

Strengths

Benchmark runs were conducted under a fixed 64k context, MoE CPU offload, and Unsloth-aligned sampling settings, enabling direct comparison.
The repository is curated to include only completed, comparable runs, filtering out incomplete or incompatible data.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
The dataset description is limited; actual data quality requires manual inspection after download.
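
Since neither column documentation nor a row count is provided, a quick profile of the file is a reasonable first step after download. The sketch below assumes the data is available as a CSV file, which this card does not confirm; the function name `profile_csv` is hypothetical, and it uses only the Python standard library.

```python
import csv
from collections import Counter


def profile_csv(path):
    """Return the row count and, per column, how many values are non-empty
    and whether every non-empty value parses as a float."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        non_empty = Counter()
        numeric = {name: True for name in columns}
        rows = 0
        for row in reader:
            rows += 1
            for name in columns:
                value = (row.get(name) or "").strip()
                if not value:
                    continue
                non_empty[name] += 1
                try:
                    float(value)
                except ValueError:
                    numeric[name] = False
    return {
        "rows": rows,
        "columns": {
            name: {
                "non_empty": non_empty[name],
                # A column with no values at all is not reported as numeric.
                "numeric": numeric[name] and non_empty[name] > 0,
            }
            for name in columns
        },
    }
```

The returned summary gives the row count and a rough numeric/text split per column, which is usually enough to start inferring field semantics by hand.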

Provenance

Source: huggingface
Collection Method: Local inference experiments on Windows CUDA.
Freshness: Last updated 2026-05-15 15:31:11; freshness should be verified.

License is unknown; terms of use must be verified before application.

Quality Score

Overall: C40
Description: 42
Source: 42
Reputation: 43
Access: 26

Community

Downloads: 250
Likes: 1
Views: 0

Dataset Info

Author: sjakek
Created: May 15, 2026
Updated: May 15, 2026
Last synced: May 31, 2026
