Name: GLM-5.0-8000X: Formatted Prompt and Completion Pairs for AI Training
Creator: Crownelius
Published: 2026-02-16T00:08:51
Keywords: Text Generation, Prompt Completion, Text, Language Model, AI Training

Description

GLM-5.0-8000x-formatted-fixed is a dataset of formatted prompt-completion pairs, likely intended for training or evaluating language models. It contains 4,090,360 tokens in total (512,812 prompt tokens and 3,577,548 completion tokens), with an average of 261.87 tokens per row. Crownelius uploaded it to Hugging Face, and it was last updated on March 15, 2026.
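The token figures above can be sanity-checked with a few lines of arithmetic. The sketch below only reuses the numbers quoted in the description; the function names are illustrative, and the implied row count assumes the 261.87 average was computed per row over the whole dataset, which the listing does not confirm.

```python
# Figures quoted in the dataset description.
PROMPT_TOKENS = 512_812
COMPLETION_TOKENS = 3_577_548
TOTAL_TOKENS = 4_090_360
AVG_TOKENS_PER_ROW = 261.87


def check_token_totals(prompt: int, completion: int, total: int) -> bool:
    """Return True when prompt + completion tokens equal the stated total."""
    return prompt + completion == total


def implied_row_count(total: int, avg_per_row: float) -> int:
    """Estimate the row count as total tokens divided by the per-row average.

    Assumption: the average is a plain per-row mean over the full dataset.
    """
    return round(total / avg_per_row)


if __name__ == "__main__":
    print("totals consistent:",
          check_token_totals(PROMPT_TOKENS, COMPLETION_TOKENS, TOTAL_TOKENS))
    print("implied rows:", implied_row_count(TOTAL_TOKENS, AVG_TOKENS_PER_ROW))
```

The implied row count is only an estimate derived from rounded statistics; the actual number of rows should be read from the downloaded data.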

Use Cases

Fine-tuning language models based on the provided prompt-completion pairs.
Benchmarking model performance on text generation tasks using the structured interactions.
Analyzing token distribution and cost efficiency for AI training pipelines.
Studying the characteristics of single-turn conversational data for model training.

Strengths

The dataset contains 4,090,360 total tokens, providing a substantial volume of text data for fine-tuning.
Cost metrics are explicitly provided, with a total generation cost estimated at $8.60 (USD).
Rows average 261.87 tokens, and completions account for roughly 87% of all tokens (about seven completion tokens per prompt token), so responses are considerably longer than prompts.
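For cost-efficiency comparisons, the stated totals can be turned into a blended cost per million tokens and a completion share. This is a rough sketch using only the figures listed on this page; it assumes the $8.60 estimate covers all prompt and completion tokens, which the listing does not state, and the helper names are illustrative.

```python
# Figures quoted on this page.
TOTAL_COST_USD = 8.60
PROMPT_TOKENS = 512_812
COMPLETION_TOKENS = 3_577_548


def cost_per_million(cost_usd: float, tokens: int) -> float:
    """Blended cost in USD per one million tokens."""
    return cost_usd / tokens * 1_000_000


def completion_share(prompt: int, completion: int) -> float:
    """Fraction of all tokens that belong to completions."""
    return completion / (prompt + completion)


if __name__ == "__main__":
    total = PROMPT_TOKENS + COMPLETION_TOKENS
    print(f"USD per 1M tokens: {cost_per_million(TOTAL_COST_USD, total):.2f}")
    print(f"completion share:  {completion_share(PROMPT_TOKENS, COMPLETION_TOKENS):.1%}")
```

Because prompt and completion tokens are often priced differently, the blended figure is best treated as a coarse indicator, not as a per-token price.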

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not stated. The token total and per-row average imply roughly 15,620 rows, but this is an estimate and should be confirmed after download.
Description metadata is limited; actual data quality requires manual inspection after download.
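Since column semantics have to be inferred after download, a first inspection pass might look like the sketch below. It works on any iterable of dict-like rows; the `prompt` and `completion` field names in the sample are hypothetical placeholders, not the dataset's documented schema.

```python
from collections import defaultdict


def summarize_columns(rows):
    """Summarize each column: observed Python type names and non-empty count."""
    summary = defaultdict(lambda: {"types": set(), "non_empty": 0})
    for row in rows:
        for name, value in row.items():
            info = summary[name]
            info["types"].add(type(value).__name__)
            # Treat None and empty containers/strings as empty values.
            if value not in (None, "", [], {}):
                info["non_empty"] += 1
    return dict(summary)


# Hypothetical sample rows; the real column names are unknown until download.
sample_rows = [
    {"prompt": "Say hi.", "completion": "Hi!"},
    {"prompt": "Empty case", "completion": ""},
]

if __name__ == "__main__":
    for column, info in summarize_columns(sample_rows).items():
        print(column, sorted(info["types"]), info["non_empty"])
```

Rows read from the downloaded files, for example records parsed from JSON Lines, can be passed to the same function in place of the sample.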

Provenance

Source: Crownelius via Hugging Face.
Collection Method: Likely generated or formatted for AI model training, as suggested by the token and cost statistics.
Freshness: Last updated 2026-03-15 07:02:34 (time zone not stated); check the Hugging Face repository for newer revisions.

License is unknown; users should verify permissions before use.
