Description
Supra Titles 115K is a curated dataset of 115,000 chat titles designed for training models to generate concise, descriptive titles from a user's first message. It was created by SupraLabs and last updated on June 14, 2026. The dataset is derived from the training pipeline for the experimental Supra Title model family.
Use Cases
Fine-tuning language models for chat title generation based on the first user message.
Evaluating model performance on the specific task of generating concise, descriptive titles.
Training sequence-to-sequence models for a focused text transformation task.
Benchmarking models against a curated dataset designed for a single, well-defined objective.
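As a rough illustration of the fine-tuning and evaluation use cases above, the following minimal sketch turns a (first message, title) pair into a prompt/completion record and scores predicted titles by normalized exact match. The prompt wording, the `prompt` and `completion` keys, and the choice of exact match as a metric are all assumptions made for illustration; none of them comes from SupraLabs or the Supra Title pipeline.

```python
# Hypothetical prompt template; the wording is an assumption, not the dataset's format.
PROMPT_TEMPLATE = (
    "Write a short title for this chat.\n\n"
    "First message: {message}\n\n"
    "Title:"
)


def build_example(message, title):
    """Format one (first message, title) pair as a prompt/completion record."""
    return {
        "prompt": PROMPT_TEMPLATE.format(message=message.strip()),
        "completion": " " + title.strip(),
    }


def normalize(title):
    """Lowercase, collapse whitespace, and trim surrounding punctuation."""
    return " ".join(title.lower().split()).strip(" .!?\"'")


def exact_match_rate(predictions, references):
    """Fraction of predicted titles equal to the reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


record = build_example(
    "How do I reverse a linked list in Python?", "Reversing a Linked List"
)
score = exact_match_rate(
    ["Reversing a Linked List.", "Dinner ideas"],
    ["reversing a linked list", "Vegetarian Dinner Plan"],
)
```

Exact match is a strict metric for title generation, since many different titles can be acceptable for the same message; it is used here only because it is simple to verify.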
Strengths
Contains 115,000 filtered samples, providing a substantial volume for model training.
Curated specifically for a single, well-defined task of generating titles from first messages.
Derived from a documented training pipeline for a named model family (Supra Title).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The exact row count is not confirmed beyond the stated 115,000 samples, which makes suitability harder to assess before download.
Data may reflect source bias inherent to the original chat conversations used in the training pipeline.
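Because column-level documentation is absent, one way to infer field semantics after download is to summarize each column and apply a simple heuristic. The sketch below assumes the rows have already been loaded as a list of dictionaries; the field names `message` and `title` in the toy rows are hypothetical, and the rule that the shortest string column is the title is only a guess to be checked by eye.

```python
from statistics import mean


def summarize_columns(rows):
    """Collect value types, missing counts, and mean text length per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    summary = {}
    for name, values in columns.items():
        texts = [v for v in values if isinstance(v, str)]
        summary[name] = {
            "types": sorted({type(v).__name__ for v in values}),
            # Rows lacking the key entirely, plus explicit None values.
            "missing": len(rows) - len(values) + sum(v is None for v in values),
            "mean_chars": mean(len(t) for t in texts) if texts else None,
        }
    return summary


def guess_title_column(summary):
    """Heuristic (assumption): the shortest string column on average is the title."""
    lengths = {
        name: stats["mean_chars"]
        for name, stats in summary.items()
        if stats["mean_chars"] is not None
    }
    return min(lengths, key=lengths.get) if lengths else None


# Toy rows with hypothetical field names; the real schema is undocumented.
sample_rows = [
    {
        "message": "How do I reverse a linked list in Python without recursion?",
        "title": "Reversing a Linked List",
    },
    {
        "message": "Can you suggest a week of vegetarian dinners for two people?",
        "title": "Vegetarian Dinner Plan",
    },
]

summary = summarize_columns(sample_rows)
likely_title = guess_title_column(summary)
```

Running the same summary on a sample of the real rows shows which columns exist and how complete they are before any training code is committed to a particular schema.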
Provenance
Source
SupraLabs
Collection Method
Derived from the training pipeline for the Supra Title model family.
Freshness
Last updated 2026-06-14 15:24:42; freshness should be verified.
License
Unknown; terms of use must be verified before the data is used.