SupraLabs's Supra Wild Titles 130K is a dataset series for training and evaluating chat title generation models. It contains 130,000 niche and specialized conversation samples partitioned from primary title datasets. The dataset was last updated on June 20, 2026.
Use Cases
- Train models to generate titles for niche and specialized conversations.
- Evaluate how robustly a title-generation model handles niche and specialized conversations.
- Fine-tune language models for domain-specific chat summarization using the conversation samples.
Strengths
- Contains 130,000 samples, a substantial corpus for model training.
- Curated specifically for title generation, so the samples target a single, focused task.
- Comprises niche and specialized conversation samples, which should broaden topical coverage.
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
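- The dataset description explicitly states that the data is not deduplicated; repeated samples can skew training and inflate evaluation scores if they appear in both training and evaluation splits.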
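Since the column names are undocumented and the data is not deduplicated, a first pass after download might look like the sketch below. It is a minimal illustration, not the publisher's tooling: the field names `conversation` and `title`, the sample rows, and the helper functions `summarize_fields` and `deduplicate` are all hypothetical, and the real schema must be confirmed by inspecting the downloaded files.

```python
import hashlib
from collections import Counter


def summarize_fields(records):
    """Count, per field name, how many values of each Python type occur."""
    summary = {}
    for record in records:
        for key, value in record.items():
            summary.setdefault(key, Counter())[type(value).__name__] += 1
    return summary


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(records, key_fields):
    """Keep the first record for each normalized combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        joined = "\x1f".join(normalize(str(record.get(f, ""))) for f in key_fields)
        fingerprint = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical rows; the actual field names are an assumption.
sample = [
    {"conversation": "How do I prune bonsai roots?", "title": "Bonsai root pruning"},
    {"conversation": "how do I prune  bonsai roots?", "title": "Bonsai root pruning"},
    {"conversation": "Explain Kalman filters", "title": "Kalman filter basics"},
]

field_summary = summarize_fields(sample)
unique_rows = deduplicate(sample, key_fields=("conversation", "title"))
```

This only catches exact matches after case and whitespace normalization. Near-duplicates such as paraphrased conversations would need a fuzzier method, which is outside the scope of this sketch.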
Provenance
- Source: SupraLabs
- Collection Method: Curated and partitioned from primary title datasets.
- Freshness: Last updated 2026-06-20 09:51:15; freshness should be verified.