Curated instruction pairs for fine-tuning the Surrogate-1 model, focusing on DevSecOps, cloud, and the Thai market. The dataset author is axentx, and it was last updated on May 2, 2026. Sources include anonymized Claude Code transcripts, DevSecOps cron outputs, and scrubbed public GitHub code patterns.
Use Cases
- Fine-tuning instruction-following models on DevSecOps and cloud-related prompts.
- Training models for Thai-market applications using language-specific instruction pairs.
- Running Direct Preference Optimization (DPO) with the included preference pairs.
- Improving code generation and understanding based on patterns from anonymized code transcripts and GitHub repositories.
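
For readers new to DPO, the per-pair objective that preference pairs feed into can be sketched as below. This is a minimal illustration of the standard DPO loss, not the dataset author's training code. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions, and the sequence log-probabilities are expected to come from your own policy and reference models. Nothing here assumes the dataset's actual field names, which are undocumented.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    `beta` is a hypothetical default; tune it for your setup.
    """
    # How much more the policy prefers the chosen response than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch to avoid overflow in exp().
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and reference agree on both responses, the margin is zero and the loss equals log 2. The loss drops below that value as the policy shifts probability toward the chosen response relative to the reference.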
Strengths
- Data is curated from multiple specific sources, including Claude Code transcripts and DevSecOps outputs.
- Pairs are scrubbed for privacy, as noted in the description.
- Includes preference pairs specifically for Direct Preference Optimization (DPO).
Limitations
- The description provides little metadata; data quality can only be judged by inspecting the files after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count and file formats are not stated, which makes it hard to assess suitability before download.
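
Because row count and field names are undocumented, a first inspection pass after download might look like the sketch below. It assumes the data arrives as a JSON Lines file, which the description does not confirm; the function name `summarize_jsonl` and the `sample_size` parameter are hypothetical, and the code would need adapting for Parquet, CSV, or another format.

```python
import json
from collections import Counter


def summarize_jsonl(path, sample_size=1000):
    """Count non-blank rows and tally field names seen in the first `sample_size` rows."""
    row_count = 0
    field_counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row_count += 1
            if row_count <= sample_size:
                record = json.loads(line)
                if isinstance(record, dict):
                    # Record which top-level keys appear, and how often.
                    field_counts.update(record.keys())
    return row_count, dict(field_counts)
```

Calling `summarize_jsonl("path/to/downloaded_file.jsonl")` (a placeholder path) returns the row count and a mapping from field name to how many sampled rows contain it. Fields that appear in only some rows are a hint that the file mixes record types, for example instruction pairs and preference pairs.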
Provenance
- Source
  - Multiple sources, including Claude Code transcripts, DevSecOps cron outputs, public GitHub code, and web crawls.
- Collection Method
  - Curated, filtered, and scrubbed from the listed sources.
- Time Range
  - Not specified.
- Freshness
  - Last updated 2026-05-02 00:06:17; verify against the source before relying on recency.
- Geography
  - Not specified.