Description

PRIMUS is a collection of open-source datasets for training cybersecurity LLMs. The Primus-Reasoning subset contains multiple cybersecurity reasoning tasks sourced from CTI-Bench, including CTI-RCM, CTI-VSP, CTI-ATE, and CTI-MCQ. In June 2025 it was augmented with samples distilled from DeepSeek-R1, which include intermediate reasoning steps and final answers.

Use Cases

Training LLMs on the CTI-Bench cybersecurity reasoning tasks
Fine-tuning models for cyber threat intelligence analysis
Benchmarking LLM performance on structured cybersecurity questions across the included task types
Studying how training on intermediate reasoning steps distilled from models like DeepSeek-R1 affects outcomes

Strengths

Recently updated: augmented with samples distilled from DeepSeek-R1 on 2025-06-02
Focuses on specific cybersecurity reasoning tasks (CTI-RCM, CTI-VSP, CTI-ATE, CTI-MCQ)
DeepSeek-R1 samples include both intermediate reasoning steps and final answers

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which makes it harder to assess suitability before download
Data may reflect biases inherited from the CTI-Bench sources and from the DeepSeek-R1 distillation process
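
Because the column documentation and row count are missing, a first step after download is to inspect the records directly. The following is a minimal sketch, not part of the dataset's own tooling: it assumes the rows have already been loaded as a sequence of dictionaries, and the helper name `summarize_fields` and the sample rows are hypothetical, invented only to make the example runnable.

```python
from collections import Counter, defaultdict
from typing import Any, Iterable, Mapping


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> dict:
    """Count rows and, per field, the value types seen and how often it is empty."""
    row_count = 0
    type_counts = defaultdict(Counter)  # field name -> Counter of type names
    empty_counts = Counter()            # field name -> number of None/"" values
    for record in records:
        row_count += 1
        for name, value in record.items():
            type_counts[name][type(value).__name__] += 1
            if value is None or value == "":
                empty_counts[name] += 1
    return {
        "row_count": row_count,
        "fields": {
            name: {"types": dict(counts), "empty": empty_counts[name]}
            for name, counts in type_counts.items()
        },
    }


# Hypothetical rows for illustration only; the real column names are undocumented.
sample_rows = [
    {"prompt": "example question 1", "reasoning": "example steps", "answer": "A"},
    {"prompt": "example question 2", "reasoning": "", "answer": "B"},
]

summary = summarize_fields(sample_rows)
print("rows:", summary["row_count"])
for field, info in summary["fields"].items():
    print(field, info)
```

The same helper should work on rows obtained from a loader that yields dictionaries, for example a split loaded with the Hugging Face `datasets` library. The repository identifier is not stated here, so it is left out of the sketch.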

Provenance

Source: trendmicro-ailab
Collection Method: Likely compiled from CTI-Bench tasks and augmented with distilled samples from DeepSeek-R1.
Freshness: Last updated 2025-06-02 11:27:07

License is unknown; users should verify terms before use.

Tags: Text, Cybersecurity, Cyber Threat Intelligence, LLM Training, Reasoning Tasks

Primus-Reasoning: Cybersecurity Tasks for LLM Training
