This dataset contains 10,000 supervised examples for classifying email and email-adjacent content. Each JSONL row contains an instruction, a text input, and a structured JSON output with fields that include triage, priority, and risk. The dataset was created by weijianzhg and was last updated on Hugging Face in May 2026.
Use Cases
- Fine-tuning classifiers for phishing and spam risk assessment based on email content.
- Evaluating models for operational triage decisions based on priority and risk scores.
- Training systems to filter prompt-attack content based on email and security-review text fragments.
Strengths
- 10,000 supervised examples provide a substantial training corpus.
- Structured JSON output includes six specific fields: triage, priority, risk, should_process, confidence, and reason.
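The structured output described above can be checked per row with a short sketch like the following. The row keys `instruction`, `input`, and `output` are assumptions (the column names are not documented), the helper `parse_output` is hypothetical, and the sample values are invented for illustration; only the six output field names come from the description.

```python
import json

# The six output field names listed in the dataset description.
EXPECTED_FIELDS = {"triage", "priority", "risk", "should_process", "confidence", "reason"}


def parse_output(row_line):
    """Parse one JSONL line and return its output object.

    Assumes the row stores its label under an "output" key, either as a
    nested object or as a JSON-encoded string. Raises ValueError if any of
    the six expected fields is missing.
    """
    row = json.loads(row_line)
    output = row["output"]
    if isinstance(output, str):
        # Some instruction-tuning datasets store the target as a JSON string.
        output = json.loads(output)
    missing = EXPECTED_FIELDS - output.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return output


# Invented sample row; the real value vocabulary is not documented.
sample_line = json.dumps({
    "instruction": "Classify the following email.",
    "input": "Your account is locked. Click here to verify your password.",
    "output": json.dumps({
        "triage": "quarantine",
        "priority": "high",
        "risk": "phishing",
        "should_process": False,
        "confidence": 0.93,
        "reason": "Credential-harvesting link with urgent language.",
    }),
})

parsed = parse_output(sample_line)
print(parsed["triage"], parsed["should_process"])
```

Accepting both a nested object and a JSON string keeps the check usable whichever encoding the files turn out to use.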
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The 10,000-row figure comes from the dataset description and has not been independently verified, which may limit suitability assessment.
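One way to check the row count and start inferring field semantics after download is to tally the values each output field takes. This is a minimal sketch, not a documented workflow: the `output` key, the helper name `summarize_outputs`, and the demo rows are all assumptions made for illustration.

```python
import json
from collections import Counter


def summarize_outputs(lines, fields=("triage", "priority", "risk")):
    """Count non-empty JSONL rows and tally the values seen for each field.

    Assumes each row has an "output" key holding either an object or a
    JSON-encoded string.
    """
    row_count = 0
    counts = {field: Counter() for field in fields}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        output = json.loads(line)["output"]
        if isinstance(output, str):
            output = json.loads(output)
        row_count += 1
        for field in fields:
            # str() keeps the tally hashable even for unexpected value types.
            counts[field][str(output.get(field))] += 1
    return row_count, counts


# Invented demo rows standing in for a downloaded file.
demo_lines = [
    json.dumps({"output": {"triage": "allow", "priority": "low", "risk": "none"}}),
    json.dumps({"output": {"triage": "quarantine", "priority": "high", "risk": "phishing"}}),
    json.dumps({"output": {"triage": "allow", "priority": "low", "risk": "none"}}),
]

row_count, counts = summarize_outputs(demo_lines)
print(row_count, dict(counts["triage"]))
```

With a real download, passing an open file handle in place of `demo_lines` works the same way, since the function only iterates over lines.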
Provenance
- Source: Hugging Face
- Freshness: last updated 2026-05-31 06:03:18; check the dataset page for later revisions before relying on this date.