This dataset provides a cybersecurity document corpus and question-answer pairs for evaluating Retrieval-Augmented Generation (RAG) systems. Developed by Manoel Malon Costa de Moura and hosted on Harvard Dataverse, the collection supports the RRAG search methodology as of March 2026. It contains two distinct subsets: an ingestion dataset for document indexing and an evaluation dataset for performance testing.
The dataset is built for the RRAG (Novel Approach for Searching in Cybersecurity Documents) methodology; refer to the associated research for implementation context.