Description
Researchscope Papers is an open dataset of computer science research papers maintained by ResearchScope. It contains metadata for 102,058 papers and 473,434 instruction-tuning rows, aggregated from sources including arXiv, OpenAlex, ACL Anthology, OpenReview, PMLR, CVF, and Semantic Scholar. The dataset was last updated on June 7, 2026.
Use Cases
Instruction-tuning large language models on the 473,434 provided instruction-response rows.
Analyzing publication trends across major CS venues like NeurIPS, ICML, and CVPR.
Building search or recommendation systems for academic literature using paper metadata.
Training models for tasks like paper summarization or keyword extraction using the raw text.
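As a rough illustration of the publication-trend use case, the sketch below counts papers per venue and year. The field names `venue` and `year` are assumptions made for this example, since the dataset's actual column names are not documented; the sample records are invented placeholders, not rows from the dataset.

```python
from collections import Counter


def count_by_venue_year(records, venue_key="venue", year_key="year"):
    """Count papers per (venue, year) pair.

    `venue_key` and `year_key` are hypothetical field names; adjust them
    to whatever the downloaded metadata actually uses.
    """
    counts = Counter()
    for rec in records:
        venue = rec.get(venue_key)
        year = rec.get(year_key)
        if venue is None or year is None:
            continue  # skip records missing either field
        try:
            year = int(year)
        except (TypeError, ValueError):
            continue  # skip records whose year is not numeric
        counts[(str(venue).strip(), year)] += 1
    return counts


# Invented placeholder records, for illustration only.
sample = [
    {"title": "Paper A", "venue": "NeurIPS", "year": 2023},
    {"title": "Paper B", "venue": "NeurIPS", "year": "2023"},
    {"title": "Paper C", "venue": "ICML", "year": 2023},
    {"title": "Paper D", "venue": "CVPR", "year": None},
]

trend = count_by_venue_year(sample)
for (venue, year), n in sorted(trend.items()):
    print(venue, year, n)
```

Records with a missing or non-numeric year are skipped rather than guessed, so the resulting counts are a lower bound on the number of papers per venue.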
Strengths
Contains metadata for 102,058 papers, providing a substantial corpus.
Includes 473,434 rows specifically formatted for instruction-tuning tasks.
Aggregates papers from multiple authoritative sources including arXiv, ACL Anthology, and OpenReview.
Covers papers from major venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, and AAAI.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The row count of the primary metadata file is not documented, which makes it harder to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
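Because field semantics have to be inferred after download, a first manual inspection might start with a per-column profile. The sketch below assumes the rows have already been loaded as a list of dictionaries (for example from JSON Lines or CSV); the function name `profile_columns` and the sample rows are hypothetical and do not reflect the dataset's real schema.

```python
def profile_columns(records):
    """Summarize each field: fill rate, observed value types, one example.

    `records` is a list of dicts; None and empty strings count as missing.
    """
    profile = {}
    total = len(records)
    for rec in records:
        for key, value in rec.items():
            info = profile.setdefault(
                key, {"present": 0, "types": set(), "example": None}
            )
            if value is None or value == "":
                continue  # treat as missing
            info["present"] += 1
            info["types"].add(type(value).__name__)
            if info["example"] is None:
                info["example"] = value
    for info in profile.values():
        # Share of all records in which the field holds a usable value.
        info["fill_rate"] = info["present"] / total if total else 0.0
    return profile


# Invented placeholder rows, for illustration only.
rows = [
    {"title": "Paper A", "abstract": "Some text", "year": 2024},
    {"title": "Paper B", "abstract": "", "year": "2024"},
    {"title": "Paper C", "abstract": None},
]

for name, info in profile_columns(rows).items():
    print(name, round(info["fill_rate"], 2), sorted(info["types"]), info["example"])
```

A low fill rate or a field with mixed value types is a useful early signal of where the data needs cleaning before it is used for training or analysis.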