Name: ResearchScope Papers: 32,784 Computer Science Research Papers from arXiv and Conferences
Creator: kishormorol
Published: 2026-06-03T09:33:23
Keywords: Computer Science, Conference Proceedings, Text, Arxiv, Research Papers, Academic Publications

Description

The dataset aggregates 32,784 computer science research papers from arXiv, conferences, and journals. ResearchScope maintains it and updates it automatically via GitHub Actions. It provides splits for per-source analysis, instruction tuning, and per-section fine-tuning.

Use Cases

Training language models for academic text generation based on the paper abstracts and full texts.
Building paper recommendation systems based on metadata and content from arXiv and conference sources.
Fine-tuning models for specific sections of research papers using the per-section splits.
Analyzing trends in computer science research across different publication venues.
Instruction-tuning models for tasks like summarization or question-answering on scientific literature.
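
As a rough illustration of the venue trend analysis mentioned above, the sketch below counts papers per source and year. The field names `source` and `year` and the sample records are assumptions made for this example only; the dataset's actual columns are not documented here and should be checked after download.

```python
from collections import Counter

# Hypothetical records: the field names "source" and "year" are assumptions,
# not documented columns of the dataset.
sample_papers = [
    {"source": "arxiv", "year": 2023},
    {"source": "conference", "year": 2023},
    {"source": "conference", "year": 2024},
    {"source": "journal", "year": 2024},
    {"source": "conference", "year": 2024},
]


def count_by_source_and_year(papers):
    """Count papers for each (source, year) pair, skipping incomplete rows."""
    counts = Counter()
    for paper in papers:
        source = paper.get("source")
        year = paper.get("year")
        if source is None or year is None:
            continue
        counts[(source, year)] += 1
    return counts


trend_counts = count_by_source_and_year(sample_papers)
for (source, year), n in sorted(trend_counts.items()):
    print(f"{source:<12} {year}  {n}")
```

Skipping rows with a missing source or year keeps the counts honest when metadata is incomplete, which is plausible for an aggregated corpus.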

Strengths

Contains 32,784 papers, providing a substantial corpus for analysis.
Reports a per-source breakdown: 7,784 papers from arXiv, 20,000 from conferences, and 5,000 from journals.
Offers structured splits for per-source, instruction-tuning, and per-section fine-tuning tasks.
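
The per-source figures can be checked against the stated total with a few lines of arithmetic. The sketch below uses only the numbers given in this listing; the dictionary keys are labels chosen for the example, not split names from the dataset.

```python
# Per-source counts as stated in the listing; keys are illustrative labels.
source_counts = {"arXiv": 7_784, "conferences": 20_000, "journals": 5_000}
stated_total = 32_784

computed_total = sum(source_counts.values())

# Share of each source as a percentage of the corpus, rounded to one decimal.
shares = {
    name: round(100 * n / computed_total, 1)
    for name, n in source_counts.items()
}

print("totals match:", computed_total == stated_total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Conference papers make up roughly three fifths of the corpus, which is worth keeping in mind when interpreting results from the combined data.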

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row counts for individual splits are not documented, which makes it harder to assess suitability before download.
The data may reflect selection bias inherent in the chosen arXiv, conference, and journal sources.
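
Because the columns are undocumented, a first step after download is usually to profile the fields. The following is a minimal sketch that assumes the data has been exported to JSON Lines; the sample rows and the field names `title`, `abstract`, and `year` are hypothetical stand-ins, not the dataset's real schema.

```python
import json
from collections import defaultdict


def summarize_fields(records):
    """Return, per field, the observed value types and the non-empty count."""
    summary = defaultdict(lambda: {"types": set(), "non_empty": 0})
    for record in records:
        for field, value in record.items():
            entry = summary[field]
            entry["types"].add(type(value).__name__)
            # Treat None and empty strings/containers as missing values.
            if value not in (None, "", [], {}):
                entry["non_empty"] += 1
    return dict(summary)


# Hypothetical JSON Lines sample; in practice, read lines from the downloaded file.
sample_jsonl = """\
{"title": "Paper A", "abstract": "Text.", "year": 2024}
{"title": "Paper B", "abstract": "", "year": null}
"""

records = [json.loads(line) for line in sample_jsonl.splitlines() if line.strip()]
field_summary = summarize_fields(records)
for field, info in field_summary.items():
    print(field, sorted(info["types"]), info["non_empty"])
```

A field that shows several types or a low non-empty count is a good candidate for closer inspection before it is used for training or analysis.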

Provenance

Source: ResearchScope, aggregating from arXiv, conferences, and journals.
Collection Method: Updated automatically via GitHub Actions.
Freshness: Last updated 2026-06-13 08:10:37; confirm the current update date before use.
