Jamescalam provides a selection of papers on artificial intelligence and large language models sourced from arXiv. The dataset was last updated on January 29, 2024. The description also references a version 2 dataset, which reportedly is larger and of higher quality.
Use Cases
- Analyze research trends in AI based on the selection of papers.
- Study the evolution of LLM topics based on the paper collection.
- Benchmark NLP models using the corpus of AI research papers.
- Perform citation network analysis, provided the data includes references between papers (the description does not confirm this).
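As a rough illustration of the trend-analysis use case, the sketch below counts papers per month from arXiv identifiers. It relies only on the public new-style arXiv ID format (YYMM.NNNNN, in use since April 2007). Whether this dataset exposes such an identifier, and under which field name, is an assumption that has to be checked after download; `papers_per_month` and the sample IDs are hypothetical.

```python
import re
from collections import Counter

# New-style arXiv identifiers start with YYMM, e.g. "2401.00001" or "2401.00001v2".
ARXIV_ID = re.compile(r"^(\d{2})(\d{2})\.\d{4,5}(v\d+)?$")


def papers_per_month(arxiv_ids):
    """Count papers per submission month, skipping strings that are not new-style IDs."""
    counts = Counter()
    for raw in arxiv_ids:
        match = ARXIV_ID.match(raw.strip())
        if match is None:
            continue  # old-style or malformed identifier
        year, month = match.group(1), match.group(2)
        if not 1 <= int(month) <= 12:
            continue  # digits match the pattern but are not a valid month
        counts[f"20{year}-{month}"] += 1
    return dict(sorted(counts.items()))


# Placeholder identifiers for illustration only; they are not taken from the dataset.
sample_ids = ["2401.00001", "2401.12345v2", "2312.09999", "not-an-id"]
monthly = papers_per_month(sample_ids)
print(monthly)
```

Because the count is derived from the identifier alone, it reflects when a paper was first submitted to arXiv, not when it was revised or added to this collection.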
Strengths
- A version 2 dataset exists, which the author states improves data quality.
- The dataset was updated on January 29, 2024, indicating some maintenance.
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not stated, which makes it harder to judge suitability before download.
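Since the row count and field semantics have to be established by inspection, a small profiling pass over the downloaded records is a reasonable first step. The sketch below assumes the data has already been loaded as a list of dictionaries (the file format and loading step are not documented and are left out); `summarize_records` and the sample rows are hypothetical and do not reflect the dataset's real columns.

```python
from collections import Counter, defaultdict


def summarize_records(records):
    """Report the row count and, per field, presence, emptiness, and observed value types."""
    present = Counter()           # rows in which the field appears
    empty = Counter()             # rows in which the field is None or ""
    types = defaultdict(Counter)  # type names seen for non-empty values
    for row in records:
        for key, value in row.items():
            present[key] += 1
            if value is None or value == "":
                empty[key] += 1
            else:
                types[key][type(value).__name__] += 1
    return {
        "rows": len(records),
        "fields": {
            key: {
                "present": present[key],
                "empty": empty[key],
                "types": dict(types[key]),
            }
            for key in sorted(present)
        },
    }


# Stand-in rows for illustration only; the field names are assumptions.
sample = [
    {"id": "2401.00001", "title": "Example A", "summary": ""},
    {"id": "2401.00002", "title": "Example B", "summary": "text", "extra": None},
]
report = summarize_records(sample)
print(report["rows"])
for name, stats in report["fields"].items():
    print(name, stats)
```

Fields that appear in only some rows, or that are mostly empty, are the ones most in need of manual review before the data is used for analysis or benchmarking.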
Provenance
- Source: arXiv
- Collection method: selection of papers by the author.
- Freshness: last updated 2024-01-29 11:16:32; freshness should be verified.