Description
Daily-updated dataset of arXiv papers from AI/ML and adjacent categories, enriched with LLM-derived signals. It includes a 0–100 importance score, topical/lab tags, a one-line takeaway, and dense full-page summaries for a selected subset. The dataset is published by author taesiri and was last updated on 2026-06-17.
Use Cases
Prioritizing daily arXiv reading based on the 0–100 LLM-derived importance score.
Categorizing papers by topic or research lab using the provided topical/lab tags.
Generating quick overviews of new research using the one-line takeaway summaries.
Conducting deeper analysis on a subset of papers using the dense full-page summaries.
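The first two use cases can be sketched in a few lines of Python. This is only an illustration: the field names (`arxiv_id`, `importance`, `tags`, `takeaway`) and the records themselves are made-up placeholders, since the dataset's column names are not documented and must be checked after download.

```python
from collections import defaultdict

# Placeholder records. The field names are assumptions, not the dataset's
# documented schema, and the values are invented for illustration.
papers = [
    {"arxiv_id": "0000.00001", "importance": 87, "tags": ["agents", "rl"], "takeaway": "Example takeaway A."},
    {"arxiv_id": "0000.00002", "importance": 42, "tags": ["vision"], "takeaway": "Example takeaway B."},
    {"arxiv_id": "0000.00003", "importance": 91, "tags": ["agents"], "takeaway": "Example takeaway C."},
]


def reading_list(rows, min_score=80):
    """Return rows scoring at least min_score, highest score first."""
    kept = [r for r in rows if r["importance"] >= min_score]
    return sorted(kept, key=lambda r: r["importance"], reverse=True)


def group_by_tag(rows):
    """Map each tag to the list of paper ids that carry it."""
    groups = defaultdict(list)
    for r in rows:
        for tag in r["tags"]:
            groups[tag].append(r["arxiv_id"])
    return dict(groups)


top = reading_list(papers)
by_tag = group_by_tag(papers)

for r in top:
    print(r["importance"], r["arxiv_id"], r["takeaway"])
```

The threshold of 80 is arbitrary; a sensible cutoff depends on how the scores are actually distributed in the data.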
Strengths
Daily updates ensure the dataset reflects the latest arXiv submissions.
Includes multiple LLM-derived enrichment layers: importance score, tags, and summaries.
Published as an open research resource powering arxivsignals.io.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, so the dataset's size and coverage cannot be assessed before download.
The last recorded update is 2026-06-17; confirm that the dataset is still being refreshed before relying on it.
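One way to check freshness is to compare the recorded update date with the current date. The sketch below is a minimal example; the helper names and the two-day tolerance are assumptions chosen for illustration, not part of the dataset.

```python
from datetime import date


def days_since_update(last_updated: str, today: date) -> int:
    """Number of whole days between an ISO date string and today."""
    return (today - date.fromisoformat(last_updated)).days


def is_stale(last_updated: str, today: date, max_age_days: int = 2) -> bool:
    """True if the last update is older than max_age_days (assumed tolerance)."""
    return days_since_update(last_updated, today) > max_age_days


# Compare the recorded update date from this listing against the current date.
print(is_stale("2026-06-17", date.today()))
```

For a dataset that claims daily updates, a gap of more than a day or two suggests the pipeline may have stopped.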
Provenance
Source
arXiv
Collection Method
Continuously updated and enriched with LLM-derived signals.
Time Range
Daily partitions.
Freshness
Daily updates.
License
Unknown; verify before use.