Description
Galaxy Mentions is a dataset compiled by astronolan, last updated on 2026-05-20. It contains nearly every galaxy discussed in more than two sentences on the arXiv preprint server. The dataset is structured across multiple tables, including galaxy_mentions, evidence_quotes, papers, and batches.
Use Cases
Identify and catalog astronomical objects based on mentions in arXiv papers.
Study the frequency and context of galaxy discussions in scientific literature.
Extract supporting text evidence for named entity recognition in astronomy.
Analyze the processing status and extraction history of arXiv papers.
Strengths
Scope includes nearly every galaxy discussed in over two sentences on arXiv.
Structured into multiple relational tables for mentions, quotes, papers, and processing batches.
Last updated on 2026-05-20, indicating recent maintenance.
Limitations
The author notes that the contents have not been thoroughly verified, so users should treat individual entries with skepticism.
Row count and file formats are unknown, which limits any suitability assessment before download.
Column-level documentation is absent; field semantics must be inferred after download.
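Because column semantics have to be inferred after download, a first pass over each table can be sketched as below. This is a minimal, format-agnostic sketch, not the dataset author's tooling. It assumes the rows of a table have already been loaded into Python dictionaries by whatever reader fits the actual file format. The helper name `summarize_columns` and the sample column names (`galaxy_name`, `paper_id`, `sentence_count`) are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def summarize_columns(rows, max_examples=3):
    """Summarize each column seen in an iterable of dict records.

    Returns (row_count, summary), where summary maps a column name to the
    observed Python type names, the count of empty values, and a few
    distinct example values.
    """
    summary = defaultdict(lambda: {"types": set(), "nulls": 0, "examples": []})
    total = 0
    for row in rows:
        total += 1
        for name, value in row.items():
            col = summary[name]
            # Treat None and empty strings as missing values.
            # A column that is absent from a row is not counted here.
            if value is None or value == "":
                col["nulls"] += 1
                continue
            col["types"].add(type(value).__name__)
            if value not in col["examples"] and len(col["examples"]) < max_examples:
                col["examples"].append(value)
    return total, dict(summary)


# Hypothetical records standing in for rows of the galaxy_mentions table.
# The real column names are undocumented; these are placeholders.
sample_rows = [
    {"galaxy_name": "NGC 1068", "paper_id": "hypothetical-0001", "sentence_count": 4},
    {"galaxy_name": "M87", "paper_id": "hypothetical-0002", "sentence_count": None},
    {"galaxy_name": "NGC 1068", "paper_id": "hypothetical-0003", "sentence_count": 3},
]

total, columns = summarize_columns(sample_rows)
for name, info in columns.items():
    print(name, sorted(info["types"]), "nulls:", info["nulls"], "examples:", info["examples"])
```

Running the same summary over galaxy_mentions, evidence_quotes, papers, and batches gives a quick view of which columns look like identifiers, which hold free text, and which are frequently empty, before any field is relied on.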
Provenance
Source
arXiv preprint server.
Collection Method
Extraction from arXiv papers, organized by processing batches.
Time Range
Covers the history of arXiv papers up to the dataset's creation.
Freshness
Last updated 2026-05-20 02:58:25.
Geography
Not applicable; data is from a global preprint repository.
License
Unknown; users must verify terms before use.