MEDLINE/PubMed annual statistical reports detail the content and size of data elements in the baseline versions of the database for 2018-2023. The reports include total citations and occurrences per element, plus minimum, average, and maximum occurrences and lengths. The data is provided by datadiscovery.nlm.nih.gov and was last updated on 2025-06-18.
Use Cases
- Analyzing the frequency and distribution of specific metadata elements using the Element and Value columns.
- Tracking changes in average record size and element length over the 2018-2023 period using the Value column.
- Profiling the statistical characteristics of PubMed/MEDLINE baseline data for quality assessment using the Element column.
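As a minimal sketch of the trend-tracking use case, the snippet below groups rows by the Element column and compares the Value column between two years. It assumes the export has a `Year` column; that name is hypothetical and should be checked against the real header. The rows in `SAMPLE_CSV` are placeholders, not NLM figures.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows, not real NLM figures. "Year" is an assumed column name;
# "Element" and "Value" are the columns named in the use cases above.
SAMPLE_CSV = """Year,Element,Value
2018,ExampleElementA,10.0
2023,ExampleElementA,12.5
2018,ExampleElementB,4.0
2023,ExampleElementB,3.0
"""


def values_by_element(csv_text):
    """Map each Element to a {year: value} dictionary."""
    series = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["Element"]][int(row["Year"])] = float(row["Value"])
    return dict(series)


def change_between(series, first_year, last_year):
    """Return the Value difference per Element between two years.

    Elements missing either year are skipped rather than guessed.
    """
    return {
        element: by_year[last_year] - by_year[first_year]
        for element, by_year in series.items()
        if first_year in by_year and last_year in by_year
    }


series = values_by_element(SAMPLE_CSV)
changes = change_between(series, 2018, 2023)
for element, delta in sorted(changes.items()):
    print(f"{element}: {delta:+.1f}")
```

If the real export mixes several statistics (min, avg, max) in the Value column, the rows would need to be filtered to a single statistic before comparing years.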
Strengths
- Covers a six-year period from 2018 to 2023.
- Provides multiple statistical measures (min, avg, max) for each data element.
- Available in multiple structured formats including CSV, JSON, XML, and RDF.
Limitations
- Row count and total dataset size are not stated, which makes suitability hard to assess before download.
- Description metadata is limited; assessing actual data quality requires manual inspection after download.
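Because the row count is not published, a first inspection pass after download might look like the sketch below. It counts rows and empty cells per column using only the Python standard library. The function name `profile_csv` and the inline sample are illustrative assumptions, not part of the dataset or its tooling.

```python
import csv
import io


def profile_csv(handle):
    """Count rows and empty cells per column for a CSV opened as text."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    empty_cells = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            # Treat missing, empty, and whitespace-only cells as empty.
            if not (row.get(name) or "").strip():
                empty_cells[name] += 1
    return {"rows": row_count, "columns": columns, "empty_cells": empty_cells}


# Placeholder content standing in for a downloaded export.
sample = io.StringIO("Element,Value\nExampleElementA,10\nExampleElementB,\n")
report = profile_csv(sample)
print(report)
```

For a real download, the same function can be passed a file opened with `open(path, newline="")`, where `path` is wherever the CSV export was saved.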
Provenance
- Source: datadiscovery.nlm.nih.gov
- Collection Method: Annual statistical reports generated from baseline versions of the MEDLINE/PubMed database.
- Time Range: 2018-2023
- Freshness: Last updated 2025-06-18 21:07:19; freshness should be verified.