A 525 GB unified data lake integrates seven scholarly datasets with cross-dataset DOI normalization and 13 scientific ontologies containing 1.3 million terms. It was created by J0nasW and includes a reproducible ETL pipeline. The dataset was last updated in March 2026.
The pipeline references the Semantic Scholar S2AG dataset, but that dataset is not included in this upload because of its API terms of service. Users who need full functionality must obtain it separately.
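To illustrate what cross-dataset DOI normalization can look like, here is a minimal Python sketch. It is not taken from the dataset's ETL pipeline: the function name `normalize_doi`, the prefix rules, and the sample records are all assumptions made for illustration. The idea is to reduce each DOI string to a lowercase bare form so that records from different sources can be joined on the same key.

```python
import re

# Assumed prefix forms; the actual pipeline's normalization rules may differ.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw):
    """Return a lowercase bare DOI, or None if the input does not look like one."""
    if raw is None:
        return None
    # Strip whitespace, drop a resolver URL or "doi:" prefix, then lowercase.
    doi = _DOI_PREFIX.sub("", raw.strip()).lower()
    # Keep only strings with the "10.<registrant>/<suffix>" shape.
    return doi if doi.startswith("10.") and "/" in doi else None


# Hypothetical records from two sources that spell the same DOI differently.
records_a = {"https://doi.org/10.1000/ABC123": "dataset A record"}
records_b = {"doi:10.1000/abc123": "dataset B record"}

# Group records from both sources under the normalized DOI.
joined = {}
for source in (records_a, records_b):
    for raw, label in source.items():
        key = normalize_doi(raw)
        if key is not None:
            joined.setdefault(key, []).append(label)
```

Lowercasing is safe as a join key because DOI names are case-insensitive. Inputs that do not match the expected shape return `None` and are skipped rather than guessed at.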