Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,673 datasets
The Netherlands national government's complete procurement data for the year 2022. The dataset details expenditure on goods and services from suppliers across all ministries and departments. It is provided by the Ministry of the Interior and Kingdom Relations and is used for internal reporting and to inform the House of Representatives.
Purchasing expenditure data from the Dutch Ministry of General Affairs for the year 2020. The dataset is compiled annually from reports submitted by all ministries and subordinate departments to the Minister of the Interior and Kingdom Relations. This information is used to inform government-wide purchasing strategies and annual reports to the House of Representatives.
Purchase expenditure data for goods and services from suppliers incurred by the national government of the Netherlands. The dataset is compiled by the Ministry of the Interior and Kingdom Relations and covers the year 2022. It is used to inform category managers and the House of Representatives on government procurement and SME participation.
2018 purchasing expenditure data from the Dutch national government, aggregated by the Ministry of the Interior and Kingdom Relations. The dataset likely contains records of expenditure on goods and services from suppliers across all ministries and departments. This information is used for internal government procurement management and to report to the House of Representatives on government spending and SME participation.
RuleDepData provides six benchmark knowledge graph datasets used by the RuleDep-ICDE2027 research project. The collection includes KG20C, WN18RR, codex-m, FB15k-237, codex-l, and YAGO3-10, each containing original train/valid/test splits and preprocessed files. The dataset was authored by 'yesun' and last updated on Hugging Face in June 2026.
Purchase expenditure data for 2019 from the Dutch national government, compiled by the Ministry of the Interior and Kingdom Relations. The dataset details spending on goods and services across all ministries and departments, used for internal category management and annual parliamentary reporting. It is published under a CC0-1.0 license on the Dutch open data portal.
Polygon-based heat vulnerability index features for Metropolitan Melbourne as of 2014. This dataset is part of the Plan Melbourne Action 91 initiative, also known as Cooling & Greening or Vegetation and Urban heat mapping. Data is provided by the Department of Transport and Planning and was last updated in April 2026.
Australia's Identified Mineral Resources 2009 provides information and analysis of mineral exploration expenditures for the calendar year 2008. The report details changes in Economic Demonstrated Resources (EDR) for 18 commodities, world rankings, and resource life estimates for major minerals. It was published by the Australian Ocean Data Network.
58,035 named variable stars in the Milky Way galaxy are cataloged in this fifth edition of the General Catalog of Variable Stars (GCVS). The catalog is compiled by researchers from the Sternberg Astronomical Institute and Institute of Astronomy (Russian Academy of Sciences) and includes data for stars discovered and named by 2021. It provides information on variability types, brightness ranges, epochs, and periods.
A sample dataset extracted and curated from Table 5 of the article with DOI 10.1016/j.thromres.2015.07.019. The dataset is 5.5 KB in size, was authored by Remya Ampadi Ramachandran, and was last updated on figshare in May 2026.
A catalog of photometric redshifts for 1694 Chandra X-ray Observatory sources in the central square degree of the COSMOS field. The dataset was created by the NASA HEASARC in November 2011 based on published research, and it includes revised redshifts for XMM-detected sources. The authors achieved an accuracy sigma[Delta-z/(1+z_spec)] ~ 0.015 with a 5.8% outlier fraction by using a large spectroscopic training set and optimizing template libraries.
1735 photometric redshifts for X-ray sources detected by XMM-Newton across the 2 square degree COSMOS field. The catalog, released by NASA HEASARC in 2011, achieved an accuracy sigma[Delta-z/(1+z_spec)] ~ 0.015 with 5.8% outliers by using a large spectroscopic training set and deep H-band photometry. The study demonstrates how assumptions about source nature and the depth of available photometric bands influence redshift accuracy for active galactic nuclei.
Researchscope Papers is an open dataset of computer science research papers maintained by ResearchScope. It contains metadata for 102,058 papers and 473,434 instruction-tuning rows, aggregated from sources including arXiv, OpenAlex, ACL Anthology, OpenReview, PMLR, CVF, and Semantic Scholar. The dataset was last updated on June 7, 2026.
A figshare dataset describing a method for purifying ventricular-like cardiomyocytes from human pluripotent stem cells using CD47-based cell sorting. Created by Soon-Jung Park and last updated in April 2026, it includes DOCX and MP4 files totaling 78.2 MB. The files document the enhanced maturity and drug responsiveness of the purified cells, validated through immunocytochemistry, patch clamp, and multi-electrode array recordings.
The EXOSAT ME Slew Catalog contains information on 1210 X-ray sources detected by the European Space Agency's EXOSAT satellite during slew maneuvers between 1983 and 1986. Each entry includes detection time, raw 1-8 keV count rate, and position, with 80% of entries having proposed single identifications and corrected count rates. This database is a modified copy rebuilt by NASA HEASARC in July 1999.
Point data for the location of plaques in the Ballarat Avenue of Honour, each representing local men and women who served in World War One. The dataset includes attributes such as name, battalion, rank, enlistment date, and casualty status. It was published by the City of Ballarat and last updated on 2026-04-26.
Geoscience Australia's Australian Offshore Mineral Locations map shows mineral occurrences and deposits within Australia's 200 nautical mile exclusive economic zone and extended continental shelf. The map draws together data from published and unpublished marine research surveys and government records, covering minerals like manganese nodules, heavy mineral sand, phosphorites, diamonds, tin, copper, gold, and coal. It is the result of a collaborative project between Geoscience Australia, CSIRO's Wealth from Oceans Flagship, and State and Northern Territory Geological Surveys.
A replication package for the paper 'How Multidimensional Support Environments Shape Students' Contributions in OSS Summer Programs.' The package includes the survey instrument, anonymized valid survey responses, and a coding script for PLS-SEM analysis. The dataset is 155.3 KB in size and was last updated on May 18, 2026.
WorldCoder-Bench is a core task dataset for evaluating large language models on generating interactive, physically grounded 3D web scenes from natural language using Three.js. The benchmark, authored by shuolucs, was last updated on 2026-06-09. Generated programs must integrate 3D assets and obey spatial and physical constraints while keeping user controls synchronized with runtime state.
A 506-sample multimodal reasoning benchmark created by EthanSun and last updated on 2026-06-08. It evaluates vision-language models on their ability to remain faithful to task-relevant visual evidence when visually salient but answer-irrelevant distractions are added. Each sample includes original and distracted images, a question, answer choices, the correct answer, and the distraction specification.