General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
168,374 datasets
S2_Supplemental material error analysis contains procedures used for error analysis of model parameters, along with corresponding parameter and error tables. The document was authored by Tong Su and is available under a CC-BY-4.0 license. It was last updated on June 1, 2026.
Descriptions and results of X-ray diffraction and grain size measurements are provided in a 4.8 MB DOCX file. The dataset was authored by Tong Su and last updated on June 1, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
agentlans/advanced-readability-analysis provides syntactic and lexical complexity features calculated from English text snippets. The dataset is derived from the training split of the agentlans/readability dataset and was last updated on June 15, 2026. It is designed to help researchers study factors influencing reading difficulty beyond traditional formulas.
An inventory of datasets from the opendata.maryland.gov portal that tracks their update status and metadata completeness. The dataset includes columns for Dataset Name, Unique Identifier, Owner, Update Frequency, and Date of Most Recent Change. It was last updated on 2026-05-29 09:31:32.
Data corresponding to each plot in the manuscript 'Reassessing magnetic tunnel junction detectability for ultrasensitive sensing using small-field sensitivity and Jiles–Atherton modeling'. Each file is a simple CSV, titled to match the figure it corresponds to. Benjamin Brown published this replication data on Brown University Dataverse, and it was last updated on June 22, 2026.
Collated new Non-native species records for UK from 2003-2014 documents initial observations of non-indigenous species in the Celtic Seas and Greater North Sea region. The dataset includes the date and location for each first record, providing a temporal and spatial account of species introductions. It covers a 12-year period from January 2003 to December 2014.
This dataset contains the abundance of macrofauna, identified to species level, from benthic samples taken in the Western North Sea between 2000 and 2002. Samples were collected by Cefas using 0.1 m² Day or Hamon grabs, primarily in spring and early summer of 2000. Macrobenthos density was derived from the enumerated species counts.
UK estimates for recreational sea angling participation, catch, and economic impact from 2016 onward. The data is modelled from user responses collected via the Sea Angling Diary app. The Marine Environmental Data & Information Network is the listed organization.
An inventory of physical assets managed by the Tibasosa Municipal Mayor's Office. Data originates from the SINFA financial and administrative software's Warehouse and Inventory module. The dataset was last updated on 2026-05-18.
Four multimodal deep imbalanced regression benchmarks released for evaluating the CCC-GRPO reinforcement learning method. The dataset was created by Yao Du, Shanshan Song, and Xiaomeng Li for a paper accepted at ICML 2026. The training splits follow naturally long-tailed target distributions.
Budget execution data for the Housing and Habitat Secretariat of Colombia's Valle del Cauca department, covering multiple fiscal years. The dataset includes columns for initial budget, definitive appropriations, executed amounts, payments, and obligations. It was last updated on 2026-05-18 and is hosted on the Colombian open data portal.
Datos.gov.co hosts information on live births in Guadalajara de Buga for the year 2025. The dataset includes 30 columns covering maternal and infant health, demographics, and birth circumstances. It was last updated on May 18, 2026.
REGISTROS ACTIVOS ACTIVIDAD AGRICULTURA Y CULTIVOS JURISDICCION CCV contains registration records for active non-profit entities in the jurisdiction of the Villavicencio Chamber of Commerce. The data, sourced from www.datos.gov.co, includes entities whose last renewal year was 2022 and whose reported activities are related to agriculture and cultivation. It was last updated on May 18, 2026.
Job Seekers Allowance numbers for Camden Lower Super Output Areas. The dataset was published by the London Borough of Camden and covers the period from December 2014 to November 2015. It is hosted on the uk_data platform.
A fleet list from the London Borough of Camden, published on the uk_data platform. The data includes vehicle and fuel type information as of May 2018. The last recorded update was on 2026-07-02 07:12:52.633371.
OmniVideo-Test is a human-verified evaluation set introduced in the paper 'OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains'. It contains 505 multiple-choice question-answer pairs based on raw video files, serving as a companion to the larger OmniVideo-100K dataset. The dataset was created by MiG-NJU and was last updated on June 15, 2026.
A 2.7 GB high-resolution digital master copy of manuscript HC.MS.01764 from the Qatar National Library Heritage Collection. The dataset contains a digitized manuscript of Part 13 of the Quran, last updated on 2026-05-19. It is provided by Qatar National Library under a CC0-1.0 license.
A 2.9 GB high-resolution digital master copy of manuscript HC.MS.01765 from the Qatar National Library Heritage Collection. The dataset, published by Qatar National Library, contains a digitized version of the 14th part of the Quran. The record was last updated on 2026-05-19 11:47:25.
A 3.3 GB high-resolution digital master copy of manuscript HC.MS.01766 from the Qatar National Library Heritage Collection. The dataset, released under a CC0 1.0 license, contains the sixteenth part (juz') of the Quran. It was last updated on the platform in May 2026.
A high-resolution digital master copy of manuscript HC.MS.01763 from the Qatar National Library Heritage Collection. The dataset is a 2.9 GB ZIP file containing a digitized manuscript of the Quran, Part 2. It was published by Qatar National Library under a CC0 1.0 license and last updated in May 2026.