General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
167,789 datasets
Quarterly samples from June 2017 to June 2019 provide dissolved lead (Pb) concentrations and isotope ratios from estuarine waters in Galveston Bay, Texas. The dataset was contributed by Yerim Kim and harvested by the Texas Data Repository. It is hosted on the Dataverse platform.
Legacy product - no abstract available. This report documents geological work conducted during the relief voyage of the M.S. Thala Dan between December 1961 and March 1962. It is published on data.gov.au by the Australian Ocean Data Network.
514,597 bullet comments were collected from videos posted between January 2021 and March 2026. The data retrieval focused on videos related to second foreign languages, minority languages, and multilingual learning. Author Chenyu Wang published this dataset on figshare under a CC-BY-4.0 license.
Plate 9 from AGSO Bulletin 136 provides a map of sediment data for the Scott Reef continental shelf. The dataset is published by the Australian Ocean Data Network on data.gov.au. The record was last updated on 2026-06-17.
90,000 web-sourced images re-hosted as individual JPEG files for browser access. The dataset includes a manifest with columns for image URLs, source URLs, captions, and dimensions. It is part of a larger series of approximately 131 repositories maintained by Neomi26 on Hugging Face.
A dataset created using the LeRobot framework, likely containing sequences of robotic arm actions. The data includes 6-dimensional action features for joints such as shoulder, elbow, wrist, and gripper positions. It was authored by 'imstevenpmwork' and last updated on June 16, 2026.
Colombian data on applicants for scholarships to train researchers in doctoral programs abroad, forming a bank of eligible candidates from 2010 to 2016. The dataset is hosted by datos.gov.co and was last updated on 2026-05-18. It includes applicant demographics, academic fields, and potential host institutions.
OSNI Open Data - BenchMarks - Height provides point data showing the location of survey benchmarks and their associated height values in metres above mean sea level. The dataset is published by the Government Digital Service for OpenData under an Open Government Licence. The data is not maintained, and users are advised to use a Global Navigation Satellite System (GNSS) and the latest geoid model for orthometric height in Northern Ireland.
An inventory registry for tracking the quantity and details of files within an archive. The dataset includes columns for contract numbers, series codes, storage units, and consultation frequency. It originates from the Colombian open data portal www.datos.gov.co and was last updated on May 18, 2026.
Signaling data from studies on D2-like dopamine receptor activation of G proteins. The dataset is a 327.8 KB XLSX file authored by Cesare Orlandi and shared under a CC-BY-4.0 license. It was last updated on June 2, 2026.
D3R signaling selectivity data relates to studies on D2-like dopamine receptor activation of G proteins. The dataset is a 327.8 KB XLSX file authored by Cesare Orlandi and shared under a CC-BY-4.0 license. It was last updated on June 2, 2026.
415.3 KB of source data underpinning the main figures in the manuscript 'High-fidelity electrical detection of spin transport in graphene'. The data was authored by Yijie Lin and is available under a CC-BY-4.0 license. It was last updated on June 2, 2026, on the figshare platform.
Faculty characterization data for professors linked to the Institución Universitaria Mayor de Cartagena. The dataset includes columns for contract type, academic titles, workload distribution, and entry dates. It is provided by www.datos.gov.co and was last updated on 2026-05-18.
Alaska's lakes and ponds are mapped in this spatially explicit dataset showing seasonal fluctuations. It contains over 800,000 water bodies larger than 0.001 km², derived from Sentinel-2 imagery at 10-meter resolution for the 2016-2021 ice-free seasons. The dataset was produced by ORNL_CLOUD using a U-Net model and an adaptive NDWI threshold algorithm in Google Earth Engine.
161.0 KB of supplementary data on point counting methods for the Helsby Sandstone Formation. The dataset, authored by Xiang Yan, is available in XLSX format under a CC-BY-4.0 license and was last updated on June 2, 2026. It supports a source-to-sink synthesis of a Middle Triassic river system in the British Isles.
Russian-language text data designed for evaluating Personally Identifiable Information detection and Named Entity Recognition systems. The dataset was created by redmadrobot-rnd and last updated on 2026-06-09. It targets guardrail and anonymization pipelines that must find personal data and Russian identity-document numbers.
2021 to 2030 monthly estimates of hillslope cover erosion, measured in tonnes per hectare per month, for the state of New South Wales. The data was published by the NSW Department of Climate Change, Energy, the Environment and Water and is available under a CC-BY-4.0 license. The dataset was last updated on the platform in May 2026.
31.6 KB of dragonfly assemblage data collected during a typical wet year and a below-average dry year in northern KwaZulu-Natal, South Africa. The dataset was authored by Charl Deacon and last updated on June 1, 2026. It is available in CSV format under a CC-BY-4.0 license.
MetaPhlAn profiles for all 4,845 samples aggregated from 16 distinct studies. The dataset was authored by Jean-Sebastien Gounot and last updated on 2026-05-25.
Proportion of employers reporting a skills shortage vacancy as a percentage of all employers in London. The dataset is provided by the Greater London Authority and was last updated on June 24, 2026. It likely contains aggregated survey or administrative data on labor market challenges.