General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
169,570 datasets
An animated CD-ROM product developed by GA and Skyring Environment Enterprises using Authorware software for state-of-the-art visual presentation. The Australian Ocean Data Network hosts this guide, last updated on 2026-06-04. Its content focuses on the biogeochemistry of sediments and water in Australian coastal ecosystems.
91.5 KB of raw data on laying hen performance and egg quality from a study using a feed additive that combines carrot and anchovy meal. The dataset was authored by Yuli Frita Nuningtyas and last updated on May 28, 2026. It is shared under a CC-BY-4.0 license on figshare.
Financial Incentives for Recycling Scrap Tires (FIRST) program collection and recycling data in tonnes from 1991 to 2006. The data was produced by the Government of British Columbia and covers the period before the program transitioned to Tire Stewardship BC in 2007. It is available in CSV and HTML formats under the OGL-CA-2.0 license.
Nova Scotia's Property Tax Rebate for Seniors program data, provided by the Government of Nova Scotia. It likely contains counts of applications received, approved, and denied, along with average rebate amounts and their distribution, as of April 2026. The rebate covers 50% of the previous year's property tax, up to a maximum of $800.
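The rebate rule stated in this entry (50% of the previous year's property tax, capped at $800) can be sketched as a one-line calculation. This is only an illustration of the arithmetic; the function name `seniors_rebate` and its parameters are hypothetical and are not part of the dataset or any government API.

```python
def seniors_rebate(previous_year_tax: float,
                   rate: float = 0.50,
                   cap: float = 800.0) -> float:
    """Return the rebate for a given previous-year property tax bill.

    Assumption: the rule is exactly "rate * tax, capped at cap", as the
    catalog entry describes; eligibility rules are not modelled here.
    """
    if previous_year_tax < 0:
        raise ValueError("property tax cannot be negative")
    # Half of last year's tax, but never more than the maximum rebate.
    return min(previous_year_tax * rate, cap)


if __name__ == "__main__":
    for tax in (1000.0, 1600.0, 2400.0):
        print(f"tax ${tax:,.2f} -> rebate ${seniors_rebate(tax):,.2f}")
```

Under this rule the cap is reached once the previous year's tax is $1,600 or more.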
NASA's Multi-Instrument Fused bias-corrected XCO2 dataset provides daily gridded carbon dioxide mole fraction (XCO2) and other select variables. The data is produced by applying local kriging to daily aggregates of bias-corrected observations from the OCO-2 and GOSAT satellites. This Level 4 product represents a fused, spatially interpolated view of atmospheric carbon dioxide.
A 2026 environmental summary record from the Australian Ocean Data Network describes the physical properties of the seabed. It covers the Ceduna and Eyre Sub-basins, providing a geospatial overview of the marine environment. The data is available in PDF and HTML formats.
Raw data and a codebook for the BSI project, authored by Leila Salimova and last updated on 2026-05-28. The dataset is a 35.8 KB XLSX file available under a CC-BY-4.0 license.
CADGenBench provides public inputs for a benchmark measuring AI systems' ability to produce correct 3D mechanical parts as STEP files. The dataset, created by HuggingAI4Engineering, contains only the task inputs, with ground truth withheld in a separate private repository to maintain evaluation integrity. It was last updated on June 8, 2026.
REPORTE VICTIMAS DESPLAZAMIENTO ANUALIZADO OCURRENCIA Y LLEGADA, CIFRA NACIONAL (annualized report of displacement victims by occurrence and arrival, national figures) provides structured records on displacement victims in Colombia. The dataset includes columns for ethnicity (ETNIA), disability status (DISCAPACIDAD), age group (CICLO_VITAL), and event details (HECHO, EVENTOS). It is hosted on the Colombian open data portal www.datos.gov.co and was last updated on 2026-05-20.
Distribution points data published on figshare. The dataset is available as a 29.3 KB XLSX file and was last updated on June 4, 2026. It was authored anonymously for a double-blind review process.
Carrie Bow Cay, Belize, water level data collected at the Smithsonian Marine Field Station using CS450 sensor and cs475a2 Radar Gauge instrumentation. The dataset, authored by Valerie Paul and last updated in May 2026, is 22.7 MB in size and available under a CC-BY-4.0 license.
Water level data collected at the Smithsonian Marine Station in Fort Pierce, Florida, from 2023 to 2025. The 17.7 MB dataset was uploaded by Dean Janiak and last updated on May 1, 2026. Measurements were taken using OTT Compact Bubbler System and cs475a2 Radar Gauge instrumentation at coordinates 27.460121 N, 80.31134 W.
3D images and text describing Australia's Southeast Marine Region, aggregated by the Australian Ocean Data Network. The dataset was last updated on 2026-06-04 06:28:46.996714 and is available in PDF and HTML formats.
Orthophotos are georeferenced aerial images corrected for terrain deformation using a digital terrain model. The 2015 digital orthophotos (DOPs) are provided by the Bundesamt für Kartographie und Geodäsie as four-channel images (RGBI). From these, panchromatic, color, and color-infrared image types can be derived.
CSIRO Marine National Facility collected oceanographic profiles of dissolved iron (DFe) at the Southern Ocean Time Series during voyages IN2018_V02 and IN2019_V02 aboard RV Investigator. Profiles were obtained using a 12-bottle trace metal rosette following GEOTRACES procedures. Full sampling and analytical details are documented in Ellwood et al. (2020a) and Ellwood et al. (2020b).
Content-free model outputs for the ARCK v0.1 leaderboard, created by IQuestLab. The dataset includes predictions and scores for tasks across tracks like General, Co-Scientist, and Bio-Designer. It was last updated on June 9, 2026.
A benchmark sample for end-of-turn detection in speech, last updated on 2026-06-09. The dataset is a deterministic sample of up to 400 turns per language from the full LiveKit benchmark; languages with fewer than 400 turns contribute all of their available data. It was created by LiveKit to support the development and evaluation of models that detect when a speaker has finished a turn in conversation.
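One way to draw a deterministic, capped per-language sample like the one this entry describes is to rank the turns in each language by a stable hash of their identifier and keep the first 400. The sketch below is an assumption about how such a sample could be built, not LiveKit's actual procedure; the record fields `id` and `language` and the function name `sample_turns` are hypothetical.

```python
import hashlib
from collections import defaultdict


def sample_turns(turns, per_language_cap=400):
    """Return {language: [turn, ...]} with at most per_language_cap turns each.

    Assumption: every turn is a dict with hypothetical string fields
    "id" (unique) and "language". Ranking by a SHA-256 digest of the id
    makes the selection independent of input order and of random seeds.
    """
    by_language = defaultdict(list)
    for turn in turns:
        by_language[turn["language"]].append(turn)

    sample = {}
    for language, items in by_language.items():
        ranked = sorted(
            items,
            key=lambda t: hashlib.sha256(t["id"].encode("utf-8")).hexdigest(),
        )
        # Languages with fewer turns than the cap keep everything.
        sample[language] = ranked[:per_language_cap]
    return sample


if __name__ == "__main__":
    demo = [{"id": f"en-{i}", "language": "en"} for i in range(500)]
    demo += [{"id": f"fr-{i}", "language": "fr"} for i in range(3)]
    picked = sample_turns(demo)
    print({lang: len(rows) for lang, rows in picked.items()})
```

Hashing the identifier rather than shuffling with a seeded random generator keeps the selection stable even when the order of the input records changes.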
A flyer for a collaborative research project between Geoscience Australia and the Japan Agency for Marine-Earth Science and Technology (JAMSTEC). The project aims to better understand the geology, tectonics, and paleoenvironment of the central Lord Howe Rise region. The document is a PDF flyer from 2018, hosted by the Australian Ocean Data Network.
60 episodes of robot telemetry data created using LeRobot. The dataset contains 27,140 frames recorded at 30 frames per second and is structured into chunks for training. It was authored by Gogul99 and last updated on June 15, 2026.
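The figures in this entry imply a total recording time and an average episode length, which the short calculation below derives. It assumes every frame was captured at exactly 30 frames per second and that the 27,140 frames are the total across all 60 episodes; the variable names are illustrative only. Under those assumptions the recording runs to about 905 seconds in total, or roughly 15 seconds per episode on average.

```python
EPISODES = 60        # number of episodes stated in the entry
FRAMES = 27_140      # total frames stated in the entry
FPS = 30             # recording rate stated in the entry

# Total recording time, assuming a constant frame rate.
total_seconds = FRAMES / FPS

# Averages per episode; individual episodes may differ in length.
mean_frames_per_episode = FRAMES / EPISODES
mean_seconds_per_episode = total_seconds / EPISODES

if __name__ == "__main__":
    print(f"total duration: {total_seconds:.1f} s")
    print(f"mean frames per episode: {mean_frames_per_episode:.1f}")
    print(f"mean episode length: {mean_seconds_per_episode:.2f} s")
```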
OD0040 Water Well Records is a dataset published by the Government of Prince Edward Island on the open_canada platform. It likely contains information about water wells, such as location and construction details. The dataset was last updated on 2026-06-03.