General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
142,359 datasets
Results covering December 1, 2010 to October 30, 2022 from a 4 km-resolution biogeochemistry and sediments model of the Great Barrier Reef, forced by a hydrodynamic model and a baseline catchment scenario. The dataset is produced by the eReefs project and hosted by the Australian Ocean Data Network. The model configuration is identified as gbr4_H4p0_ABARRAr2_OBRAN2020_FG2Gv3_B4p2_Cq5b_Dhnd.
Panel data for 284 Chinese cities from 2011 to 2022, used to analyze the effect of New Urbanization (NEU) on urban green innovation (GI). The study constructs two-way fixed-effects and multivariate moderation models and reports statistically significant positive effects. The dataset, authored by Liang Fang and licensed under CC-BY-4.0, contains the regression results supporting this analysis.
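A two-way fixed-effects estimate of the kind mentioned above can be illustrated with the within transformation. The following is a minimal sketch, not the study's actual specification: it assumes a balanced panel, a single regressor, and no control or moderation terms, and the function names (`two_way_demean`, `twfe_slope`) are hypothetical.

```python
from collections import defaultdict


def _group_means(values, key_index):
    """Mean of the values grouped by one component of the (unit, period) key."""
    totals, counts = defaultdict(float), defaultdict(int)
    for key, value in values.items():
        totals[key[key_index]] += value
        counts[key[key_index]] += 1
    return {group: totals[group] / counts[group] for group in totals}


def two_way_demean(values):
    """Subtract unit and period means and add back the grand mean.

    `values` maps (unit, period) -> float. This transformation removes unit
    and period fixed effects exactly only when the panel is balanced.
    """
    unit_mean = _group_means(values, 0)
    period_mean = _group_means(values, 1)
    grand_mean = sum(values.values()) / len(values)
    return {
        (unit, period): value - unit_mean[unit] - period_mean[period] + grand_mean
        for (unit, period), value in values.items()
    }


def twfe_slope(x, y):
    """OLS slope of demeaned y on demeaned x (single regressor, no controls)."""
    x_dm, y_dm = two_way_demean(x), two_way_demean(y)
    numerator = sum(x_dm[key] * y_dm[key] for key in x_dm)
    denominator = sum(value * value for value in x_dm.values())
    return numerator / denominator


# Synthetic illustration only: 4 units, 3 periods, true slope of 2.
x = {(i, t): float(i * t) for i in range(1, 5) for t in range(1, 4)}
y = {key: 2.0 * x[key] + 10 * key[0] - 3 * key[1] for key in x}
slope = twfe_slope(x, y)
```

In this synthetic panel the unit effect (`10 * unit`) and period effect (`-3 * period`) are removed by the demeaning step, so the recovered slope matches the true coefficient up to floating-point error.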
Australian Ocean Data Network provides a scientific note on submarine phosphorite deposits. The document focuses on deposits off California and Mexico, which are believed to be of immense size and economically exploitable. It suggests the continental shelf south of Australia as a logical area for exploration based on similar formation conditions.
Part of the Nova Scotia Topographic Database (NSTDB), this layer contains road, trail, and rail line features. Data is updated and maintained from aerial photography and field-collected GPS, with inspections verifying attributes like surface type and lane count. The dataset is provided by data.novascotia.ca and was last updated on 2026-05-05.
Rivers Strahler Ranking is a geospatial dataset classifying river segments in Northern Ireland using the Strahler stream order system. The dataset defines the hierarchical branching structure of a river network, where the order increases when two streams of the same order merge. It is published by the Government Digital Service under an open license.
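The Strahler rule described above can be sketched in a few lines. This is an illustrative example only, not the method used to build the dataset: it assumes the river network is a tree given as a hypothetical `upstream` mapping from each segment to the segments flowing directly into it, with headwater segments assigned order 1.

```python
def strahler_orders(upstream):
    """Return a dict mapping each segment id to its Strahler order.

    `upstream` maps a segment id to the list of segment ids that flow
    directly into it. Segments with no upstream entries are headwaters.
    """
    orders = {}

    def order(segment):
        if segment in orders:
            return orders[segment]
        tributary_orders = [order(s) for s in upstream.get(segment, [])]
        if not tributary_orders:
            result = 1  # headwater segment
        else:
            highest = max(tributary_orders)
            # The order increases only when at least two tributaries
            # share the highest order; otherwise it is carried downstream.
            if tributary_orders.count(highest) >= 2:
                result = highest + 1
            else:
                result = highest
        orders[segment] = result
        return result

    for segment in upstream:
        order(segment)
    return orders


# Hypothetical network: two order-1 streams merge into "a" (order 2),
# which then meets the lower-order stream "b" at the outlet.
network = {"outlet": ["a", "b"], "a": ["h1", "h2"], "b": ["h3"]}
orders = strahler_orders(network)
```

The recursive form keeps the sketch short; a very long real-world network could exceed Python's recursion limit and would need an iterative traversal instead.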
Two large Multivariate Time-Series (MTS) databases from the aviation domain, each containing several millions of observations, are used to benchmark search algorithms. The data was provided by the National Aeronautics and Space Administration and the associated paper was last updated on 2026-04-10. The research focuses on a flexible subsequence search framework that enables querying on any subset of variables with arbitrary time delays.
Peatland ACTION Completed restoration project site centroids provide geographic centroids for completed peatland restoration projects in Scotland. The data is represented as polygons created by buffering spatial data from on-site restoration techniques, offering a visual representation of project size and shape. This dataset is provided by the Scottish Government SpatialData.gov.scot and was last updated on 2026-06-02.
Polygon data representing the geographic extent of completed peatland restoration projects in Scotland. The Scottish Government SpatialData.gov.scot provides these footprints, which are created by buffering spatial data from on-site restoration techniques. Users can interact with the polygons to view detailed project information, such as financial year end and project area size.
Global ocean data provides daily 1-degree gridded estimates of water vapor in the marine atmospheric boundary layer beneath uniform cloud fields. The dataset is derived from microwave radiometry (AMSR-E/AMSR-2) and near-infrared imagery (MODIS) to calculate the vapor between the surface and cloud top. Version 2 uses an improved methodology to screen out high clouds.
Marie-Annick Moreau uploaded an audio recording titled "Explanation of the 'Ng'ongole' song" to figshare on June 3, 2026. In the 26.9 MB WAV file, women explain the meaning of a song that asks God to bring peace to politicians, from the President to local leaders, so that they have faith in MAM's intentions and allow her to come to Tanzania. The dataset is licensed under CC-BY-NC-SA-4.0.
39.6 KB of interview data in EAF format, authored by Marie-Annick Moreau and last updated on 2026-06-03. The data captures a group interview where Abdalah Saidi Mwingo describes the initial steps of erecting a fence using stakes, and Lumolumo explains the meaning of his prayer and opening ritual.
2024 inventory of administrative procedures and services provided by the Department of Boyacá, Colombia. The dataset includes 90 procedures and one Other Administrative Procedure (OPA) registered in the Unified Information System for Procedures (SUIT), with 17 included in the 2024 Procedure Rationalization Strategy. It is published by www.datos.gov.co and was last updated on May 18, 2026.
Registered City Lobbyists is a public record of individuals and firms required to register with the Los Angeles City Ethics Commission. The dataset tracks lobbyist registration details, including names, associated firms, registration dates, and contact information. It supports transparency and oversight of lobbying activities directed at city officials.
VIIRS/NPP Nadir BRDF-Adjusted Reflectance Daily L3 Global 1 km SIN Grid Near Real Time (NRT) provides Nadir BRDF-Adjusted Reflectance (NBAR) estimates at 1-kilometer resolution. The product is generated daily using a 16-day rolling window of VIIRS data and employs the RossThick/Li-Sparse-Reciprocal BRDF model to correct for view-angle effects. It includes 18 Science Dataset layers for quality assessment and nadir reflectance across nine VIIRS moderate bands.
NASA's Delta-X project provides a model simulating island and secondary channel evolution in river deltas, focusing on sediment dynamics, erosion, and water level changes. The dataset includes model code and outputs for the Mississippi River Delta and nine other major global deltas. Files are provided in MATLAB and NetCDF formats.
Lottery results for the retired Cash 4 Life game from the New York Lottery. The dataset includes draw dates, winning numbers, and cash ball numbers from 2014 through 2026. It is published by data.ny.gov and was last updated in May 2026.
Cambridgeshire County Council data details where public funds are allocated for home care services for older people, broken down by electoral ward. The dataset includes figures for the 2015/16 and 2017/18 financial years. Its presence on multiple government data platforms indicates its use for public spending transparency.
North American Forest Dynamics (NAFD) products map the primary cause of forest canopy cover loss across the conterminous United States from 1986 to 2010. The dataset contains four raster layers derived from Landsat imagery, classifying change events such as fire, removals, stress, wind, and conversion. It also includes layers for event year and model confidence metrics, depicting the greatest magnitude disturbance per pixel over the 25-year period.
Provisional data reported by EU Member States under the Urban Waste Water Treatment Directive (UWWTD) for the 13th reporting period. The dataset includes tables on agglomerations, treatment plants, discharge points, and sludge handling, compiled by the European Environment Agency. Data from Romania and Slovenia are not included in this provisional version.
A spreadsheet from the Environment Agency models potential percentage reductions to water abstraction licenses needed to meet environmental flow requirements for water bodies across England. The data provides a national-scale starting point for planning, indicating the scale of future change abstractors may need to consider. Version 1.3 includes additional water bodies and corrections from September 2025.