General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
157,580 datasets
NASA's MEaSUREs program provides a daily record of global landscape freeze/thaw status at 6 km resolution. The data is derived from microwave radiometer observations by JAXA's AMSR-E and AMSR2 instruments. This dataset is maintained by NASA and is available on multiple government platforms.
Geoscience Australia's Science Principles document outlines six foundational principles guiding its scientific work. The principles, which include Relevance to Government and Quality Science, are embedded into the agency's long-term strategic planning and daily operations. The document, published by the Australian Ocean Data Network, was last updated on May 5, 2026.
Embodied-R1.5-SFT-Dataset is a subset of the Stage 1 Supervised Fine-Tuning data used to train the Embodied-R1.5 model. The dataset is hosted by author IffYuan on HuggingFace, with a partial release noted as of June 9, 2026. The full dataset is described as a work in progress, with JSON files still being uploaded.
1,507 episodes of robot demonstrations for the Tower of Hanoi puzzle, comprising 3,264,454 frames at 60 frames per second. The dataset was created using LeRobot and is hosted on Hugging Face by the author jellyho. It was last updated on 2026-06-13.
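As a rough sanity check on the Tower of Hanoi figures above, the frame count and frame rate imply a total recording time and a mean episode length. The sketch below assumes every counted frame was captured at a constant 60 frames per second and that the frame total is not multiplied across several camera streams; neither assumption is confirmed by the listing.

```python
# Figures quoted in the dataset description.
EPISODES = 1_507
FRAMES = 3_264_454
FPS = 60  # assumed constant across all episodes

# Total recording time implied by the frame count.
total_seconds = FRAMES / FPS
total_hours = total_seconds / 3600

# Mean episode length, in frames and in seconds.
mean_frames_per_episode = FRAMES / EPISODES
mean_seconds_per_episode = mean_frames_per_episode / FPS

print(f"total: {total_hours:.1f} h")
print(f"mean episode: {mean_frames_per_episode:.0f} frames, "
      f"{mean_seconds_per_episode:.1f} s")
```

Under those assumptions the collection works out to about 15 hours of demonstrations, with an average episode of roughly 36 seconds.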
Global lightning signatures were detected from visible channel imagery by the Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) flown on satellite F12. The dataset contains extracted time and location data for each lightning streak, stored in monthly HDF files. It was produced by the National Aeronautics and Space Administration and covers a seven-month period from May through November 1995.
1.6 MB of mass spectrometry data from a study identifying photoproducts of the antibiotic sulfamethoxazole (SMX) under environmentally relevant UV irradiation (300–350 nm). The dataset was authored by Pavla Fojtíková and last updated in June 2026. Files are provided in XML and MGF formats.
NSIDC satellite data aids investigations of variability and trends in sea ice cover. It provides measurements of sea ice concentration, extent, ice-covered area, persistence, and monthly climatologies. The dataset is produced by the National Aeronautics and Space Administration (NASA).
Contract disclosure reports for the Department of Sport, Racing and Olympic and Paralympic Games for the first two quarters of the 2025-26 financial year. The data was published by the Queensland Government's Sport, Racing and Olympic and Paralympic Games organization and was last updated on May 29, 2026. It is available as a CSV file under a Creative Commons Attribution 4.0 license.
Queensland Corrective Services publishes monthly counts of specific incident types within custodial centers for the year 2020. The dataset likely contains tabular time-series data tracking incidents over 12 months. It is available under a Creative Commons license and is published on multiple government data platforms.
Data from data.colorado.gov lists all state liaisons (lobbyists) and the years they were registered with the Colorado Department of State (CDOS). The dataset includes lobbyist names, contact information, associated state agencies, and registration status. It was last updated on 2026-05-29 11:09:30.
SERBOT navigation episodes collected for OmniVLA-style vision-language-action experiments. The dataset contains 82 valid episodes, 4,343 frames, and 4,342 actions across five distinct tasks. It was created by kjw6213 and last updated on June 4, 2026.
December 2018 to April 2019 model results from a 4km-resolution regional-scale biogeochemistry and sediments simulation of the Great Barrier Reef. The dataset, part of the eReefs simulation suite, represents a hindcast run with a pre-industrial catchment scenario, forced by a hydrodynamic model and specific catchment inputs. It serves as a comparative benchmark alongside baseline and reduced-load catchment scenarios.
Estadísticas Solicitudes Demanda shows all restitution requests (lawsuits) filed by the Administrative Unit for Dispossessed and Abandoned Lands before specialized land restitution judges, as mandated by Law 1448 of 2011. The dataset is published by www.datos.gov.co and was last updated on 2026-05-18. It likely contains counts of legal demands broken down by regional office, territorial office, year, and month.
Annual averages from 2006 to 2015 present wage information for employees in Canada and its provinces. The dataset, a customization of Statistics Canada data, includes average hourly and weekly wage rates broken down by immigrant status, industry, type of work (full- and part-time), and sex. It is published by the Government of Alberta.
Annual average wage data from 2006 to 2015 for Canada and provinces, customized from Statistics Canada. It presents average hourly and weekly wage rates for employees categorized by type of work, immigrant status, industry, and sex. The dataset is provided by the Government of Alberta.
3,022,656 data points form the January 2002 edition of the first integrated onshore/offshore magnetic anomaly grid for the complete Australian margin. The grid covers 8°S to 52°S and 106°E to 172°E with a cell size of 0.01 degree, approximately 1 km, and values are in nanoTesla (nT). It was created by combining levelled and unlevelled marine data sectors with an earlier onshore grid, though mismatches exist at some onshore/offshore joins.
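The "0.01 degree, approximately 1 km" cell size quoted for the magnetic anomaly grid can be illustrated with a short calculation. This is a minimal sketch assuming a spherical Earth with a mean radius of 6371 km; the helper name `cell_size_km` is hypothetical and the result is only an approximation, since the grid's actual projection and datum are not stated in the listing.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius
CELL_DEG = 0.01           # cell size quoted in the description


def cell_size_km(lat_deg, cell_deg=CELL_DEG):
    """Return (north-south, east-west) extent in km of one cell at a latitude."""
    # Arc length of cell_deg along a meridian.
    ns = math.radians(cell_deg) * EARTH_RADIUS_KM
    # East-west extent shrinks with the cosine of latitude.
    ew = ns * math.cos(math.radians(lat_deg))
    return ns, ew


# Northern edge, a mid latitude, and southern edge of the stated coverage.
for lat in (-8.0, -30.0, -52.0):
    ns, ew = cell_size_km(lat)
    print(f"lat {lat:6.1f}: {ns:.2f} km N-S, {ew:.2f} km E-W")
```

The north-south extent stays near 1.1 km throughout, while the east-west extent narrows toward the southern edge of the grid, so "approximately 1 km" holds best at the lower latitudes.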
A 2022 inventory of information assets published by ICFES, the Colombian Institute for Educational Evaluation. The registry lists available records, their formats, and access points for public use. It is hosted on the Colombian open data portal, datos.gov.co.
Marco Vinicio Alban-Paccha published an eligibility matrix for healthy control cohorts in remote mobile app and wearable sensor sub-studies on figshare. The matrix details common inclusion criteria and sub-study-specific exclusions. The dataset is a 5.5 KB Excel file last updated on May 21, 2026.
Graham Elliott's replication dataset for the paper 'Combining Forecasts - On Why Averaging Beats Optimal Linear Weights'. The 168.3 MB collection includes code and data files supporting the analysis of the forecast combination puzzle. The dataset was last updated on 2026-04-27 and is shared under a CC-BY-4.0 license.
Individual monthly hillslope cover erosion rates, measured in tonnes per hectare per month, are provided for the state of New South Wales for the year 2017. The dataset is published by the NSW Department of Climate Change, Energy, the Environment and Water under a CC-BY-4.0 license. It was last updated on 2026-05-18.