General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
144,168 datasets
Beginning in 2013, this dataset contains the winning numbers and prize amounts for the New York Lottery's Quick Draw game. The data includes details for each draw, such as the date, time, and number sequence. Columns suggest it can be used to analyze draw frequency, prize distribution, and number patterns.
Seaglider autonomous vehicles collected vertical profiles to depths of 1000 meters offshore of San Francisco during the Sub-Mesoscale Ocean Dynamics Experiment. The dataset contains in-situ measurements of temperature, dissolved oxygen, salinity, chlorophyll, and scattering to study short-scale ocean dynamics. Observations were made during two intensive field campaigns in Fall 2022 and Spring 2023.
1.6 GB of global land surface evapotranspiration data produced by the calibration-free Complementary Relationship method. The dataset covers the period from 1982 to 2016 with spatial and temporal resolutions of 0.25-degree and monthly, respectively. It was authored by Ning Ma and is available under a CC-BY-4.0 license.
Approximately 300 km offshore of San Francisco, autonomous Seagliders collected vertical profiles of temperature, dissolved oxygen, salinity, and other variables during the Sub-Mesoscale Ocean Dynamics Experiment (S-MODE). The dataset covers two intensive operating periods in Fall 2022 and Spring 2023, with gliders diving to depths of up to 1000 meters. It supports research into how short-scale ocean dynamics influence the vertical exchange of physical and biological properties.
A 15.8 MB PDF report documents rescue excavations conducted by the Starokyivska expedition in 2009 on Dytynka Hill in Kyiv. The work uncovered two burials and one reburial, likely from an Old Rus cemetery, though erosion and a lack of clear dating material complicate precise dating. The report was authored by the Starokyivska expedition of the Institute of Archaeology, NAS of Ukraine.
Global satellite observations from the VIIRS instrument aboard the Suomi NPP spacecraft. This dataset contains unpacked, raw science, calibration, and engineering data, including ephemeris, attitude, and spacecraft telemetry for 6-minute swaths. The data is provided in NetCDF5 format as a Near-Real-Time (NRT) product.
VIIRS/JPSS2 Raw Radiances in Counts 6-Min L1A Swath NRT contains unpacked, raw science, calibration, and engineering data from the VIIRS instrument. The product includes extracted ephemeris and attitude data from spacecraft diary packets, plus raw ADCS and bus-critical spacecraft telemetry. Users can download the VIIRS Level 1 Product User's Guide for detailed documentation.
Statistics Canada provides data on police services expenditures from 2018 to 2025. The dataset includes detailed categories for salaries, benefits, non-salary operating costs, and capital expenditures. It covers police officers, civilian personnel, and spending on vehicles, buildings, IT, and equipment.
11.4 KB of co-occurrence values for parasite pairs, calculated using all pairwise complete cases. The dataset, authored by Stephanie M. Wu, was last updated on June 4, 2026. It contains counts for four possible states of parasite presence and absence.
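As a rough illustration of what "four possible states" under pairwise complete cases can mean, the sketch below counts both-present, only-first, only-second, and neither-present hosts for every parasite pair, skipping a host for a given pair only when one of the two parasites was not assessed. The function name, the record layout, and the parasite labels (P1, P2, P3) are hypothetical and are not taken from the dataset; this is not necessarily how the author computed the values.

```python
from itertools import combinations


def cooccurrence_counts(records, parasites):
    """Count the four presence/absence states for every parasite pair.

    records: list of dicts mapping parasite name -> 1 (present),
             0 (absent), or None (not assessed). Hypothetical layout.
    """
    results = {}
    for a, b in combinations(parasites, 2):
        counts = {"both": 0, "only_a": 0, "only_b": 0, "neither": 0}
        for rec in records:
            x, y = rec.get(a), rec.get(b)
            # Pairwise complete cases: skip a host only for pairs
            # where at least one of the two values is missing.
            if x is None or y is None:
                continue
            if x and y:
                counts["both"] += 1
            elif x:
                counts["only_a"] += 1
            elif y:
                counts["only_b"] += 1
            else:
                counts["neither"] += 1
        results[(a, b)] = counts
    return results


# Made-up example hosts, for illustration only.
hosts = [
    {"P1": 1, "P2": 1, "P3": 0},
    {"P1": 1, "P2": 0, "P3": None},
    {"P1": 0, "P2": 1, "P3": 1},
    {"P1": 0, "P2": 0, "P3": 0},
    {"P1": None, "P2": 1, "P3": 1},
]
table = cooccurrence_counts(hosts, ["P1", "P2", "P3"])
```

Because completeness is judged per pair, the number of hosts contributing to each pair can differ: here the P1/P2 pair uses four hosts while the P1/P3 pair uses only three.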
A dataset derived from Sentinel-3 satellite imagery provides daily large-scale surface-water monitoring across Kazakhstan from 2020 to 2024. It includes structured point-set representations of water bodies and basin-level next-day forecasts generated using a multi-factor deep learning model incorporating meteorological variables and temporal patterns. The dataset was created by Xiande Wu and shared under a CC-BY-4.0 license.
A register of contracts and grants awarded by Norfolk County Council, published as open data. The dataset is provided in CSV format and is licensed under the Open Government Licence version 3.0. It was last updated on 2026-05-28.
A June 2023 snapshot of Benefit and Collective Interest (BIC) companies registered with the Chamber of Commerce of Honda, Guaduas and North Tolima, Colombia. The dataset was published on datos.gov.co in response to a citizen request under open data initiatives. It includes columns for financials, registration dates, contact information, and company activities.
Sanitary Sewer Overflows (SSO) are releases of untreated sewage into the environment. The City of Bloomington Utilities Department records and maintains data for all SSO events within its wastewater system, with each event reported to the Indiana Department of Environmental Management. Columns suggest this dataset contains details on the location, volume, duration, and environmental context of each overflow event.
Lulu Liu published evaluation metrics on figshare in April 2026 for a proposed wideband signal processing framework. The framework uses a RepViT backbone network to perform detection, recognition, and key parameter extraction on signal spectrograms. Experimental results on a synthetic dataset include a maximum signal recognition rate of 82.43% and normalized root mean squared error (NRMSE) values for the time and frequency parameters.
Hourly surface meteorological data from 1989 provides a cross-sectional view of weather conditions in and around the FIFE study area. The dataset includes variables such as atmospheric pressure, surface and dew point temperatures, wind speed, and cloud properties. These observations can serve as input or verification data for numerical simulation models of the atmosphere.
Michael Perlin created this dataset on June 4, 2026, as part of a pipeline to predict fungal effectors. It contains WoLFPSORT analysis results for 296 proteins from the predicted proteome of the fungus Microbotryum intermedium. Each entry is identified by a JGI Gene ID beginning with 'BQ2448_'.
Ongoing data collection from the MOPITT instrument launched aboard NASA's Terra spacecraft on December 18, 1999. This dataset contains beta version 109 monthly gridded means of carbon monoxide profiles and total column retrievals derived from near-infrared radiances, including averaging kernels. The data is produced by the Canadian Space Agency and NASA and is subject to recalibration.
Slocum glider in-situ measurements capture temperature and salinity profiles to 1000 m depth during the Sub-Mesoscale Ocean Dynamics Experiment (S-MODE). The dataset supports the S-MODE goal of understanding how short-scale ocean dynamics influence vertical exchange of physical and biological variables. Data were collected approximately 300 km offshore of San Francisco during a pilot campaign in October 2021 and intensive periods in Fall 2022 and Spring 2023.
Geoscience Australia produced the January 2002 edition of the Magnetic Anomaly Grid of the Australian Region. This version is the first integrated onshore/offshore magnetic anomaly grid for the complete Australian margin, covering 8°S to 52°S and 106°E to 172°E with a grid cell size of 0.01 degree (approximately 1 km). The database contains 3,022,656 data points from which the marine grid was created.
Projected future flood susceptibility maps for 2050, 2070, and 2100 were generated using an XGBoost model trained on major floods from 2005 to 2023. The model, created by Natural Resources Canada, was applied to future climate scenarios under SSP2-4.5 and SSP5-8.5, using temperature and precipitation time series. These maps represent model projections and should be interpreted as indicators of potential flood susceptibility, not precise forecasts.