General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
141,875 datasets
WFP's Automated Disaster Analysis and Mapping (ADAM) system provides geospatial data for a Category 1 storm event centered near latitude 15.7, longitude -103.1 in Mexico on June 8, 2025. The dataset is produced by the World Food Programme's operational system for analyzing sudden-onset humanitarian emergencies. It was last updated on May 21, 2026, and is available in SHP and CSV formats under a CC-BY-SA-4.0 license.
ADAM ID 1001194 documents a tropical storm event from August 7 to August 13, 2025, with its center near latitude 24.3, longitude 116.9. The dataset is produced by the World Food Programme's Automated Disaster Analysis and Mapping (ADAM) system for humanitarian emergency response. It was last updated on May 21, 2026.
A dataset containing genetic mutation contexts and derived allele counts from 1,000,000 haplotype samples. It includes columns for mutation type, ancestral and changed trinucleotide contexts, CpG site methylation levels, and derived allele counts. The associated mut_rates.csv file provides point mutation rate estimates for each unique context triplet, with analysis code available on GitHub.
Daily data from the MODIS instrument on NASA's Terra satellite provides geolocation fields for each 1 km sample across all orbits. The dataset includes geodetic latitude, longitude, surface height, solar and satellite viewing angles, and a land/sea mask, calculated using spacecraft attitude, orbit, and a digital elevation model. This Level 1A product is a foundational input for numerous downstream MODIS land and atmosphere data products.
SPURS-2 field campaign data provides drop counts as a function of size for raindrops larger than 0.44 mm in diameter, collected from a ship in 2016 and 2017. The dataset includes associated rain rates and liquid water content estimates, supporting the study of near-surface salinity processes. Measurements come from an ODM-470 disdrometer, which has a known limitation: it does not detect drops smaller than 0.44 mm.
U.S. National Ice Center Arctic Sea Ice Charts and Climatologies in Gridded Format provides monthly statistical summaries of Arctic sea ice concentration derived from operational ice charts. The dataset covers a 35-year period from 1972 through 2007, offering median, quartile, and frequency of occurrence values on a 25-km EASE-Grid. This product is superseded by a newer version with data from January 2003 onward.
VIIRS/NPP BRDF/Albedo Quality Daily L3 Global 500 m SIN Grid NRT (VNP43IA2N) Version 1 provides quality metrics for Bidirectional Reflectance Distribution Function (BRDF) and albedo at a 500-meter spatial resolution. The product is generated daily using a 16-day rolling window of VIIRS data and employs the RossThick/Li-Sparse-Reciprocal (RTLSR) kernel-driven BRDF model. Its 11 Science Dataset layers support the calculation of black-sky and white-sky albedo, enabling correction for surface anisotropic effects.
Global daily land surface data provides Bidirectional Reflectance Distribution Function (BRDF) and Albedo quality metrics at a 1 km resolution. NASA produces this product using 16-day rolling windows of observations from the VIIRS/NPP satellite, applying the RossThick/Li-Sparse-Reciprocal kernel-driven model. The dataset contains 23 Science Dataset layers for moderate resolution bands, including quality flags, observation days, and land cover classifications.
Surface flux measurements from a single upland site during the 1987 FIFE study. Data were collected daily from June 26 to October 17, 1987, using eddy correlation and Bowen ratio techniques to determine sensible heat, latent heat, and carbon dioxide fluxes. The dataset provides direct observations of turbulent diffusive fluxes above a grazed prairie surface.
Brightness temperature measurements from the Hurricane Imaging Radiometer (HIRAD) instrument, a passive C-band microwave radiometer flown on NASA aircraft. The dataset captures observations of Hurricanes Earl and Karl during the Genesis and Rapid Intensification Processes (GRIP) experiment in September 2010. Its primary goal was to measure ocean surface winds through heavy rain to study tropical storm formation and intensification.
195 countries are covered in the full AI readiness data, sourced from the 2025 Government AI Readiness Index. The matched panel for predictive modeling contains AI readiness and Sustainable Development Goal (SDG) scores for 167 countries, based on the 2025 Sustainable Development Report. Pureheart Ogheneogaga Irikefe prepared this dataset in 2026 to support research on AI-SDG relationships.
180 rights-cleared short scripted clips form a benchmark for subtitle translation and localization workflows. The package contains 540 timestamped source subtitle segments, 1,080 aligned translation rows, and 540 SRT files across English, Spanish, and Chinese (Simplified). Authored by imaz regi for the AI Translate Video project, this synthetic dataset was last updated in May 2026.
Continental US tidal wetland data provides a 30-meter resolution vertical resilience (VR) index for 2100 under three IPCC RCP scenarios (2.6, 4.5, 8.5). The dataset includes future tidal area projections, sediment accretion rates, and land cover classifications for 1996 and 2011. It originates from ORNL_CLOUD and is available on multiple government data platforms.
A list of active for-hire vehicles authorized by the New York City Taxi and Limousine Commission (TLC), updated daily between 4 and 7 PM. The dataset includes fields for vehicle identification, agent information, and status, such as DMV License Plate Number, Vehicle VIN Number, Agent Name, and Current Status. It serves as an official registry for medallion taxis and limousines operating in New York City.
Medallion Drivers - Active is a daily-updated list of New York City Taxi and Limousine Commission (TLC) medallion drivers who are currently active and in good standing. The dataset includes driver names, license types, and expiration dates, with timestamps for the last update. It is published by the City of New York and is available on multiple government data platforms.
515 anisotropy of magnetic susceptibility (AMS) measurements from a 724.1-meter stratigraphic section in the Western Sichuan foreland basin. The data, collected by Lifu Hou, spans an interval from approximately 128 to 64 million years ago, constrained by a published magnetostratigraphic age model. It underpins time-resolved analysis of magnetic fabrics and shortening directions in the Longmen Shan tectonic belt.
A retrospective comparison of 100 treatment-naïve hypertensive patients assesses the concordance between ChatGPT-4's antihypertensive recommendations and real-world physician prescriptions. The study, authored by Kadri Murat Gürses and posted on figshare in 2026, found substantial agreement (κ = 0.67) and an association between concordant treatment and higher rates of short-term blood pressure control.
A 22.4 MB collection of supplementary materials for a statistical testing procedure for functional time series. The methodology was applied to Canadian yield curve data and French sub-national age-specific mortality data, with findings suggesting integration orders of one or fractional. Author Won-Ki Seo published the materials on figshare under a CC-BY-4.0 license in June 2026.
A 97.7 KB PDF research paper authored by Eduardo Y. Sakabe, last updated on 2026-05-29. The paper investigates the learning dynamics of binarized neural networks (BNNs) through the lens of algorithmic information theory and the Block Decomposition Method (BDM). It proposes a framework for complexity-aware learning and regularization, supporting the view of training as a process of algorithmic compression.
Ruolan Xiong developed a machine learning model for storm surge forecasting using data from 42 tropical cyclones between 2000 and 2023. The dataset likely contains model outputs and inputs, including tropical cyclone parameters and observed surge data from nine stations in the Pearl River Estuary. It was last updated on 2026-05-19 and is shared under a CC-BY-4.0 license.