General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
168,281 datasets
EOS Aura Microwave Limb Sounder monthly binned nitric acid data provides near-global atmospheric profiles from August 2004 to the present. The data is derived from 240 GHz and 190 GHz radiometer measurements, with a spatial resolution of 4° latitude by 5° longitude and a useful vertical range from 215 to 1.47 hPa. This product is archived in netCDF4 format by NASA's Goddard Earth Sciences Data and Information Services Center.
OTTER project data contains field measurements of tree dimensions, including height, crown width, DBH, and height-to-crown distance. The data was collected using variable-radius plot sampling along transects with a steel tape and compass. Columns suggest a focus on quantifying forest structure and biomass.
Southern Ocean data from a 1976 research cruise over the magnetic quiet zone south of Australia. The dataset is an observer's report from the Vema cruise 33 leg 3, conducted between 20 January and 19 February, 1976. It is a legacy product published via the Australian Ocean Data Network with no abstract available.
Agent execution trajectories from a large-scale agentic evaluation. Each trajectory captures a single (agent, model) attempt at a task, including step logs, tool calls, model outputs, and verifier scoring. The dataset was authored by kendx and last updated on 2026-06-08.
ML3MBHNO3 is a NASA EOS Aura Microwave Limb Sounder (MLS) dataset providing monthly binned nitric acid (HNO3) mixing ratios. Data coverage spans from August 2004 to the present, with near-global spatial coverage from -82 to +82 degrees latitude. The data is archived in netCDF4 format and includes profile and column data on assorted vertical grids.
Temperature and salinity were sampled by CTD profile at Station 9 in the Derwent Estuary between August 2012 and January 2013. The data is provided by the Australian Ocean Data Network and was last updated in June 2026. It includes profile data likely captured at regular intervals over the six-month period.
Official determination of the circumstances causing human death, which can be recorded on a death certificate. The dataset includes columns for Age, Cause of Mortality, and Gender. It is hosted by www.datos.gov.co and was last updated on 2026-05-18.
August 2012 to January 2013 samples of temperature and salinity from CTD profiles measured at Station 2 in the Derwent Estuary. The data is provided by the Australian Ocean Data Network and is available in multiple formats including NetCDF.
LockerNYC locations represent a pilot program offering free 24/7 access to secure public lockers for package delivery and pickup in New York City. The dataset is hosted by data.cityofnewyork.us and was last updated on 2026-05-13. It includes locker attributes and their geographic coordinates.
SeaWiFS, launched in August 1997, collected global ocean color data for over 13 years until December 2010. The instrument on the OrbView-2 satellite had 8 spectral bands and was optimized for ocean measurements with features like fore-and-aft tilt and low polarization sensitivity. This dataset from NASA provides binned Photosynthetically Available Radiation (PAR) measurements at 4 km resolution.
Incauciones de Basuco records the kilograms of basuco, a cocaine-based drug, seized by the Colombian Public Force. The dataset includes columns for municipality (MUNICIPIO), department (DEPARTAMENTO), date of seizure (FECHA HECHO), and quantity (CANTIDAD). It is published by www.datos.gov.co and was last updated on 2026-05-19.
Socrata dataset from datos.gov.co lists official campuses of non-certified municipalities in Colombia's Valle del Cauca department. Columns suggest details on campus names, addresses, operational status, and precise latitude/longitude coordinates. The dataset was last updated on May 18, 2026.
NovelPrompts is an English safety dataset built to test LLM judges' behavior when evaluating prompt safety. It contains 194 prompts and completions designed to require understanding of novel concepts appearing after July 2025. The dataset was created by anissa218 and last updated on Hugging Face in June 2026.
Valle del Cauca, Colombia, provides cadastral map data for land parcels under the UAEC jurisdiction. The dataset includes geographic information in Shapefile format, with urban areas mapped at 1:500 or 1:1000 scale and rural areas at 1:5000 or 1:10000 scale. It was last updated on 2026-05-18 and is hosted by the Colombian open data portal.
ReefCloud public data supports a predictive model for annual hard coral cover across the central Great Barrier Reef. The model aggregates monitoring observations into 5 x 5 km hexagonal units and incorporates exposure to heat stress and tropical cyclones. This dataset is provided by the Australian Ocean Data Network for reproducing a use case from a 2026 paper.
Henry D. Kalter's dataset, last updated on 2026-05-22, examines associations between perceived illness severity at onset and outcomes for fatal illnesses in neonates and infants aged 1-11 months. The 17.5 KB XLS file likely contains tabular data on age at death and formal care-seeking behaviors for illnesses that began in community settings. Its specific focus is on the initial perception of severity and its relationship with mortality and healthcare utilization.
A figshare dataset by Brent D. Mishler, last updated May 22, 2026. It contains counts of highly endemic branches for three categories (neo-endemic, meso-endemic, paleo-endemic) across seven comparison datasets and a primary Acacia dataset. The data is stored in a 5.5 KB XLS file.
Germany's land cover data from the CORINE Land Cover 5ha (CLC5) project for 2015. The dataset is transformed for the INSPIRE theme Land Cover and provided via a Web Feature Service (WFS) by the Bundesamt für Kartographie und Geodäsie.
Descriptive statistics of positional Shannon entropy within viral genotypes. The dataset includes metrics such as the number of analyzed positions, mean, median, standard deviation, quantiles, and conservation index. It was authored by Alan López Leal and last updated on May 15, 2026.
Gene-level entropy summaries derived from consensus alignments across Human Papillomavirus genotypes. The dataset includes mean entropy, median entropy, interquartile range, median absolute deviation, and percentages of conserved and highly variable positions. Author Alan López Leal published the 9.8 KB XLSX file on figshare under a CC-BY-4.0 license, last updated on 2026-05-15.
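The two entropy entries above describe per-position Shannon entropy and summary statistics over it. A minimal sketch of how such values might be computed from an aligned set of sequences follows. It is not the authors' pipeline: the function names, the toy alignment, and the rule that a position counts as "conserved" when its entropy is at or below a threshold (here 0 bits) are all assumptions made for illustration.

```python
import math
import statistics
from collections import Counter


def positional_entropy(sequences):
    """Return the Shannon entropy (in bits) of each column of an alignment.

    `sequences` is a list of equal-length strings (hypothetical input format).
    """
    entropies = []
    for column in zip(*sequences):
        total = len(column)
        counts = Counter(column)
        # H = sum over symbols of p * log2(1 / p)
        h = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(h)
    return entropies


def summarize(entropies, conserved_threshold=0.0):
    """Descriptive statistics over per-position entropies.

    The conserved-position rule (entropy <= threshold) is an assumption,
    not a definition taken from the datasets.
    """
    values = list(entropies)
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    mad = statistics.median([abs(v - median) for v in values])
    conserved = sum(1 for v in values if v <= conserved_threshold)
    return {
        "n_positions": len(values),
        "mean": statistics.fmean(values),
        "median": median,
        "iqr": q3 - q1,
        "mad": mad,
        "pct_conserved": 100.0 * conserved / len(values),
    }


# Toy alignment: four sequences, four positions (made-up data).
alignment = ["ACGT", "ACGA", "ACTA", "ACTA"]
entropies = positional_entropy(alignment)
summary = summarize(entropies)
print(entropies)
print(summary)
```

In this toy alignment the first two columns are invariant, so their entropy is 0 bits and half of the positions count as conserved under the assumed rule. A column split evenly between two symbols has an entropy of exactly 1 bit.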