Loading...
Loading...
General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites
166,603 datasets
NASA's AVIRIS Images dataset contains hyperspectral imagery acquired in 1992 over ACCP sites. Pixels coinciding with field plots were extracted to correlate spectral reflectance with estimated canopy carbon and nitrogen content. The data supports research into vegetation biochemistry and remote sensing model development.
New South Wales land cover and management data collected monthly throughout 2024. The dataset is provided by the NSW Department of Climate Change, Energy, the Environment and Water and was last updated in May 2026. It likely contains satellite-derived imagery and classifications of vegetation and land use.
From May 2013 to November 2018, sensors collected bioaccumulation data on oysters in Rodds Bay as part of the Port Curtis Integrated Monitoring Program. The data was collected by the Australian Ocean Data Network. The dataset is hosted on the data.gov.au platform.
Colombian National Police schools' performance on the state ICFES test from 2017 to 2020. The dataset includes the categorical classification (A+ to D) and national ranking position for each school for each year. It was published on the Colombian open data portal, datos.gov.co, and last updated in May 2026.
Monthly Hillslope Cover Erosion (t.ha-1.month-1) over New South Wales for 2007. The dataset is provided by the NSW Department of Climate Change, Energy, the Environment and Water. It was last updated on 2026-05-18.
An inspection report for selecting a domestic water bore site on Bathurst Island. The document describes a low, flat laterite plateau over Cretaceous mudstone, noting the mudstone is at least 1,000 feet thick and a poor aquifer. Recommendations include deepening existing wells and drilling radial horizontal holes to intersect water-bearing cavities in the laterite, which is slightly more than ten feet thick in most places.
Montreal's data set contains service requests, complaints, and comments submitted to the city via the 311 call system, email, and Accès Montréal Offices. The data represents all submitted requests, regardless of whether they resulted in a city intervention. It is published by the Government and Municipalities of Québec under a CC-BY-4.0 license and was last updated on April 17, 2026.
From December 2006 to June 2014, sensors deployed in Port Curtis collected sediment data as part of the Port Curtis Integrated Monitoring Program (PCIMP). The data originates from Zone 08, Mid Harbour, and is hosted by the Australian Ocean Data Network. The dataset's specific measurements and volume are not detailed in the available metadata.
2006 monthly soil erosion rates measured in tonnes per hectare per month across hillslopes in New South Wales. The dataset was produced by the NSW Department of Climate Change, Energy, the Environment and Water and is available in PDF and GEOTIFF formats.
Unbiased Ventures developed a transparent methodology for scoring startup pitch decks. The framework evaluates decks across 8 dimensions with stage-aware weighting and benchmarks against 6,586 companies. The dataset documents this methodology, last updated in June 2026.
Australia's marine geology and phosphorite resources are the focus of this legacy report from the Australian Ocean Data Network. The document likely contains evaluations and recommendations for a national marine geology program. It was last updated on 2026-06-16.
Individual Monthly Hillslope Cover Erosion (t.ha-1.month-1) over New South Wales for 2012. The data is provided by the NSW Department of Climate Change, Energy, the Environment and Water and is available in PDF and GEOTIFF formats.
BC Parks and Protected Areas established from 1911 to 2011, including year of establishment and size in hectares. The dataset supports the 'Trends in Ecosystem Protection' indicator published by Environmental Reporting BC. It is provided by the Government of British Columbia.
EPM Aguas Nacionales and the CREG share indices of classified and reserved information as stipulated by Colombia's Transparency and Access to Information Law 1712 of 2014. The datasets likely contain records detailing the legal classification status of public information, including responsible parties, justification, and duration. These indices serve as instruments for public information management.
The BC Address Geocoder is a REST API provided by the Government of British Columbia. It resolves physical locations of addresses and place names in British Columbia to latitude and longitude coordinates. The service also offers address correction, reverse geocoding, intersection location, and parcel identification.
Indicators track the progress of the 2020-2023 indicative plans for management reporting by sectoral entities in the Department of Boyacá, Colombia. The data includes partial progress up to November 2023 and was supplied by the Secretariat of Planning. The dataset was last updated on the platform in May 2026.
From 1911 to 2011, this dataset tracks the area coverage of biogeoclimatic zones within established British Columbia Parks and Protected Areas. It contains the results reported by Environmental Reporting BC in a 2012 indicator summary. The data is provided by the Government of British Columbia.
Historical public data from Deutsche Bahn, the largest train company in Germany. The dataset includes monthly processed files containing train schedules, delays, and cancellations from stations across the country. It is maintained by the author 'piebro' on Hugging Face, with a last recorded update in June 2026.
Recent marine sedimentation on the continental shelf south of Lae, New Guinea is a dataset published by the Australian Ocean Data Network on data_gov_au. The dataset likely contains information about sediment deposits in a specific marine region. No abstract or detailed metadata is available, as it is listed as a legacy product.
Warden-01 is a manually curated dataset of 1,500 penetration testing sessions for training autonomous bug bounty hunting agents. It was created by author yamura4 and last updated on June 18, 2026. The dataset is structured in OpenAI SFT format, containing messages for system, user, assistant, and tool roles.