Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,578 datasets
Geospatial data from the Government of Ontario lists certified examiners authorized for non-emergency on-farm slaughter under Ontario's Meat Regulation. The dataset includes KMZ files for Google Earth and shapefiles for GIS software. It was last updated on April 17, 2026.
The Water quantity vulnerability indicators dataset estimates a vulnerability index for Water Survey of Canada HYDAT gauges in the Southwestern Hudson Bay and Nelson River watershed systems in Ontario. The Provincial Mapping Unit of the Ministry of Natural Resources has produced this series of hydrology and climate statistics since 2010. These values are intended for land use planning, water quantity management, and climate change studies.
A three-week field program in 2012 carried out mapping and collected geochemical samples and phosphate minerals from the Rapid Creek Formation in Yukon. The formation is a phosphorite-rich ironstone facies with alternating phosphate-rich and siderite-rich mudstones. Secondary minerals from veins and nodules include rare phosphate species such as apatite, augelite, and arrojadite-group minerals.
From September 2018 to present, this dataset provides daily binary inland surface water classification at a 0.01-degree (~1 km) resolution. It is derived from the CYGNSS satellite constellation using the UC Berkeley Random Walk Algorithm (Berkeley-RWAWC), generating maps with an approximate 6-day latency. The product is recommended for operational use, with a separate monthly version suggested for science applications.
A preregistered survey experiment in China provides evidence on how enforcement gaps in pro-labor legislation shape mass political attitudes. The research includes a follow-up study incorporating a list experiment to detect preference falsification. The data, authored by Hsu Yumin Wang and associated with Public Opinion Quarterly, was last updated on June 10, 2026.
2024-2025 student population data from the Colombian Institute of Classical Ballet (INCOLBALLET). The dataset contains general demographic and socioeconomic information for students, published by datos.gov.co. It was last updated on 2026-05-18.
VIIRS/NPP imagery provides high-resolution swath data at 375 meters for five spectral bands spanning visible to thermal wavelengths. Each product file contains sensor data for a six-minute acquisition window from the Suomi NPP satellite overpass. The data is structured in NetCDF files with a naming convention that encodes the acquisition year, day, hour, minute, and collection number.
Washington law requires entities to report data breaches affecting more than 500 residents to the Attorney General's Office. This dataset details the specific types of personal information compromised in each reported breach, supporting the AGO's Annual Data Breach Report. It is maintained by data.wa.gov and was last updated in April 2026.
Between May 29 and June 19, 2017, benthic sediment sampling was conducted in inner Darwin Harbour and shallow water areas in and around Bynoe Harbour. This dataset comprises area-based rates of total carbon dioxide production and oxygen uptake in marine sediments, collected as part of a four-year science program from 2014 to 2018. The project was led by the Northern Territory Government and involved partners Geoscience Australia, the Australian Institute of Marine Science, and the Department of Environment and Natural Resources.
Sediment oxygen demand measurements were collected from seabed sediments in inner Darwin Harbour and shallow waters around Bynoe Harbour. The surveys were conducted between May 29 and June 19, 2017, as part of a four-year science program from 2014 to 2018. Partners included Geoscience Australia, the Australian Institute of Marine Science, and the Northern Territory Government's Department of Environment and Natural Resources.
Global data for 151 man-made reservoirs and 13 regulated natural lakes. The National Aeronautics and Space Administration product provides monthly reservoir area, elevation, storage, and evaporation rates derived from MODIS satellite imagery and meteorological models. The dataset was last updated on March 12, 2026.
30 daily shards of News Crawl data from the first half of 2020, used for reproducing continual-learning experiments in an ACL 2025 paper. The dataset includes a GPT-2 small checkpoint and was uploaded by viniferagy. It was last updated on 2026-06-18.
The Terra MODIS Water Reservoir Monthly Level 3 product provides monthly data for 164 global water bodies, including 151 man-made reservoirs and 13 regulated natural lakes. It is produced by NASA and contains composite data on reservoir area, elevation, storage, and evaporation rates derived from MODIS imagery and meteorological models. The dataset's latest version is 6.1, with a metadata update noted for 2026-03-12.
More than 1,000 expert-curated samples across 10 scientific disciplines probe the Scientific General Intelligence of LLMs. SGI-Bench evaluates models across the full inquiry cycle (Deliberation, Conception, Action, and Perception) and is inspired by Science's 125 Big Questions. It was created by InternScience and last updated in June 2026.
The Netherlands' central government procurement expenditure data for 2023, aggregated from all ministries and their subordinate departments. The Ministry of the Interior and Kingdom Relations collects this information annually to provide cross-government procurement insights and report to parliament on public spending and SME participation. The dataset is published under a CC0 1.0 license.
Full Load Equivalent (FLE) enrollment statistics for Alberta's publicly funded post-secondary institutions. The data tables cover approved programs and include separate files focusing on system-level totals, international learners, and self-identified Indigenous learners. The dataset is provided by the Government of Alberta and was last updated on April 17, 2026.
Broadhectare Residential Land 2019 identifies undeveloped land for residential development on metropolitan fringes. The database details subdivision and planning status, site area in hectares, and potential lot yield for each record. It was produced by the Department of Transport and Planning and last updated in April 2026.
Statistics on registrations and persons covered for non-group supplementary health plans in Alberta. The data likely contains counts by coverage category and premium payment level. This table is an Excel version from the annual Alberta Health Care Insurance Statistical Supplement report published by Alberta Health.
A table of statistics on the ten highest prescription drug expenditures by net payment and coverage category for non-group supplementary health plans in Alberta. The data is published annually by Alberta Health in the Alberta Health Care Insurance Statistical Supplement report. Supplementary health plans are funded by Alberta Health and administered by Alberta Blue Cross.
Water surface elevation measurements were collected via AirSWOT's Ka-band radar interferometry during the 2021 Delta-X campaign. The dataset provides point-based elevation data for the Atchafalaya and Terrebonne basins in coastal Louisiana, processed to Level 3 by filtering and averaging heights. It is produced by the ORNL_CLOUD organization and was last updated in March 2026.