Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
42,781 datasets
Julia Neumann's research dataset from 2026 provides molecular-scale structural data for the electrical double layer at a single-crystal alumina–water interface. It combines experimental X-ray reflectivity and streaming potential measurements with ab initio molecular dynamics simulations across a pH range of 3 to 12. The data includes adsorption heights for water layers and rubidium ions, linking surface protonation to ion binding distances.
OSNI provides a polygon dataset of Standard Electoral Wards set in 1993. The data has been extracted from the OSNI Largescale database, topologically cleansed, and attributed to create a seamless dataset. This service is published for OpenData by the Government Digital Service.
Static (bathtub) inundation modelling was carried out in 2016 for a 1% Annual Exceedance Probability coastal inundation scenario assuming 1.1 m of sea level rise. The data covers 14 study areas along the coast of the Bellarine Peninsula and Greater Geelong, including Avalon, Breamlea, and Portarlington. The dataset was created by the Department of Energy, Environment and Climate Action and is available on data.gov.au.
Colombia's Superintendency of Transport consolidates port traffic data by product and port society across the country's port zones, with records of imports and exports of articles and products. The dataset includes monthly records starting from January 2018. The published information is referential, and only the Superintendency of Transport can certify the data for each port society.
Browse-only satellite data files from the Hurricane and Severe Storm Sentinel (HS3) field campaign, with temporal coverage from April 22, 2013 to September 30, 2014. The dataset includes JPG images containing brightness temperature, rain rate, and RGB composite imagery. The campaign's primary scientific goals were to assess the roles of large-scale environmental processes, the Saharan Air Layer, and deep convection in tropical storm formation and intensification.
Geoscience Australia's National Base Map provides seamless topographic colour mapping for Australia and its external territories, sourced from GEODATA TOPO 250K, OpenStreetMap, and ACLUMP vegetation data. Road data beyond 1:100,000 scale is derived from OpenStreetMap, while suburbs are sourced from Australian Bureau of Statistics data from 2018 and 2020. The topographic information was checked in 2008 using satellite imagery and supplemented in 2009, with limited field checking noted.
Japanese conversational speech recorded using mobile devices in real-world settings. The dataset, created by MagicHub, has a total duration of 10 hours and was last updated on June 15, 2026. It is designed to capture the interactive nature of everyday communication.
A Wikipedia Markdown knowledge corpus constructed around the SciCode benchmark. The dataset consists of two parts: an original seed concept set derived from SciCode seed cards and an expanded concept set generated by a language model. The dataset was created by dabingzz and last updated on June 10, 2026.
Fossil sand dune samples from the Zarzur Formation in the Fazzan region of Libya. The dataset includes magnetostratigraphy dating, particle size, surface micromorphology, sand provenance, and heavy mineral data across 9 Excel sheets. Mark W Hounslow authored this 1.1 MB dataset, last updated in May 2026.
Digital vector boundaries for all Local Planning Authorities in the United Kingdom as of January 2026. The data is provided by the Office for National Statistics and contains both Ordnance Survey and ONS Intellectual Property Rights. Boundaries are clipped to the coastline at the Mean High Water mark.
A slim, flattened derived version of the Natural Questions dataset for short-answer question answering experiments. The dataset, created by BOB12311, removes original document HTML and long answer candidates to provide simple question-answer pairs. It was last updated on the platform on 2026-05-17.
Parameters for the nonlinear function g_h referenced in Eq (37) of a PLOS ONE article. The 5.5 KB dataset was authored by Phoebe Smith and last updated on June 1, 2026. It is available in XLS format under a CC-BY-4.0 license on figshare.
The Bellarine-Corio Bay Local Coastal Hazard Assessment (LCHA) provides a static (bathtub) inundation model for a 1% Annual Exceedance Probability coastal flood event, assuming a 0.5-meter sea level rise. It was created in 2016 for 14 study areas along the Bellarine Peninsula and Greater Geelong coast. The dataset is provided by the Department of Energy, Environment and Climate Action and is available in multiple geospatial formats.
David Sabin-Miller's 2026 study compares internal and external measures of political ideology. The dataset likely contains survey responses for self-identification, policy-stance agreements, and external assessments of political opinion statements. This 1.6 MB ZIP file supports the use of public responses as meaningful, comparable quantities projected onto a one-dimensional ideological domain.
93 point observations of vertical motion rates, compiled from 13 published sources, were harmonized and interpolated to create raster surfaces for the Sunda Shelf. The dataset, created by Nina de Munck, includes a structural geology framework and two alternative motion scenarios. It was last updated on 2026-05-28.
Fluorescence lifetimes recorded from 10 simulation runs per condition, each for 100 million steps. The dataset, created by Vincent Ebert and last updated in May 2026, contains results from a computational model where parameters like fluorophore count and energy transfer rates were systematically adjusted. It is a small dataset of 5.5 KB, stored in an XLS file.
Total number of larval mosquitoes collected across Cambodia by species and breeding habitats. The dataset distinguishes between anthropized artificial habitats, anthropized natural habitats, and natural habitats, and marks known disease vectors. It was authored by Bros Doeurk and last updated on 2026-05-18.
24 healthcare professionals across three nursing homes participated in two consecutive focus group sessions investigating the potential value of social robots in daily care. The dataset likely contains qualitative data from these discussions, including expected benefits and barriers to implementation. Sanja Balalic published the data on figshare in May 2026.
Formulas for generalized linear mixed models with binomial or negative binomial distributions, assessing plot type effects on species presence or activity. The dataset, authored by Chloé Tavernier and last updated in June 2026, contains the model specifications, with models run separately for each species. Location was incorporated as a random variable in the analysis.
Static (bathtub) inundation modelling was carried out for 14 study areas along the Bellarine Peninsula and Greater Geelong coast. The dataset models the 1% Annual Exceedance Probability (AEP) coastal inundation under the assumption of 0 meters Sea Level Rise in 2016. It was created by the Department of Energy, Environment and Climate Action as part of the Bellarine-Corio Bay Local Coastal Hazard Assessment (LCHA).