Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
41,943 datasets
Yoann Poulet created a reference dataset of geometrical and inertial properties for manual wheelchairs, published on figshare in 2026. It contains measurements for 61 manual wheelchairs of different types, 18 rear wheels, 12 caster wheels, and 12 caster forks. The dataset includes parameters like mass, 3D center of mass location, and yaw mass moment of inertia at both system and component levels.
A collection of raster datasets from Natural Resources Canada capturing temporal patterns of historic flood events from 2000 to 2023. The layers include extreme wet and dry year analyses, trend slopes, susceptibility envelopes, and current estimates to support decision-making for planning and management. Data is provided in formats including WMS, HTML, and JSON.
A 2026 dataset from the Registers of Scotland contains the location of ownership polygons at ground level in Scotland, produced to comply with the INSPIRE Directive. Each polygon shows the position and indicative extent of ownership for a registered property and is linked to a unique INSPIRE ID. The dataset is a subset of the Cadastral Map and does not establish the full extent of rights contained within a registered title.
Registers of Scotland maintains this dataset of cadastral parcels to comply with the INSPIRE Directive. It contains polygon shapes representing the position and indicative extent of surface ownership for each registered property in Scotland, with each parcel linked to a title via a unique INSPIRE ID. The data is organized by the 33 Registration Counties and is provided in ESRI Shapefile format.
4.4 GB of multimedia resources support a 2025 BioRxiv study on subjective color perception. Eiji Watanabe and colleagues published test images, training videos, code, and initial model weights for their artificial neural network models. These models generate artificial subjective color to investigate the underlying mechanisms of color perception illusions.
Approximately 100,000 sample records detail seabed sediment characteristics from Australia's marine jurisdiction, including the Australian Antarctic Territory. The database, managed by Geoscience Australia, includes analytical properties such as grain size, carbonate content, mineralogy, geochemistry, and age determinations. New data are added as they become available.
Experimental data on covalent inhibitors designed to combat New Delhi metallo-β-lactamase-1 (NDM-1) antibiotic resistance. The dataset includes IC50 values for inhibitory activity, such as 3.27 μM for compound 16a, and meropenem MIC fold-change data from bacterial isolates and a mouse infection model. It was authored by Wandong Liu and uploaded to figshare on 2026-05-15.
31 tables of greenhouse gas emissions and removals estimates compiled by USDA Forest Service researchers from Nationwide Forest Inventory data. These data provide 1990-2024 estimates of carbon net flux and 1990-2025 estimates of carbon stocks for the United States and its territories. The estimates are intended for sub-national reporting and further analysis, incorporating new data and methods into the entire time series.
The H.S. Bostock Core Library repository contains drill core and rock samples from Yukon mineral properties and geological mapping programs. This dataset provides collar coordinates and associated metadata for the sample collection points. The data is published by the Government of Yukon under an open license.
Derived bathymetric isobaths (depth contour lines) generated from the Geoscience Australia AusBathyTopo 250m 2024 gridded dataset. The dataset includes four contour products at intervals of 100m, 5m, 1m, and a multi-resolution composite, clipped to the Australian Exclusive Economic Zone. It is provided by the Australian Ocean Data Network and was last updated in April 2026.
Laser ablation ICP-MS data reveal concentrations of precious and semi-metal trace elements in Fe-Ni-Cu sulfide minerals from the lower crust. The data were gathered between 2019 and 2021 at Cardiff University as part of the FAMOS project. More information is available in an open access paper by Holwell et al. (2022) in Nature Geoscience.
Hiromu Tanimoto's dataset from 2026 contains validation results for commercial antibodies targeting the Leukocyte immunoglobulin-like receptor (LILR) family. The data likely includes specificity assessments for 11 LILR members and results from a novel sandwich ELISA developed to detect the soluble LILRA3 protein. The findings highlight cross-reactivity issues and provide new monoclonal antibodies for LILRA3 detection.
Tropical islands in the study region host two frog species, Pristimantis urichi and P. charlottevillensis. The dataset contains supplementary tables from a 2023-2024 study using passive acoustic monitoring at ten sites to analyze how air temperature affects call rate and duration. Author Renoir Auguste published the data under a CC-BY-4.0 license in 2026.
Three sediment cores from Nara Inlet reveal a 3000-year record of mixed clastic and carbonate accumulation on the central Great Barrier Reef middle shelf. The top 3 meters of sediment, where carbonate content is 25.80% by weight, accumulated within the last three millennia. Data from Geoscience Australia indicates a system where both sediment types have decreased over time, with clastic input declining faster.
48,078 audio-text pairs totaling 89.63 hours of Hindi speech, collected from YouTube using auto-generated captions for transcription. The dataset is heavily skewed, with one speaker contributing 76.9% of the content. Created by user somu9 and last updated on June 8, 2026.
17 occupational therapists scored the potential usefulness of a new rear anti-tip device (Arc-RAD) across 20 scenarios. The dataset includes quantitative survey scores and qualitative discussion summaries from a mixed-methods study. R. Lee Kirby authored the study, which was published on figshare in April 2026.
Middle Pleistocene zircon (U-Th)/He ages, with a weighted mean of 0.63 ± 0.19 Ma, constrain the timing of coal seam combustion and combustion metamorphic rock formation in the Junggar Basin. The dataset, created by Bin Chen and last updated in May 2026, includes three tightly clustered ages and two older single-grain ages interpreted as pre-combustion components. It is a small dataset (5.5 KB) stored in an XLS file format.
Simulation and analysis data for spin models with 6 and 8 sites, supporting a manuscript on quantum dynamics. The dataset includes raw data, Python scripts, and figure files. It was authored by Feng Zhang and last updated on 2026-05-19.
14,739 code smell annotations from 522 repositories in the MLCQ benchmark, alongside 1,840 developer evaluations and 40 interview transcripts. Zijie Huang published this replication package on figshare in April 2026 under a CC-BY-4.0 license. The package includes datasets, source code, and scripts for reproducing research on automated code smell detection.
Karla Gonzalez's replication package for the MSR 2026 paper 'When Bots Get the Boot' contains data and scripts to reproduce empirical results on agent-authored pull request rejections. The 133.8 MB package includes a curated dataset of pull requests labeled by outcome, derived feature sets, and analysis scripts. It was last updated on May 6, 2026.