Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
39,937 datasets
A dataset containing results from a numerical simulation of magnetohydrodynamic (MHD) flow of a second-grade fluid over a vertical surface. The data was generated by Ruchika Mehta using the bvp4c solver and includes predictions from a Multiple Linear Regression model for engineering quantities. The dataset was last updated on June 4, 2026.
The Harris Greenstone Domain is a late Archean-Proterozoic terrane in the centre of the Gawler Craton in Australia. This map, produced as a GeoPDF by the Australian Ocean Data Network, visualizes the domain's structure based on interpretation of aeromagnetics, gravity, and diamond drillcore. Layers can be toggled for customized views, and the map is georeferenced for compatibility with other geographic data.
NOAA-20 VIIRS Land Surface Temperature and Emissivity (LST&E) 8-day product (VJ121A2) is a Level 3 composite dataset at 1-kilometer resolution. It merges daily daytime and nighttime acquisitions over an 8-day period into a single HDF file containing 11 science datasets, including LST, quality control, view angle, observation time, and emissivity for three spectral bands. The product is algorithmically aligned with the MODIS MOD21A2 dataset to ensure continuity within NASA's Earth Observation System.
A 5.5 KB dataset from figshare, last updated May 22, 2026, by author Zhe Jing. It contains results related to the generalization performance of the RPINet model on the Toronto-3D dataset, likely involving semantic segmentation metrics. The dataset is shared under a CC-BY-4.0 license.
This dataset comes from a 12-week, single-arm pilot feasibility study conducted across 11 community pharmacies in South Korea. It contains general characteristics for 30 adults with suboptimally controlled type 2 diabetes who participated in a program using continuous glucose monitoring integrated with digital health platforms. Author Kyung-In Joung published the data on figshare under a CC-BY-4.0 license.
A combined dataset used to evaluate a hybrid deep learning framework for Intelligent Transportation Systems. The data includes traffic volume from the Metro Interstate and accident data from Barcelona, as referenced in the description. The dataset was authored by Mohammed Saad Javeed and last updated on 2026-05-29.
This 4,406,450,167-byte (about 4.4 GB) dataset from the Bay Area Energy Atlas links PG&E metered energy consumption to building characteristics and sociodemographic data. It provides monthly energy consumption statistics from 2015 to 2021, aggregated at county, city, census tract, and zip code levels. Data is further aggregated by building use type, vintage, size, area median income, and CalEnviroScreen percentile range.
Quebec's vector grid system provides a spatial and statistical infrastructure for integrating environmental and socio-economic data. The Institut de la Statistique du Québec (ISQ) maintains this system, which includes a 1 km² grid covering all of Quebec and a 50 m side grid for areas south of the 52nd parallel. The dataset was last updated in April 2026.
A 2.3 KB dataset from figshare, authored by Xin Tang and last updated in May 2026, describes the preclinical evaluation of a BRD9-targeting PROTAC compound. The data likely contains results from in vitro cell line assays and in vivo xenograft models, including IC50 and tumor growth inhibition rates. The findings support the compound's advancement as a therapeutic candidate for acute myeloid leukemia.
A 64.1 MB replication package for research on code smell detection, last updated on 2026-05-28. It includes raw and processed datasets, pre-computed code embeddings from models like CodeBERT and UniXcoder, and scripts to reproduce experimental results. The package is authored anonymously and shared under a CC-BY-4.0 license on figshare.
A 30-day pilot study from 2026 by Dhruv Nimbalkar assesses a next-generation ingestible sensor system for measuring HIV pre-exposure prophylaxis (PrEP) adherence. The dataset includes results from 15 participants, with adherence measured via the digital pill system, pill counts, and user feedback via the System Usability Scale. It contains 49.9 KB of data in an Excel file, published under a CC-BY-4.0 license.
Twenty competitive female cyclists were interviewed in a study aiming to understand their experiences with the menstrual cycle's impact on training and competition. The dataset likely contains qualitative themes from semi-structured interviews, with findings published by author Louise Burnie. The dataset was last updated on June 2, 2026.
Keppel Bay is a shallow coastal embayment adjacent to the Fitzroy River in Queensland, Australia. This dataset, from the Australian Ocean Data Network, describes the complex seabed morphology and sediment distribution shaped by Late Quaternary sea-level changes, tidal processes, and flood events. It reveals the palaeo-path of the Fitzroy River across the continental shelf and infilling patterns over the last few thousand years.
The Australian Ocean Data Network hosts a geological dataset describing the Proterozoic Davenport province in central Australia. The description details rock formations, stratigraphy, geochemistry, and mineralisation, including recorded production of about 4500 tonnes of tungsten concentrates and 15 kg of gold. The dataset was last updated on 2026-06-05.
A retrospective cohort study of 48,288 adults without diabetes at baseline, derived from a publicly available dataset on the DRYAD platform. The data originates from 11 cities and 32 locations in China and was used to examine the non-linear relationship between the triglyceride glucose-body mass index (TyG-BMI) and incident diabetes risk. The dataset was published by Shanshan Xiao under a CC-BY-4.0 license on May 25, 2026.
Jan Kuska describes an automated process for manufacturing chimeric antigen receptor (CAR)-engineered gamma delta T cells for cancer immunotherapy. The protocol details a 374-fold expansion of Vγ9Vδ2 T cells, achieving an average yield of 6.64×10^9 cells after 14 days, with a mean transduction efficiency of 57.4%. The document was last updated on 2026-05-22.
Seven randomized controlled trials involving 1476 patients were analyzed by Zhen-guang Zhao. The meta-analysis compares traction-assisted versus conventional endoscopic submucosal dissection for superficial gastric neoplasms, evaluating procedure time, resection completeness, and adverse events. The results were published on figshare in May 2026.
Baseline data from the Allergic Rhinitis in pediatric subjects with Nasal Septum Deviation (ARHINASD) study, which investigates the relationship between nasal septum deviation and allergic rhinitis in children. The dataset includes results from a multicenter prospective study of 138 participants aged 6–14 years, conducted by Laura Carucci. The data was last updated on 2026-05-22.
Quarterly aggregated kelp cover data from the Floating Forests citizen science project. The dataset contains consensus-classified polygons of surface-canopy-forming kelp, primarily giant kelp (Macrocystis pyrifera), derived from Landsat satellite imagery. Each image was classified by up to fifteen citizen scientists, with polygons tagged by the minimum number of users who identified kelp.
A 5.5 KB dataset contains design parameters for a compact serrated boundary fractal planar quad-element MIMO antenna. The antenna, designed by Tathababu Addepalli and shared on figshare, resonates at four distinct mmWave frequency bands: 24.5 GHz, 33.5 GHz, 38.0 GHz, and 44.0 GHz. The dataset was last updated on 2026-06-03.