Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
41,488 datasets
2010 and 2011 aggregated data from over 100 visitor centers in British Columbia. The dataset contains year-over-year comparison percentages for metrics like average parties per hour, total hours of operation, and tourism-related inquiries. It was published by the Government of British Columbia.
FBF-VAST introduces a valve-assisted modulation technique for two-dimensional liquid chromatography (LC × LC) using a 4-port valve. The dataset includes simulation results from advection–dispersion equations and experimental LC × LC data from red wine sample separation, yielding 27–55 separated peaks. Authored by Pattraporn Chobpradit and last updated in May 2026, the data is shared under a CC-BY-NC-4.0 license.
33,567 articles from three major CDC journals provide a structured corpus for public health text analysis. Metadata includes titles, authors, keywords, and publication details for the Morbidity and Mortality Weekly Report, Emerging Infectious Diseases, and Preventing Chronic Disease. The collection was compiled from source content retrieved on 2024-01-09 and published on the data.cdc.gov platform.
61.9 MB of images supporting research into vulvovaginal candidiasis (VVC), which affects over 75% of women. The dataset, authored by Bianca M. Coleman and last updated in May 2026, reveals a role for IL-1/Type 17 immunity in VVC that operates independently of estrogenic hormones.
1.0 MB of TIF image files from a preclinical study exploring the synergistic therapeutic effect of Cryptotanshinone and Matrine on ovarian cancer cells. The dataset, authored by Haiying Xu and last updated in May 2026, includes results from CCK-8, flow cytometry, Transwell, and qRT-PCR assays, as well as in vivo nude mouse xenograft model data. The research investigates the role of the PI3K/Akt/mTOR pathway in reversing cisplatin resistance.
A PDF document authored by Ryo Hakoda, published on figshare under a CC-BY-4.0 license on May 13, 2026. The file describes a symmetry-assisted deep reinforcement learning framework designed for stable and robust control of morphologically symmetric robots, such as humanoids and quadrupeds. It proposes modeling the environment as a symmetric Markov decision process and constructing a full-body policy from a single-sided base policy.
32,950 model queries from an evaluation of nine large language models on structured electronic health record tasks. The dataset, authored by Eyal Klang and last updated in May 2026, contains results from a study sampling 50,000 emergency department visits to test prompting strategies like direct, chain-of-thought, and tool-based code generation.
Sujesh Sudarsan published this dataset on figshare in April 2026. It contains experimental results for zinc oxide nanoparticles synthesized using almond peel extract. The data likely includes measurements of material properties and photocatalytic degradation efficiency for Congo red dye under sunlight.
Panax quinquefolius L. is a perennial medicinal herb vulnerable to cold stress. This study provides the first genome-wide characterization of 123 GRAS transcription factor members in P. quinquefolius, identifying PqGRAS086 as a candidate for cold tolerance studies. The dataset was authored by Junmei Lian and last updated on 2026-05-20.
123 GRAS transcription factor members identified in the medicinal herb Panax quinquefolius, classified into 13 subfamilies. The dataset, created by Junmei Lian and last updated in May 2026, provides the first genome-wide characterization of these genes. Transcriptomic profiling suggests the PAT1 and DELLA subfamilies may coordinate stress-adaptive responses, with PqGRAS086 highlighted as a candidate for cold tolerance studies.
A study from a germplasm bank in Zacatecas, Mexico, examines intraspecific variation in seven guava genotypes, including wild, landrace, and cultivated types. It measures tree, leaf, and fruit morphology, soluble solids, water content, and phenolic compounds, alongside folivory and frugivory. The dataset was authored by Johnattan Hernández-Cumplido and last updated on 2026-05-20.
One of the first systematic studies of adversarial robustness in LLM-based multi-agent systems applied to engineering contexts. The research investigates error propagation under controlled adversarial influence across problems like pipe pressure loss, beam deflection, and graph traversal. Authored by Lorenz Wiesmeier and shared under a CC-BY-4.0 license on figshare in May 2026.
Survey data from 424 Chinese employees examines the relationship between generative AI use and psychological distress. The study tests a moderated mediation model with job insecurity and workplace loneliness as mediators, and information literacy and AI ethical risk perception as moderators. The dataset was authored by Meng Liu and last updated on 2026-05-20.
A 2026 survey of 182 naturally occurring groups engaging with media stimuli in an ecologically valid setting, authored by Johanna Schindler. The data supports the Model of Collective Information Processing (MCIP), which conceptualizes group-level media processing along dimensions of systematicness and openness. Results are presented in a 217.1 KB PDF file.
Geoscience Australia Data provides a marine geology dataset describing the continental shelf off southeast Australia between Sugarloaf Point and Gabo Island. The dataset likely contains morphological, sediment, and seismic profile data for the shelf, which varies in width from 72 km to 17 km and features distinct inner, middle, and outer zones. It was last updated on 2026-05-14.
Development Log Current Edition 2026 Q1 provides a quarterly updated record of large-scale development projects in Cambridge, Massachusetts. The table includes 29 columns detailing project status, location, use, and metrics like total gross floor area and residential units. Data is published by data.cambridgema.gov and was last updated on April 21, 2026.
38 indicators from the Youth Health Monitor track behaviors and perceptions among young people in Utrecht. The dataset includes metrics on substance use, mental health, bullying, physical activity, and social well-being. It is published by the municipality's Research & Advice department under the theme 'Youth (Health)'.
Between May 29 and June 19, 2017, benthic sediment sampling was conducted in inner Darwin Harbour and shallow waters around Bynoe Harbour. The dataset comprises total chlorin and chlorin index measurements on seabed sediments, collected as part of a four-year (2014-2018) science program led by the Northern Territory Government and supported by Geoscience Australia, AIMS, and the INPEX-led Ichthys LNG Project.
A collection mandated by decree 2024-01 documents AI use cases within Quebec's public bodies, excluding cybersecurity projects. The data includes the initiative name, category, responsible body, ministerial portfolio, benefits, and status. Information was gathered from public organizations following a February 2024 decree by the Minister of Cybersecurity and Digital Technology.
Geoscience Australia compiled geochemical data from mafic to intermediate rocks in the Loch Lilly-Kars Belt within the Delamerian Orogen. The data was sourced from MinEx CRC National Drilling Initiative campaigns and legacy drillholes in New South Wales and South Australia. It distinguishes two Cambrian magmatic events and characterizes Siluro-Devonian rocks.