Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,560 datasets
A curated collection of traditional Amazigh proverbs, last updated on 2026-06-06. The dataset includes dialect mapping, linguistic explanations, and translations into Arabic and English. It was created by abdelhaqueidali on the Hugging Face platform.
OPGD single-factor detection results from a spatial analysis of 342 nationally recognized traditional villages in China's Guangxi Zhuang Autonomous Region. The dataset was created by Jianmei Tan and last updated on April 21, 2026. It is a 5.5 KB XLS file, suggesting a small, focused set of analytical outputs.
A 5.5 KB dataset analyzing the spatial distribution patterns and driving factors of 342 nationally recognized traditional villages in China's Guangxi Zhuang Autonomous Region. The dataset, authored by Jianmei Tan and last updated in April 2026, employs methods including nearest neighbor index, kernel density estimation, and geographical detector analysis. It outlines a conservation framework for traditional settlements in ethnic minority regions.
Main datasets and sources from figshare by Jianmei Tan, last updated April 2026. The 9.5 KB XLS file contains data on 342 nationally recognized traditional villages in China's Guangxi Zhuang Autonomous Region. The research analyzes spatial distribution patterns and driving factors using geographical detectors and spatial statistics.
Experimental data for CN-3, a potent RET inhibitor active against clinically relevant resistance mutations. The dataset includes IC50 values for RET mutants and cellular models, such as TT cells (IC50 = 2.48 ± 0.78 nM) and LC-2/ad cells (IC50 = 17.05 ± 4.90 nM). It was authored by Zi-Xuan Wang and last updated on 2026-05-02.
A figshare dataset from 2026 by Zi-Xuan Wang details the experimental results for CN-3, a novel RET kinase inhibitor. The data includes IC50 values for CN-3 against multiple RET mutants and cell lines, such as TT and LC-2/ad cells, and results from kinase profiling and xenograft studies. The dataset is provided as a PDB file and is licensed under CC-BY-NC-4.0.
Zi-Xuan Wang reports CN-3, a potent RET inhibitor active against clinically relevant mutants. The dataset likely contains potency measurements (IC50 values) for CN-3 against RET mutants and cellular models, as well as kinase selectivity profiling data. The data was last updated on 2026-05-02.
A 387.8 KB figshare dataset describing CN-3, a potent RET inhibitor. The dataset, authored by Zi-Xuan Wang and last updated in May 2026, reports IC50 values below 5 nM against multiple clinically relevant RET mutants, including G810R/S/C and V804M. It includes results from cellular proliferation assays (e.g., TT cells, IC50 = 2.48 ± 0.78 nM) and in vivo xenograft studies.
Portuguese literary corpus of 84 public-domain works of Portuguese and Brazilian classic literature, prepared for character-level language modeling. The dataset, created by vreabernardo, spans authors from Eça de Queirós and Machado de Assis to Florbela Espanca and Mário de Sá-Carneiro. It was last updated on 2026-06-19.
Pireh Pirzada's dataset contains transcribed and thematically analyzed interview data from a 2026 study on smart home technology adoption among older adults. The data likely includes qualitative responses from 21 participants aged 65-90 years from the UK, Malta, and Pakistan. Findings focus on incentives, barriers, and cross-cultural differences in adoption during the COVID-19 pandemic.
Supplementary Online Material details the technical specifications of the COMSOL Multiphysics 6.3 simulation environment. It encompasses physics interface configurations, numerical stabilization techniques, meshing strategies, and solver settings. The document was authored by Hao Wang and uploaded to figshare.
Spatial predictions of mud, sand, and gravel fractions, modeled as continuous variables, covering the north-west European continental shelf. The dataset includes raw raster predictions of these fractions, derived classifications using multiple sediment schemes, and accompanying error or accuracy maps for each prediction. It was published by the Government Digital Service on the eu_open_data platform.
Supplementary material includes computed structural parameters, NMR spectra, and vacuum ultraviolet photoionization mass spectra for the molecule cis-3-penten-1-yne. The 4.9 MB collection, authored by Sung Man Park, contains a DOCX file with tables, schematics, and figures. It was last updated on May 8, 2026, and is shared under a CC-BY-4.0 license.
RTK GPS soil surface elevation measurements for impounded and natural regeneration mangroves in Sanibel, Florida. The data was collected from 23-26 September 2025 by Charles F. McKenzie to inform a restoration plan involving tidal reconnection and dredged sediments. The dataset is 1.7 MB in size and includes CSV, XML, and HTML files.
MudawanSn is a publicly available parallel corpus containing 1,271 sentence-aligned pairs for the Wolof–Arabic language pair. The corpus consists of manual translations from Wolof into Modern Standard Arabic, with source texts drawn from the MasakhaNER corpus. It was created by author mbaye930 and last updated on the platform in June 2026.
A synthetic Brazilian public-health schema and test set created by Boakpe for evaluating cross-database generalization in Text-to-SQL agents. The fine-tuned model was not trained on trajectories from this schema, making it a dedicated benchmark. The dataset was last updated on June 15, 2026.
GESTIÓN DEL RIESGO data from www.datos.gov.co describes the social process of planning, executing, monitoring, and evaluating permanent policies and actions for disaster risk knowledge, prevention, reduction, and recovery. The dataset includes columns for the risk management process, event date, department, municipality, and associated codes. It was last updated on 2026-05-19 15:23:57.
Indonesian political news articles collected through automated web crawling from various online news sources. The dataset was created by billalxcode and last updated on June 6, 2026. It is intended for research and development purposes.
Cauca Department's 2019 public contract accountability data from the General Comptroller's Office. The dataset tracks the number and monetary value of contracts that were and were not reported on. It includes columns for total registered contracts, reported contracts, unreported contracts, and their corresponding values.