Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
41,487 datasets
Colombian municipal and district-level financial transfers from the General System for poor, uninsured populations from 2015 to 2021. The dataset includes columns for payer, payment orders, identification numbers, payment dates, concepts, funding sources, and transferred values. It is hosted by the Colombian government's open data portal, datos.gov.co, and was last updated in May 2026.
Supplementary materials for a pilot evaluation of 17 open-weight large language models screening RNA-seq metadata. The dataset includes performance metrics like AUPRC and F1 scores, runtime distributions, and reproducibility data across 150 projects per model. Mitsuo Shintani authored this CC-BY-4.0 licensed dataset, last updated in May 2026.
A benchmark dataset compiled for evaluating the DMAPLM multimodal pretrained framework for drug repositioning. The dataset was created by Hailin Chen and last updated on April 22, 2026. It is a small dataset, 5.5 KB in size, stored in an XLS file format.
290 archaeological sites and museums across Mexico are covered by this dataset of monthly visitor counts from January 1996 to the present. The data, sourced from Mexico's National Institute of Anthropology and History (INAH), is disaggregated by state, site type, visitor type, and nationality. Montserrat Mora published the dataset on figshare, which includes a CSV file, Python analysis scripts, and sample visualizations.
Data collection for the Earth Radiation Budget Experiment (ERBE) S-9 Scanner Radiant Flux dataset is complete. It contains inverted daily, monthly hourly, and monthly averages of shortwave (SWF) and longwave (LWF) radiant fluxes at the top-of-atmosphere (TOA), averaged into 2.5-degree geographic regions. The data originates from scanner instruments on three satellites: NASA's ERBS and NOAA's NOAA-9 and NOAA-10.
Experimental results for a unified AI framework integrating diffusion models and multimodal large language models for cultural tourism content. The dataset, shared by Yuhe Wang on figshare, includes performance metrics from a study proposing a closed-loop generation-prediction-feedback paradigm. It was last updated on April 22, 2026.
Yopal, Casanare, Colombia, has a database of identified risk scenarios for natural hazard threats. The dataset includes georeferenced data on vulnerable zones, likely supporting risk management and territorial planning. It is hosted by datos.gov.co and was last updated on 2026-05-28.
A 221.0 KB Excel dataset by Cristina de la Cruz Serna, last updated in June 2026, analyzes the pragmatic and sociolinguistic functions of the laughter expression 'ha ha ha'. The research is based on data from the Global Web-Based English Corpora (GloWbE), focusing on English usage in the USA, the UK, Nigeria, and India.
Monthly panel data from January 2015 to December 2024 tracks 11 variables influencing China's new energy vehicle market. The dataset includes supply-side, macroeconomic, and energy production indicators to analyze and forecast NEV adoption trends. Author Changling Li released this 22.0 KB XLSX file under a CC-BY-4.0 license.
A multimodal dataset collected to investigate the role of BMPR1A in Gli1+ periodontal ligament stem cells (PDLSCs) for maintaining periodontal bone homeostasis. The data includes results from conditional knockout mouse models analyzed via µ-CT, histology, and immunohistochemistry, alongside human PDLSC knockdown experiments assessed via TRAP staining, ELISA, western blot, and proteomics. It was authored by Xudong Xie and last updated on May 23, 2026.
Processed mouse brain spatial transcriptomics and Imaging Mass Cytometry spatial omics data used to demonstrate the stVirtual tool. The dataset includes an example route between slices T170 and T171 and a compact demo for the workflow. It was last updated on 2026-05-23 by author Yijin Zhou and is provided under a CC-BY-4.0 license for demonstration and reproducibility.
A 41.1 MB dataset containing multiple lists of predicted and cataloged seamounts. Zhenyu Wang published these files on figshare, last updated in May 2026. The dataset includes raw machine learning predictions, post-processed results, and updated versions of two existing seamount catalogs.
A 374.4 KB DOCX file authored by Anna Davidovich, last updated on 2026-05-20. It contains supplementary material for a study investigating the relationship between hedonic preference for sweet taste and economic decision-making, including risk-taking and delay discounting. The study involved two experiments with 49 and 100 participants, respectively.
Shuifeng Hong's research dataset investigates spillover effects and co-movements among carbon, energy, and industrial metals markets. The analysis employs a multidimensional framework using cointegration, a discrete wavelet-based Diebold-Yilmaz model, and wavelet coherence analysis. The dataset is a 690.5 KB DOCX file published on figshare under a CC-BY-4.0 license.
East Antarctica's Prydz Bay-Lambert Graben region contains a stratigraphic record of Cenozoic glacial cycles. The review, sourced from the Australian Ocean Data Network, documents at least 10 intervals of glacial advance and over 17 intervals of glacial retreat. It presents a partial reconstruction of glacial extent that can be compared to eustatic sea-level records from the southern Australian margin.
Regional-scale magnetic data from the Yukon territory, covering a mapped area of approximately 2000 km², was used to model serpentinized ultramafic rocks. The Government of Yukon published this work in April 2026 to identify potential sites for carbon mineralization. Results estimate local volumes up to 361 km³ and a total CO2 storage capacity exceeding 1600 gigatonnes.
Preliminary observations from four gold deposits in the Tombstone gold belt of Yukon's Selwyn basin. The dataset includes field and petrographic observations on host rocks, veins, and alteration patterns, comparing them to porphyry deposit models. It was published by the Government of Yukon and last updated on April 17, 2026.
Yukon Geological Survey's annual publications highlight geoscience research and mining activity from the previous year. The volumes include an overview of staff activities and industry summaries, plus technical papers on various projects. All publications are available in digital format from the Yukon government website.
Limpopo Province, South Africa, is the focus of this quantitative experimental study evaluating a single entrepreneurship training intervention for start-up owners and managers. The study involved 10 cohorts of approximately 30 participants each over a 6-month period, using structured questionnaires to measure impact. The dataset was authored by Ramatsobane Mpe and last updated on 2026-05-20.
Version 05-2026 provides a standardized archive of over 1,000,000 industrial MRO and scientific product records, curated by QTE Technologies. The dataset focuses on cross-border technical entity alignment and is part of the Open Industrial Knowledge Ecosystem project. It was last updated on 2026-05-03.