Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,733 datasets
A computational methodology for extracting structured information from 30 Mission 300 National Energy Compact documents. The pipeline automates the location and organization of textual passages relevant to five assessment dimensions, supporting empirical scoring for a related paper. The dataset, authored by Mariana Mazzucato, was last updated on April 20, 2026.
An inventory of public information generated, obtained, acquired, or controlled by obligated entities in Colombia that has been classified as confidential or reserved. The dataset includes metadata such as classification dates, legal justifications, and responsible parties. It is hosted by datos.gov.co and was last updated on 2026-05-18.
SHOW3D is a large-scale multi-view dataset of hand–object interactions captured in the wild. It was created by researchers from Facebook and published at CVPR 2026 to advance research on egocentric 3D hand–object interaction understanding and model generalization to real-world scenarios.
A metadata schema for publications under Colombia's Transparency and Access to Public Information Law 1712 of 2014, managed by the Personería de Manizales. The dataset includes fields for titles, media, responsible parties, and formats. It is hosted on the Colombian open data portal and was last updated in May 2026.
A dataset from figshare, last updated on 2026-05-04, containing examples of a new class of photoswitchable molecules called photovermellogens. The data was authored by Alejandro Vila and includes results from spectroscopic and computational studies of these compounds, which exhibit a change in acidity upon light irradiation.
NASA research data from a study exposing 14-week-old male Wistar rats to varied levels of partial weight bearing (20%, 40%, 70%, 100%) for 1, 2, or 4 weeks. The dataset contains results from in vivo pQCT (peripheral Quantitative Computed Tomography) assays on tibia tissue, measuring trabecular bone density and structural alterations. The dataset was last updated on 2026-03-13.
NASA researchers measured structural and functional skeletal alterations in 14-week-old male Wistar rats exposed to varied levels of partial weight bearing. The dataset includes results from in vivo pQCT, ex vivo microcomputed tomography, histologic analyses, and three-point bending tests on femur tissue. Data was last updated on March 13, 2026.
Replication data and code for the paper "Regime-Conditional Crude Hedging for Indian Refiners: Evidence from a 16-Year Backtest." The archive includes daily and monthly Brent crude oil price panels in Indian rupees, regime-classifier outputs, and backtest results for nine hedging policies from January 2010 to December 2025. It was authored by Adesh Mishra and published on Harvard Dataverse in June 2026.
A document retention schedule for physical and electronic records generated by the AAA La Bellezana Public Cooperative Administration. The dataset lists document series, retention periods, and final disposition actions. It is published by www.datos.gov.co and was last updated on May 18, 2026.
Eleven alfalfa populations were evaluated for 10 morphophysiological traits and characterized using genotyping-by-sequencing and DArTag markers. The study compares distinctness criteria for variety registration, finding that molecular methods achieved complete population separation where morphological traits did not. Author Paolo Annicchiarico published the findings on figshare in April 2026.
A 2026 case study by Paolo Annicchiarico compares morphological and molecular distinctness for 11 alfalfa populations. The data likely contains results for 10 morphophysiological traits and molecular markers from GBS and DArTag panels for these populations. The study supports molecular distinctness as a tool for variety registration, forensic analysis, and seed control.
A study by Hesham Aldamen examined the effect of AI-based text adaptation on reading comprehension for 48 university EFL students. The data, published on figshare in 2026, includes original and adapted versions of source texts used in a controlled experiment. The study employed a 2x2 mixed factorial design to test performance across different proficiency levels.
A 579.1 KB DOCX file contains supplementary data from a study on Modified Gegen Qinlian Decoction (MGQD) for ulcerative colitis. The research, authored by Meiling She and last updated in April 2026, involved experiments on pathogen-free and pseudo-germ-free mice infected with Veillonella parvula. Data likely includes results from disease activity indices, colon length measurements, histology, and analyses of intestinal mucosal barrier integrity.
Registry Trust Ltd maintains the official statutory Register of Judgments, Orders, and Fines for England & Wales and similar registers for Scotland, Northern Ireland, Republic of Ireland, Isle of Man, and Jersey. The data on monetary judgments, including County Court Judgments (CCJs), supports millions of lending and credit decisions annually. The dataset was last updated on 2026-04-29 and is published under the OGL-UK-3.0 license.
A 710.8 KB guide authored by Antonio F. Galvao, offering practical recommendations for applied researchers on estimation and inference for panel quantile models. The guide, last updated on 2026-04-27, is available in PDF, BST, BIB, and TEX formats and is licensed under CC-BY-4.0. It addresses implementation challenges and illustrates existing approaches for econometricians and statisticians.
Denmark-based researchers from Rigshospitalet and the Technical University of Denmark created the Everyday Conversational Danish Sentence Test (ECO-DAST). The 252.1 MB dataset includes materials for running the speech-in-noise test and experimental data used for its psychometric characterization, as published in The Journal of the Acoustical Society of America in 2026.
A research article published on figshare by Ramiro Vázquez, last updated on 2026-04-21. The study investigates the role of the histamine H4 receptor in tumor progression and immune response using a 4T1 cell line murine model of breast cancer. The file is a 473.7 KB PDF containing the full research paper.
A research document details the role of the histamine H4 receptor in breast cancer progression using a murine model. Ramiro Vázquez authored this study, which was last updated on 2026-04-21. The 16.0 KB DOCX file contains findings on tumor cell proliferation, immune response, and energy metabolism.
A 1.2 MB PDF data sheet describes a pipeline for analyzing neutrophil dynamics. The pipeline uses time-lapse sequences from a confocal microscope, processed with U-Net for segmentation and tracking, and an extended Viterbi algorithm for trajectory linkage. The data sheet was authored by Chen Li and last updated on 2026-04-21.
A list of small transient events in the solar wind, identified by the STEREO-A spacecraft's PLASTIC instrument using criteria including event duration (0.5 to 12 hours), magnetic field strength (>1.3 times yearly average), low proton beta (<0.7 times yearly average), and low Alfvén Mach number (<0.7 times yearly average). The dataset is provided by the National Aeronautics and Space Administration and was last updated in March 2026.