Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
43,693 datasets
Evaluation reports from Global Affairs Canada's periodic reviews of its priorities, programs, and projects. The reports serve as a management tool for reviewing program performance and improving the design of future initiatives. The dataset consists of individual HTML reports generated for each evaluation.
A dataset of SFT (supervised fine-tuning) trajectories created by internlm for the RNGBench evaluation framework, as described in a paper from 2026. It is designed to test multimodal language models' ability to reconstruct hidden state from memory and act on it in closed-loop environments. The dataset was last updated on 2026-06-23.
Nine facsimile map sheets covering the area around Berlin, reduced from original 1:25,000 surveys. The sheets were produced under the direction of Carl von Decker between 1816 and 1821 and are held by the Staatsbibliothek zu Berlin. The set covers an area from Kremmen and Oranienburg in the north to Mittenwalde and Storkow in the south, and from Nauen and Ketzin in the west to Strausberg in the east.
Two decades of yeast diversity surveys in China yielded 195 isolates, leading to the proposal of 77 new species and 2 new genera. The dataset likely contains phylogenetic analyses based on LSU D1/D2 and ITS rDNA sequences, with species delimitation using Average Nucleotide Identity (ANI). It was authored by Huihui Zhu and published on figshare under a CC-BY-4.0 license.
Data from February 23 to March 31, 2015, includes Level 1B and Level 2 products from the MODIS/ASTER Airborne Simulator (MASTER) instrument. The dataset contains calibrated radiance imagery across 50 spectral bands and derived products like land surface temperature and emissivity, collected during 10 NASA ER-2 flights over Greenland, the U.S., and Canada. It was produced by ORNL_CLOUD for the Suomi-NPP instrument validation campaign.
ACE-SQL Training Data is a collection of curated data for supervised fine-tuning, reinforcement learning, and empirical-pool tasks. It was created by xiaobing11 and released on the Hugging Face platform. The data is intended for training a shared language-model policy for text-to-SQL tasks.
A quasi-experiment by Bo Yuan measuring the engagement of Chinese learners of English as a Foreign Language (EFL) across cognitive, emotional, and behavioral dimensions. The dataset likely contains pretest, posttest, and delayed posttest scores after exposure to single or multiple gamification elements. It is a small dataset of 7.3 KB, last updated on 2026-05-26.
Active short-term rental licenses across Austin, Texas, with general neighborhood and zip code location data. The dataset provides street name and zip code but excludes specific addresses for resident safety. It is maintained by Austin Development Services and was last updated in April 2026.
Q 5 is a PDF map of quadrant 5, provided by the North Sea Transition Authority via the uk_data platform. The dataset was last updated on 2026-06-18. It is accessible via the ArcGIS GeoServices REST API and in HTML format.
Pyrolysis and bulk kinetic studies investigate hydrocarbon generation potential and source rock facies variability of marine organic-rich rocks from the Middle Ordovician Goldwyer Formation in the Canning Basin, Western Australia. The dataset likely contains results from Rock Eval pyrolysis and pyrolysis gas chromatography (Py-GC) for immature to mid-mature calcareous mudstones. The data was published in the International Journal of Coal Geology in 2020.
Research data (659.8 KB) from figshare detailing the rational design and testing of a novel dual-target inhibitor for pulmonary arterial hypertension. Author Mengqi Li published the data on 2026-05-11, describing the compound 15n, which targets Hsp110 and HDAC6 to synergistically block vascular remodeling. The dataset likely contains structure-activity relationship data and results from in vitro and in vivo experiments.
A 2026 review document by Sonali Thangavel, published on figshare under a CC-BY-4.0 license, examining recent advances in conductive aerogels for electromagnetic interference shielding. The 868.7 KB document discusses mechanisms, materials like carbon nanostructures and MXenes, and scalable fabrication methods. It provides a systematic perspective on structural design, compositional engineering, and the potential for flexible and environmentally friendly applications.
A series of evaluation reports for Global Affairs Canada's engagements in complex environments in Sub-Saharan Africa from 2018 to 2023. The reports serve as a practical management tool for reviewing program and project performance. This information is used to improve the design and implementation of upcoming programs and initiatives.
Global Affairs Canada periodically conducts evaluations of its priorities, programs, and projects. This dataset contains the evaluation report for the Weapons Threat Reduction Program (WTRP) covering the period from 2018-19 to 2023-24. The report serves as a management tool for reviewing program performance and improving future design and implementation.
A processed geophysical grid of total magnetic intensity (TMI) with reduction to pole and first vertical derivative applied, covering the Narryer survey area in Western Australia. The grid has a cell size of approximately 20 metres and is given in units of nT per metre, derived from 415,090 line-kilometres of data acquired in 2024 by the WA Government. Geoscience Australia processed and quality-checked the data, which reveals subsurface geological structures.
Geoscience Australia presents geological mapping results from the Vestfold Hills, East Antarctica, to support evidence-based management of Antarctic Specially Protected Areas (ASPAs). The work, presented at the 2024 Australian Antarctic Research Conference, uses regional geological mapping and field observations to recommend access points and reduce damage to fragile fossil fauna. Results are informing the revision of the ASPA No. 143 Marine Plain management plan for consideration at Antarctic Treaty meetings.
Magnetic survey data acquired in 2024 from the Narryer region in Western Australia, consisting of 415,090 line-kilometres of airborne measurements. The grid, processed by Geoscience Australia, shows the first vertical derivative of the total magnetic intensity after reduction to the pole, with a cell size of approximately 20 metres. This data can be interpreted to reveal the geological structure of the subsurface.
Twelve indicators from the Utrecht Population Survey measure resident perceptions of housing and neighborhood quality. The dataset is part of the 'Utrecht in Cijfers' database published by the municipality's Research & Advice department. It includes metrics like the percentage of residents who find their home too small or their neighborhood unpleasant.
Eight indicators on environmental perceptions in Utrecht, sourced from the municipality's Population Survey. The dataset includes percentages of residents reporting issues with noise from traffic and businesses, odour, and air pollution within neighbourhoods. It is published by the Research & Advice department of the Municipality of Utrecht.
Eleven indicators on public space and greenery in Utrecht, Netherlands, published by the municipality's Research & Advice department. The dataset includes metrics such as trees per 1,000 inhabitants, area of usable green space, and citizen satisfaction with nearby parks and play areas. Data sources include municipal surveys and city works records.