Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
44,493 datasets
SoE2015 data from the Queensland Department of Environment, Tourism, Science and Innovation reports on interstate commercial and industrial waste. The dataset indicates that approximately 22,000 tonnes of such waste were transported to Queensland landfills for disposal in the 2014-2015 financial year. It was published by [email protected] and last updated in May 2026.
SoE2017 data from Queensland's Department of Environment, Tourism, Science and Innovation reports per capita household waste generation. It indicates an average of 556 kg per person in 2016-2017, with regional figures ranging from 390 kg in Cairns to 680 kg in Remote Queensland. The dataset was last updated on 2026-05-27.
A 30.2 KB DOCX file provides supplementary material for a review summarizing clinical trial evidence for mineralocorticoid receptor antagonists in dialysis patients. The review covers biological rationale, pharmacokinetic studies, and results from trials like ALCHEMIST (n=644) and ACHIEVE (n=2,538). The material was authored by figshare admin karger and last updated on 2026-04-15.
Multibeam survey data acquired by the NSW government's Research Vessel Bombora between 20 April 2021 and 30 April 2021. The dataset contains 32-bit floating-point GeoTIFF files of bathymetry and backscatter at 5 m resolution, processed with Hypack, R2Sonic GUI, POSView, POSPac, Qimera, and FMGT software. This work was funded by the SeabedNSW program and HabMap Program to provide a baseline dataset and map seabed types.
Evaluation reports generated by Global Affairs Canada to review the performance of its priorities, programs, and projects. The information gathered is used to improve the design and implementation of upcoming programs and initiatives. Reports are published on the open_canada platform under the OGL-CA-2.0 license.
Bucaramanga's public environmental policy matrix consolidates budget execution and demographic beneficiary data for climate adaptation and mitigation actions. The dataset includes columns for financial resources, program objectives, and beneficiary counts across demographic groups. It was published on the datos.gov.co platform and last updated on 2026-05-18.
A curated collection of 4.79 million Wikipedia articles spanning the 2008 and 2010 snapshot releases, cleaned and compressed for efficient large-scale language model pretraining. This dataset preserves the raw encyclopedic knowledge of two distinct eras of Wikipedia, making it valuable for temporal analysis, knowledge evolution research, and foundation model training. It was created by author 'adhyanshaa' and last updated on 2026-06-04.
Sensor data collected for the Port Curtis Integrated Monitoring Program in Zone 05 - Inner Harbour. The Australian Ocean Data Network manages this time-series dataset, which covers roughly 20 years, from 01 July 2006 to 26 March 2026. The data was last updated on 05 June 2026.
Evaluation reports from Global Affairs Canada reviewing the performance of five maternal, newborn, and child health projects in Haiti. The reports serve as a practical management tool to improve the design and implementation of future programs and initiatives. Each evaluation results in a generated report, with the dataset last updated on the platform in May 2026.
Evaluation reports from Global Affairs Canada's Partnerships for Development Innovation Branch between 2015-16 and 2019-20. These documents serve as a practical management tool for reviewing the performance of programs and activities. The information gathered through each evaluation is intended to improve the design and implementation of upcoming programs and initiatives.
2700 km of industry-standard seismic data was acquired in late 2004 as part of Geoscience Australia's Southwest Frontiers Survey. The survey aimed to define basement composition and crustal thickness to constrain tectonic evolution and hydrocarbon maturation models. Results indicate sediment thickness exceeding 9 km and basement velocities of 5.2-5.6 km/s, suggesting a non-granitic composition.
synCUB is a synthetic, paired-image benchmark for evaluating concept-based interpretability. Each item is an (original, synthetic) image pair that differs in exactly one CUB attribute, such as changing a breast pattern from solid to spotted. Images are generated with FLUX.2 [dev] conditioned on CUB reference images, accompanying a paper on evaluating interpretability methods.
236.5 KB of molecular structure data from a study identifying a hit compound, 4p, with an IC50 of 0.27 μM for biofilm inhibition. The dataset, authored by Ying-Bo Zhou and last updated in May 2026, contains results from in vitro and in vivo experiments demonstrating synergy with antibiotics. It likely includes structural data for a series of synthesized catechol-conjugated benzothiazole derivatives.
Supplementary PDF files for the research article 'Can personal freedom drive economic complexity?'. The study tests the hypothesis that personal freedom boosts economic complexity using a panel of 139 countries over the 1998-2022 period. Author Vitor Castro published the files under a CC-BY 4.0 license on figshare.
A 5-month expert consensus study from April to August 2024 derived updated job competencies for Korean occupational therapists. The work involved a 3-round online Delphi survey with 20 experts and a focus group meeting with 10 experts to refine and validate the competencies. The dataset was created by Byoung Gin Jeon and hosted on Jeehp Dataverse.
Global Affairs Canada periodically conducts evaluations of its priorities, programs, and projects. A report is generated for each evaluation to review performance and inform the design of upcoming initiatives. The dataset was last updated on 2026-05-21.
Wide-angle seismic data from ocean bottom seismographs, gravity data, and deep marine reflection profiling data define crustal-scale features along the Vulcan transect in northern Australia. The dataset, sourced from the Australian Ocean Data Network, outlines the crustal and upper mantle architecture between the Precambrian Australian craton and the Timor Trough. It includes interpreted crustal thickness values, such as 35 km near the coast and 26 km under the outer shelf.
Yu Wang provides five supplementary Excel files supporting a manuscript under peer review. The files, totaling 105.1 KB, contain materials for indicator construction, evidence mapping, corpus and modeling diagnostics, fusion weighting, and sensitivity analyses. They are shared under a CC-BY-4.0 license to facilitate peer review and verification.
A 15-kilometer resolution geospatial dataset maps land cover across the Former Soviet Union. It contains sixty distinct land cover classes, 38 of which are forest types. The dataset was produced by the National Aeronautics and Space Administration and covers the period from 1984 to 1993.
Forest cover data provides a 1:2 million scale map for the Krasnoyarsk Region in Russia, distinguishing thirty-two land cover classes. The dataset was digitized from maps published in the Atlas of Forests of the USSR in 1973. It is hosted by the National Aeronautics and Space Administration and is available across multiple platforms.