Text classification, translation, QA, summarization, dialogue, sentiment analysis, language modeling, text corpora
43,991 datasets
Over 15 columns detail use-of-force incidents reported by the Cincinnati Police Department, including incident type, officer and subject demographics, and neighborhood. The data is sourced directly from the Cincinnati Police Department but is currently frozen and will not be updated due to a system migration. A public dashboard on CincyInsights provides interactive visualization of this data.
Three separate 0.5m resolution bathymetric grids for surveyed sites in Cairns, Queensland, exported as 32-bit floating point GeoTIFFs. The Australian Hydrographic Office acquired this data on 27 May 2020 for calibrating multibeam echosounders. Data was processed using Caris HIPS & SIPS and is provided in MSL, LAT, and Ellipsoid vertical datums.
Li Wang's supplementary file from 2026 details the generation of the iVero.219-mcRTA cell line for Kaposi’s sarcoma-associated herpesvirus (KSHV) research. The document reports quantitative performance metrics, including a viral DNA copy number of (5.7 ± 0.1) × 10⁴ genome copies/mL and an infectious titer of (2.4 ± 0.1) × 10⁴ IU/mL upon doxycycline induction. This 254.0 KB document is licensed under CC-BY-4.0 and was last updated on 2026-04-13.
Supplementary information files accompany the article "Bilinear secants and birational geometry of blowups of P<sup>n</sup> x P<sup>n+1</sup>". The files, authored by Elisa Postinghel and licensed CC-BY-4.0, were last updated on 2026-05-13. They likely contain data supporting the study of bilinear secant varieties and the computation of effective and movable cones for specific blowups.
275.3 KB of data from a study combining behavioral assays and molecular approaches to understand the visual system of the Black Grouse (Lyrurus tetrix). Research by Simon Potier, Marjorie A. Liénard, and colleagues informs strategies for reducing bird collisions with aerial infrastructure. The dataset includes files in RMD, XLSX, FASTA, CSV, and RTF formats.
PHANTOM is a large-scale, open-source dataset of pre-generated multimodal adversarial attacks. It provides ready-to-use adversarial image–text pairs targeting a wide range of harmful intents for evaluating VLM robustness. The dataset was created by author 'it4lia' and was last updated on Hugging Face in June 2026.
A seamless topographic color map covering all of Australia and its external territories, including Norfolk, Lord Howe, Macquarie, Cocos (Keeling), Christmas, Heard, McDonald Islands, and the Australian Antarctic Territory. The service integrates data from Geoscience Australia, the Australian Antarctic Division, OpenStreetMap, the Australian Bureau of Statistics, and Natural Earth, with topographic information checked in 2008 and supplemented in 2009. It portrays cultural, hydrography, marine, transport, vegetation, and relief features, using SRTM data acquired by NASA in February 2000.
Special physicochemical water quality data collected by the RSMA (Network for Monitoring the Aquatic Environment) around Montreal. Samples are taken with a polyethylene bucket, stored on ice, and analyzed in a laboratory for parameters like pH, temperature, dissolved oxygen, and conductivity. The dataset is published by the Government and Municipalities of Québec under a CC-BY-4.0 license.
Colombian data tracking compliance with Circular 1552 of 2013, an anti-red-tape measure requiring health providers to maintain open appointment schedules. The dataset likely contains monthly metrics on wait times and appointment volumes for medical specialties. It is hosted by datos.gov.co and was last updated on 2026-05-18.
An inventory of public information generated, obtained, acquired, or controlled by the ESE Metrosalud that has been classified as confidential or reserved. The index was adopted via Resolution 3788 of 2019. It is available in CSV, JSON, XML, and RDF formats from the www.datos.gov.co platform.
Dampier Marine Park in Western Australia was surveyed for bathymetry data between July 2024 and January 2025. The survey was conducted by DMAL for the Australian Hydrographic Office under the Hydroscheme Industry Partnership Program. This dataset is not intended for navigational purposes.
A 21.8 MB audio file (WAV format) uploaded to figshare by Marie-Annick Moreau. The description narrates a scene where Bumbo finishes a trap chamber and secures it with a palm leaf tie, followed by fence maintenance with Abdalah Saidi Mwingo. The dataset was last updated on June 3, 2026.
A 28.7 KB PDF document authored by Marie-Annick Moreau, last updated on June 3, 2026. The text describes an ethnographic observation of an individual named Bumbo finishing a trap chamber and checking a fence, with Abdalah Saidi Mwingo performing similar actions.
Marie-Annick Moreau uploaded a 30.2 MB audio file (WAV format) to figshare on 2026-06-03. The recording features Turo explaining and demonstrating the use of a hand-made scoop net, called 'njechele' or 'tinindi', for collecting fish inside a trap chamber. The net is described as having an oval shape designed to fit the corners of the trap.
An ethnographic video by Marie-Annick Moreau, last updated on 2026-06-03. Turo demonstrates a handmade scoop net called 'njechele' or 'tinindi', showing its construction from vine and thread and its use for collecting fish inside a trap chamber. The dataset is a 32.0 KB EAF file, likely containing video annotation data.
Global Sat Metar is a dataset of dense global satellite imagery paired with sparse global METAR station observations. The data is rasterized onto a 3600×1800 grid and sliced into 128×128 spatial patches with a 7-frame hourly temporal context. Created by meteolibre-dev, it is designed as a pre-training corpus for weather foundation models.
A 45.3 KB PDF documents Mzee Kulenga explaining the technique for adding inner hoops to a basket trap. The description details the use of bright nylon thread for durability and the method of leaving a finger space between each spoke. Authored by Marie-Annick Moreau, the material was last updated on June 3, 2026.
High-resolution power consumption traces for generative AI workloads executed on the Kestrel HPC platform. The dataset includes profiles for inference and training jobs across LLM and image generation tasks, with varying datasets and node counts. Published by the Department of Energy in 2026, it aims to address the lack of open empirical energy data for AI systems.
Verified distillation traces generated with faststill v0.0.1. The dataset contains (prompt, reasoning, output) triplets from an OpenAI-compatible chat-completions endpoint, with each row verified by a machine check before inclusion. The dataset was created by empero-ai and last updated on 2026-06-16.
Experimental data on zinc and magnesium ion co-doping ratios in bioactive glass for bone repair. The dataset likely contains results from an in vitro study investigating five specific doping ratios (Zn/Mg: 0%/20%, 5%/15%, 10%/10%, 15%/5%, 20%/0%). It was authored by Shalitanati Wuermanbieke and published on figshare in April 2026.