News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,373 datasets
Countering Disinformation Using Bottom-Up Strategies is a study on everyday citizen-initiated strategies for debunking social media disinformation. The dataset was authored by Comfort Umoren-Olorunnisomo and was last updated on June 8, 2026.
Water pressure zone boundaries for a municipal water system. The dataset includes a MODIFIED_DT column indicating when boundaries were last updated. Data is provided by data.calgary.ca and was last updated in April 2026.
Virienia Puspita's research analyzes a public communication campaign by PT PLN in Central Java, Indonesia, aimed at increasing community trust and acceptance of electricity transmission infrastructure. The study uses an embedded case study approach with quantitative evidence, combining the Integrated Marketing Communication (IMC) and Communication Accommodation Theory (CAT) frameworks. Findings show that a multi-channel (PESO) strategy sensitive to local context expanded audience reach to millions and improved public understanding of transmission safety.
OBLIQ-Bench is a suite of five retrieval benchmarks designed to expose a blind spot in modern search systems. It focuses on oblique queries where relevance-determining attributes are latent and have little surface expression in documents. The dataset was created by author 'dianetc' and last updated on Hugging Face in May 2026.
Seventy-four qualitative, mixed-methods, and review studies on survivors' experiences of meaning-making following sexual abuse were analyzed. The review synthesized findings from 2,158 deduplicated references across six academic databases. It generated twelve themes grouped into three overarching categories: a changed self, reshaping relationships with others, and mapping out a future self and world.
Geoscience Australia Data published a legacy document reviewing the 1960 activities of metalliferous field parties. The report is available in PDF and HTML formats, but its abstract and detailed contents are not provided in the metadata. This historical record likely contains summaries of geological fieldwork and mineral exploration efforts from that year.
Eleven randomized controlled trials involving 1,575 participants were analyzed in this systematic review and meta-analysis. The study, authored by Xin Wei and published on figshare in 2026, evaluates the effects of Cognitive Behavioral Therapy on anxiety, depression, and sleep quality in patients following myocardial infarction. It follows PRISMA 2020 guidelines and includes a systematic review registration identifier.
AI-generated images of USA passports designed for training machine learning models. The dataset contains 9,600 synthetic images with varied angles, lighting, and backgrounds, created by ud-synthetic. It was last updated on May 3, 2026.
Peb3 transcription data from the bacterial pathogen Campylobacter jejuni. The dataset compares wild-type and mutant strains under various growth conditions. It was contributed by David Hendrixson and harvested from the Texas Data Repository.
Supplementary Material 3 from a structured literature review on influenza-associated pulmonary aspergillosis. The 155.7 KB XLSX file, published on figshare by Sarah Sedik, likely contains supporting data for the review's analysis. Its content was last updated on 2026-05-05.
A Data Management and Sharing Plan outlines the scientific data to be generated and used in a research project. The plan describes a strategy for managing and sharing project data related to developing a vulvar film applicator for precision drug delivery in vestibulodynia. It was authored by Shawn Hingtgen and last updated on June 8, 2026.
Reports from the States site of Ukraine, including numerical definitions and satisfaction results for public information requests. The dataset contains identifiers, document numbers, registration dates, and summaries for requests reviewed by standing commissions. It was last updated on 2026-05-04.
Records from cases brought for review by the Cook County State's Attorney's Office. Each row represents a potential defendant, with data fields for demographics, incident details, and case processing. The dataset is no longer actively maintained as of December 30, 2024.
Data on the number of community radio stations and the revenues and expenses of commercial radio stations, sourced from the European Union's open data portal. The dataset is provided by the UK Government Digital Service under an Open Government Licence. The specific temporal and geographic coverage is not detailed in the available metadata.
Hourly meteorological data files from instruments co-located with Global Navigation Satellite System (GNSS) receivers, provided by NASA's Crustal Dynamics Data Information System (CDDIS). The dataset contains one day of meteorological observations (temperature, pressure, humidity, etc.) per site in RINEX format from a global permanent network. Data includes information from multiple GNSS constellations such as GPS, GLONASS, Galileo, and BeiDou.
Geomorphic Wetlands Manjimup to Northcliffe provides location, boundary, and geomorphic classification for wetlands in that specific Western Australian area. The dataset classifies wetlands by their host landform and hydroperiod, but excludes conservation significance evaluations. It is an unreviewed dataset from the Department of Biodiversity, Conservation and Attractions, last updated in March 2026.
Ministerie van Binnenlandse Zaken en Koninkrijksrelaties provides this dataset showing greenhouse horticultural areas and complexes in the province of Drenthe. The data is available under a CC-PDM-1.0 license. The specific temporal coverage and data collection method are not detailed in the provided metadata.
Romda is a cultural-historical exploration dataset for the Drentse Aa/Elperstroom border area, sourced from the EU Open Data platform. The dataset is provided by the Dutch Ministry of the Interior and Kingdom Relations and is licensed under CC-PDM-1.0. Available file formats include ZIP and PNG, suggesting the data likely contains geospatial imagery or maps.
Supplementary material from a computational drug repositioning study for Streptococcus pyogenes and influenza A coinfections. The dataset contains mapped read counts and lists of differentially expressed genes, authored by Kevin Strey and last updated in April 2026. It is provided as a 3.2 MB XLSX file under a CC-BY-4.0 license.
Pre-extracted Kyutai Mimi tokens for the Expresso speech dataset. The dataset contains tokens from all 32 codebooks for both read and conversational subsets, intended for training Mimi-based speech models. It was created by shangeth and published on Hugging Face, with a last update timestamp of 2026-05-03 12:54:40.