News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,939 datasets
Graph data and video metadata from TikTok focusing on niche aesthetic communities. The dataset was collected by Dominique Veloso via automated scraping of hashtag pages in March and April 2026. It is hosted on the Harvard Dataverse platform.
Session traces document interactions of a coding agent with the 'pi-diff-review' GitHub repository. The data was collected by badlogicgames, exported with pi-share-hf, and last updated in April 2026. Each session is a JSON Lines file filtered through deterministic redaction and LLM review.
Strathbogie Shire Council provides polygon data of work zones used by its Operations Department for works within the Local Government Area. The dataset includes zone names and is available in multiple geospatial formats including GeoJSON, KML, and SHP. It was last updated in March 2026 by the Strathbogie Shire Council.
Synthetic datasets for extracting emotion and emotion-deflection probes from large language models, built from the methodology described in Anthropic's 'Emotion Concepts and their Function in a Large Language Model' (Sofroniew et al., April 2026). The dataset includes files such as 'expression/stories.parquet' with 205,200 rows of emotional stories across 171 emotions and 100 topics, generated by models like Gemini 3.1 Pro Preview. It was authored by ryancodrai and last updated on 2026-04-08.
PubTables-v2 is a new large-scale dataset designed for full-page and multi-page table extraction tasks. The dataset was created by rohanSingh969 and officially released on Hugging Face in February 2026. It is organized into three distinct collections, each containing tables in a specific context.
A multidimensional cultural-density index covers 20 Italian regional capitals. The dataset likely contains composite metrics for comparing urban quality of life across Italy. The data originates from Kaggle, but the author, organization, and specific collection details are unknown.
Supplementary material for a scoping review on simulated clinical placements. The dataset is an Excel file published on figshare by Danielle Pollock under a CC-BY-4.0 license, last updated in April 2026. Its 620.8 KB size suggests a limited scope of structured evidence.
Supplementary material from a cross-sectional study on postpartum depressive symptoms and associated factors among women with lactation mastitis. The dataset is an 11.7 KB XLSX file published on figshare by Jianrong Li under a CC-BY-4.0 license, last updated on 2026-04-11.
Supplementary Material 1 for a systematic review and meta-analysis protocol evaluating the Hope Index (HI). The 22.1 KB ODT file was authored by Jermaine M. Dambi and published on figshare under a CC-BY-4.0 license. It was last updated on 2026-04-11.
Ultrasonic velocity measurements for three distinct granites are recorded against varying differential pressure. The dataset, created by researcher Heming Wang, is a 13.1 KB file last updated in April 2026. It provides a focused look at how pressure influences wave propagation in these specific rock samples.
This dataset contains 72 full-depth hydrographic casts collected along the Line W section in the Northwest Atlantic from September 6 to 11, 2004. It includes measurements of conductivity, temperature, pressure, dissolved oxygen, and chlorofluorocarbons (CFCs 11, 12, 113) from CTD sensors and calibrated water samples. The data supports analysis of oceanographic properties and transient tracer distribution.
Twenty-one full-depth hydrographic casts were collected along the Line W transect between New England and Bermuda in October 2006. The casts include calibrated CTD profiles, LADCP current measurements, and water samples analyzed for salinity, dissolved oxygen, and chlorofluorocarbons (CFCs 11, 12, 113). This dataset provides a snapshot of water mass properties and transient tracer concentrations in a key region of the Northwest Atlantic.
MOODS is the standard U.S. Navy database for oceanographic observations, containing thermal, chemical, biological, and seafloor data collected by agencies worldwide. The specific dataset comprises five ASCII files of depth and temperature profiles submitted between August 1991 and March 1995. It is maintained by the Naval Oceanographic Office (NAVOCEANO).
Conductivity, Temperature, Depth (CTD) and barometric pressure data were collected in the Northeast Pacific Ocean during two research cruises aboard the ship WECOMA. The dataset includes conventional CTD data from 100 casts and towed SEASOAR CTD data from 165 segments, submitted by Dr. Adrianna Huyer of Oregon State University. Data collection spanned from June 7 to September 20, 1993 as part of the Eastern Boundary Currents Accelerated Research Initiative.
Northeast Atlantic data from 100 hydrographic stations south of the Azores, collected aboard R/V ENDEAVOR from May 1-19, 1987. The dataset contains CTD-derived pressure, depth, temperature, salinity, dissolved oxygen, and potential temperature, alongside measurements of dissolved chlorofluorocarbons Freon-11 and Freon-12. It was provided by Dr. T. Joyce of the Woods Hole Oceanographic Institution.
Seventeen full-depth hydrographic casts were collected along Line W in the Northwest Atlantic from April 9 to 13, 2007. The dataset includes calibrated measurements of conductivity, temperature, pressure, dissolved oxygen, and chlorofluorocarbons (CFCs 11, 12, 113) from shipboard CTD sensors and rosette water samples. It forms part of a series of sections monitoring ocean properties between New England and Bermuda.
Sixteen full-depth hydrographic casts were collected along Line W in the Northwest Atlantic between October 13 and 17, 2005. The dataset includes conductivity, temperature, pressure, dissolved oxygen, and chlorofluorocarbons (CFCs 11, 12, 113) from CTD sensors and calibrated water samples. Measurements support analysis of water column properties and transient tracer distribution.
Survey data from 136 Generation Z respondents in Indonesia with prior TikTok Live Shopping experience, collected to study impulsive buying behavior. The dataset supports a quantitative study analyzing the effects of host performance, emotional euphoria, and perceived quality value. It was created by Purnomo, Aleah Prameswari Kalyana Merkadea and last updated on 2026-04-16.
The Global Argo Data Repository archives monthly profile data collected by Argo profiling floats since September 1995. NOAA NCEI operates the repository, which contains real-time and delayed-mode profiles of ocean temperature, salinity, pressure, conductivity, dissolved oxygen, nitrate, pH, chlorophyll a, backscattering, and downwelling irradiance. Monthly archiving started in the second quarter of fiscal year 2003.
A 2021 meta-analysis synthesizes research on the vulnerability of Southern Ocean marine calcifiers to ocean acidification. The study projects a 0.3 pH decline by 2100 and correlates biological response variation with skeletal mineralogy. It combines a literature review with quantitative analysis of species traits and environmental factors.