News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,008 datasets
Kaggle hosts a visualization reference panel for the Orbit Wars evaluation. The panel likely contains 32 archetypes used to visualize or categorize a larger set of 128 seeds. The dataset's creator, organization, and update date are unknown.
Newsmokefiredata is a dataset hosted on Kaggle. The title suggests it likely contains information linking news media reports to wildfire or smoke events. Specific details on its size, origin, and collection methodology are not provided in the available metadata.
Replication data for a forthcoming article in the Review of Economics and Statistics. The dataset, authored by Wenzhuo Lu, was last updated in April 2026. It supports analysis of how globalization and innovation influence firms' sourcing decisions.
A dataset likely containing measurements related to the compressive strength of concrete. It was published on the Kaggle platform. The specific content, size, and origin require verification after download.
A collection of news articles likely labeled for authenticity. The dataset is hosted on Kaggle, but its specific size, origin, and creation date are unknown. Columns and detailed content require verification after download.
A collection of news articles labeled as real or fake, hosted on Kaggle. The dataset likely contains textual content for binary classification tasks. Metadata is minimal; specifics about size, source, and time period are unknown.
A dataset concerning fake reviews, likely used for training or evaluating detection models. It appears to be a specific methodological split, labeled 'Method B', from a larger project or course module. The dataset is hosted on Kaggle, but its exact size, origin, and creation date are unknown.
Sept_8_9_BBCNewsNepali_Facebook_Posts is a collection of social media posts from the BBC News Nepali Facebook page. The dataset likely contains text posts published on September 8 and 9 of an unspecified year. It is hosted on the Kaggle platform.
Facebook posts from Nepal collected on September 8 and 9, 2024, as indicated by the title. The dataset is hosted on Kaggle, but the author, organization, and specific content details are unknown. The data likely contains text from posts, potentially related to memes or social discourse.
MFIX simulation files from a study of granular rheology published in Physical Review Fluids (2024). The dataset contains input files and raw particle output files for pressure-controlled shear cell simulations. Simulations are categorized as MONODISPERSE, BIDISPERSE, and TRIDISPERSE, with parameters including confining pressure and plate velocity.
UKCCSRC Call 1 project report from the British Geological Survey reviews chemical tracers for carbon capture and storage monitoring. The report analyzes tracer applications, costs, and environmental impacts for a controlled CO2 release experiment offshore Scotland during 2012. It aims to identify suitable tracers for detecting and quantifying CO2 leakage from subsurface storage.
2,211 national survey responses and 55 in-depth interviews regarding celebrity-mediated nationalism in China. The dataset, created by Lingxiao Chen and updated in 2026, examines how audiences interpret and adapt to official messages embedded in entertainment culture.
Splits of a fake reviews dataset, likely containing text data for training and evaluating models. The dataset is hosted on Kaggle, but its specific origin and creation date are unknown. Columns and sample data are unavailable, limiting immediate assessment of its content and structure.
Kaggle hosts a dataset of content from the social media platform Reddit, specifically flagged as Not Safe For Work (NSFW). The dataset's size, specific collection method, and time range are not detailed in the available metadata. Its author and organization are also unknown.
Curated data from ratings and votes on movies. The dataset's author, organization, and specific scale are unknown. Its last update date is also unspecified.
A collection of Facebook posts, likely from a page or group identified as 'RONB'. The dataset is hosted on Kaggle, but the specific date range, number of posts, and author are unknown. The content appears to be text-based social media posts.
The dataset 'final_podcast_scene' is hosted on Kaggle. Its title suggests it contains data related to podcast episodes, likely focusing on scene-level information. The specific content, volume, and creation details require verification after download.
A list of peer-reviewed literature and the associated coding tree used in a scoping review investigating the intersection of sustainability, arts management, and cultural policy. The review was conducted by Malgorzata Cwikla to inform the development of a future research agenda. The dataset was last updated on March 23, 2026.
A monthly updated catalog of movies and TV shows available on the Disney+ streaming service, collected from Flixable. The dataset includes IMDb ratings, which can provide insights into content popularity and quality. It is shared under a CC0 1.0 license.
Atlantic Ocean data from 2004 to 2016 includes acoustic travel time, bottom pressure, and near-bottom current velocities collected by inverted echo sounders. The dataset is hosted by the National Oceanic and Atmospheric Administration and also appears on NASA EarthData. Columns suggest time-series measurements of physical oceanographic properties.