News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,992 datasets
PRWire News Pickup Predictor Benchmarks likely contains data for predicting the dissemination of press releases. The dataset is hosted on Kaggle, but its specific size, authorship, and update history are unknown. Columns and data types must be inferred from the title and verified after download.
A dataset of top-rated movies, sourced from Kaggle. The specific time range, number of entries, and original author are unknown. The dataset likely contains information related to movie titles and user or critic ratings.
Kaggle hosts a dataset titled 'movies_recommendation'. The dataset likely contains information for building or evaluating movie recommendation systems. Details on its size, columns, and creation date are not provided in the available metadata.
The San Francisco Board of Supervisors has recognized several cultural districts distinguished by unique social and historical associations and living traditions. The dataset, provided by the City of San Francisco, describes these districts, which are defined by activities such as commerce, services, arts, events, and social practices within their physical boundaries. It was last updated on March 22, 2026.
Historic Context Statements provide background history of particular neighborhoods, themes, or cultures in San Francisco and tie this history to sites within the built environment. The statements are reviewed and adopted by the city's Historic Preservation Commission. The data was last updated on March 22, 2026, and is provided by the City of San Francisco.
AgentReviewChat contains 278,790 inline code review conversations from 54,330 pull requests across 300 popular open-source GitHub repositories. It captures interactions between human developers and 16 identified AI agents, enabling analysis of review feedback quality and interaction dynamics. The dataset was created by Suzhen and is associated with a research paper.
A geospatial dataset showing cable television and broadband service availability for locations in Washington DC. The data was provided by cable providers and is aggregated by the District of Columbia's Office of Cable Television, Film, Music & Entertainment. The dataset was last updated on March 11, 2026.
Simona Bisiani published this replication package in 2026 to support an empirical assessment of local news performance and resource sharing at Reach plc. The repository provides the specific code and model weights used to analyze regional newspaper hubs in the associated journal article.
Complete transcripts from every episode of the Changelog News podcast, generated from a linked GitHub repository. The dataset was uploaded by willtheorangeguy and last updated on April 17, 2026.
Forty-three coded studies and 12,104 screening records document a scoping review on Usage-Based Theory (UBT) and Gestalt Language Processing (GLP). Created by Alice Silvestre Campião and hosted on Harvard Dataverse, the collection tracks language development research in children with typical development or autism as of 2026.
A dataset titled 'Top_rated_movies-dataset' is hosted on the Kaggle platform. The dataset likely contains information about movies that have received high ratings. Metadata such as columns, size, license, and author are currently unknown.
Netflix Movies and TV Shows catalog data, published on Kaggle. The dataset likely contains listings of titles available on the streaming platform. Specific details on the number of records, columns, and update frequency are not provided in the available metadata.
The Mars Express MARSIS Active Ionospheric Sounder (AIS) full resolution data set includes all spectral information calibrated in units of spectral density for the Mars Express third extended mission, EXT3. The data set consists of a transmit frequency followed by a time series of spectral density measurements of the received power. Browse products contain a spectrogram overview plot and individual ionograms for each sounding activity.
Replication data from the American Review of Public Administration article 'Personality Traits and Agency Politicization: Neuroticism and Perceived Political Control in United States Federal Agencies'. The dataset was created by author Gary Hollibaugh and was last updated on the Harvard Dataverse platform in April 2026.
Review checkpoints likely contain evaluation metrics or states for machine learning models. The dataset is published on Kaggle, a platform for data science competitions and projects. The specific content, size, and creation details require verification after download.
Laboratory data from Kodiak, Alaska tracks survival, growth, and morphology of juvenile snow crabs exposed to three pH levels. The dataset is associated with a NOAA study under review, focusing on the effects of ocean acidification on a federally managed species. Observations were recorded from April 2021 to June 2022.
The Mars Express MARSIS Active Ionospheric Sounder (AIS) full resolution data set includes all spectral information calibrated in units of spectral density for the entire Mars Express nominal mission. The data set consists of a transmit frequency followed by a time series of spectral density measurements of the received power. It was produced by the National Aeronautics and Space Administration and last updated in March 2026.
Mars Express MARSIS Active Ionospheric Sounder (AIS) data includes full-resolution spectral information calibrated in spectral density units. The dataset covers the seventh extended mission of the Mars Express spacecraft, providing spectrogram overview plots and individual ionograms for each sounding activity. It was produced by the National Aeronautics and Space Administration and last updated in March 2026.
A dataset of 1,503 records collected at a hospital via a Google Forms questionnaire. The questionnaire captured 15 attributes, of which 10 are included in this release: 9 features and 1 target. The target attribute, 'Feeling Anxious', was chosen as the indicator for postpartum depression.
World Ocean data from Air-Launched Autonomous Micro Observer (ALAMO) profiling floats, which measure temperature, salinity, and pressure. The floats are intended for deployment in challenging environments such as tropical cyclones and around sea ice, and the dataset is provided by NOAA NCEI. It contains measurements from July 15, 2014, to November 11, 2018.