News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,999 datasets
Cornell Review Data likely contains a collection of movie reviews from the Rotten Tomatoes platform. The dataset is hosted on Kaggle, but its specific size, creation date, and authorship are unknown. Columns and sample data are unavailable, limiting immediate assessment of its content.
A dataset of wine reviews from the OpenML platform. No information is available on the number of rows, columns, or specific content.
A dataset for building movie recommendation systems and predicting user ratings. It originates from the IMDB platform and is hosted on Kaggle. The specific number of records, features, and update history are not detailed in the available metadata.
BBC News articles published on Kaggle. The dataset likely contains text from news stories, but the exact number of articles, publication dates, and specific content are unknown. The original author and organization are not specified.
Kaggle hosts a dataset titled 'MOvie recommendation set 1'. The dataset likely contains data for building or testing movie recommendation systems. Its author, organization, and specific details like row count and column descriptions are unknown.
Financial news articles aggregated from unspecified sources. The dataset is hosted on Kaggle, but its size, time range, and specific sources are unknown. The original author and organization are also not provided.
Movies data published on the Kaggle platform. The dataset's specific content, size, and origin are not detailed in the provided metadata. Further details such as the author, license, and last update date are unknown.
Top Rated Us Movies is a dataset hosted on Kaggle. The dataset likely contains information about movies produced in the United States that have received high ratings. Specific details such as the number of records, included features, and the source of the ratings are not provided in the available metadata.
Episode data from a feminist podcast that examines movies through the Bechdel Test and a proprietary 'Nipple' rating scale. The dataset likely contains movie names, guest information, air dates, genres, Bechdel Test results, and host/guest ratings. It was created from the podcast 'The Bechdel Cast' and is shared under a CC0-1.0 license.
10,000 movie records fetched through the API of TMDb, a crowd-sourced database used by many film-related applications. The dataset likely contains popularity metrics and other film information, though some fields may have null values due to missing data in the source. It is published under a CC0-1.0 license.
CC-News is a dataset of news articles, likely collected from various online sources. The dataset is published on Kaggle, but specific details about its size, collection period, and creator are not provided in the available metadata. Its content and structure require verification after download.
Compressed PCS Dataset is a dataset published on Kaggle. The title suggests it may contain compressed data related to PCS, which could refer to Personal Communication Services or another domain-specific acronym. Metadata is minimal; actual content, scale, and authorship require verification after download.
Rikio Inouye produced this replication dataset for International Studies Quarterly to document US and Chinese COVID-19 vaccine distribution patterns between 2021 and 2022. It combines regression data with qualitative insights from elite interviews to categorize aid strategies into four distinct types: preserving, pressuring, protecting, and peeling. The data specifically examines how great power rivalry influenced health aid allocation to countries like Paraguay and Nicaragua.
Replication materials for the main analysis in the manuscript titled 'Haunted by a Bad Review: The Afterlife of Reading (Negative) Reviews in Consumer Experience'. The data was contributed by Dena Yadin to the Harvard Dataverse and was last updated on April 9, 2026.
Kaggle dataset titled 'review-chekpoints--2026-05-28--13267-13267'. The title suggests it may contain checkpoint or evaluation data related to reviews, likely for machine learning models. The dataset's content and structure require verification after download.
A dataset of public Instagram profile metadata and engagement analytics, inspired by the account of cricketer Virat Kohli. The dataset was sourced from Kaggle, but the author, organization, and specific collection details are unknown. The last update date and data volume are also unspecified.
A dataset related to Netflix recommendation systems, published on Kaggle. The specific content, size, and features require verification after download. Metadata such as column descriptions, row count, and license are currently unknown.
Top_rated_movies is a dataset published on Kaggle. The dataset's title suggests it contains information about films that have received high audience or critic scores. Metadata is minimal; actual content requires verification after download.
Primary source journals from American merchant ships engaged in trade with India between 1784 and 1860. The data, compiled by Robert R. Swartout, includes logs from vessels like the Ruby, Belisarius, Derby, Tartar, and Apthorp. These documents capture commercial and cultural encounters during the Age of Sail.
Originally published in 1981, this work analyzes the use of presidential power to influence public opinion in foreign affairs under Presidents McKinley, Theodore Roosevelt, Taft, and Wilson. The author, Gaines M. Foster, argues that the executive branch held significant freedom in foreign policy decision-making during this period. It is presented as an unaltered reprint from the University of North Carolina Press.