News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,019 datasets
A dataset of movie ratings, likely sourced from a user community. The dataset is hosted on Kaggle, a platform for data science and machine learning projects. Specific details such as the number of records, time period, and contributing organization are not provided.
Kaggle hosts the Cultural Games Benchmark dataset. The dataset likely contains performance metrics or tasks related to games from various cultures. Its specific contents, such as column names and data volume, require verification after download.
Four million Indonesian-language reviews for the TikTok mobile application, collected from the Google Play Store between 2017 and 2026. The dataset is intended for natural language processing and sentiment analysis tasks. Its author, organization, and specific license are not provided in the available metadata.
TRNEWS-2025 is a dataset of Turkish news articles published on Kaggle. The dataset's specific size, source, and collection period are not detailed in the available metadata. Its content and structure require verification after download.
A text dataset published on Hugging Face by author dev1601, with a last recorded update in April 2026. The dataset's title and platform tags suggest it contains content generated by or for creative artificial intelligence processes. The specific volume, source, and detailed structure of the data are not provided in the available metadata.
Kaggle hosts a dataset titled 'enjoysports'. The dataset's specific content, size, and creation details are not provided in the available metadata. Its title suggests it relates to sports or recreational activities.
A synthetic dataset simulating user interactions on a Thai-language e-book platform, intended for building and benchmarking recommendation algorithms. The dataset's creator, size, and specific temporal coverage are not provided. It is hosted on the Kaggle platform.
1,563 anime titles with associated metadata, collected from Anime News Network and MyAnimeList. The dataset includes features such as title, type, and number of episodes. It is shared under a CC-BY-NC-SA-4.0 license.
A collection of 2,000 publicly available short-form videos from TikTok and YouTube, gathered between January and May 2025. The dataset focuses on content from the Philippines and is intended for analyzing video categories. The author and license information are not provided.
A dataset from Kaggle concerning the performance of Marvel movies. The title suggests it contains rankings or metrics for films released up to week 16 of 2024. The specific columns, data volume, and collection methodology are unknown.
SeyhaLite curated and cleaned this dataset to support high-quality Khmer language models and translation systems. The data covers entertainment, media, and the arts. The dataset page was last updated on February 13, 2026.
IMDB_movie_1972-2019 contains information for 5,834 movies scraped from the IMDB website. The data was preprocessed and cleaned for machine learning applications, such as building a recommendation model. It is shared under a CC0 1.0 license.
review-chekpoints--2026-05-10--13249-13249 is a dataset published on Kaggle. The title suggests it likely contains review text data, possibly with associated checkpoints or labels. Metadata is minimal; actual content requires verification after download.
IMDb filmography data aggregated on Kaggle. The dataset likely contains records for movies and television shows, including titles, cast, crew, and ratings. Its specific size, columns, and update date are not provided in the available metadata.
Wikipedia Movies Dataset 2016-2026 contains 10 years of movie metadata from Wikipedia. The data likely includes titles, descriptions, and release dates. The dataset's author, organization, and exact size are unknown.
Monthly Tweetreach reports track the online reach and activity of the Mayor of London's 'Ask Boris' Twitter sessions. Each report includes metrics like unique reach, total impressions, and tweet volume for sessions monitored via the hashtag #askboris. The data is produced by the Greater London Authority, with the last recorded metadata update in March 2026.
This text-based work examines Protestant far-right opposition to internationalism in the United States from the Great Depression through the Cold War. It analyzes theological perspectives from denominations including Dispensationalists, Calvinists, and Lutherans, and their political engagement concerning bodies like the League of Nations. The source is a scholarly monograph from the paperswithcode platform.
An article analyzing processes for positive culture change in closed environments, drawing on organizational theory and expert interviews. The work is sourced from the Association for the Prevention of Torture (APT) and existing research in the field. The temporal coverage and specific data volume are not provided.
Nigeria's Nollywood video film industry is the subject of this collection of essays and analyses. The content likely includes historical perspectives, market statistics, and discussions on censorship, distribution, and cultural impact across Africa. The source is paperswithcode, but the original author, organization, and specific data format are unknown.
Medical_chinese_news is a dataset of news articles in Chinese, published on Kaggle. The dataset's content likely pertains to medical and health topics, though specific details such as article count, source, and time period are not provided. Further verification after download is required to confirm its scope and structure.