News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
10,980 datasets
A collection of movie data from The Movie Database (TMDB) platform. The dataset likely contains information on films that have received high user or critic ratings. Specific details on the number of records, included features, and time period covered are unavailable from the provided metadata.
A preprocessed dataset related to depression, likely containing clinical or behavioral measurements. It is hosted on Kaggle, but the author, organization, and specific data collection details are unknown. The exact number of records, features, and temporal coverage are not specified in the available metadata.
A text dataset from Kaggle concerning fake news in the Kyrgyz language. The dataset's author, organization, and specific temporal coverage are unknown. Its size, row count, and column structure are also unspecified.
This dataset lists Hollywood movies with associated review sentiment scores. It is published on Kaggle. The specific number of movies, the source of the reviews, and the date of collection are unknown.
A dataset titled 'SPORTS-priorities-Countries' is hosted on Kaggle. The dataset likely contains survey or ranking data comparing the importance of different sports across various nations. No further metadata is available.
NOAA's dataset contains acoustic round-trip travel time, pressure, ocean velocity, and temperature recorded by CPIES instruments at five mooring locations. Measurements were collected as part of the Agulhas Current Time-series experiment in the Indian Ocean. The time series spans from April 13, 2010, to February 24, 2013.
Audit reports from the independent Audit Bureau of Circulations provide the historical circulation and subscription prices of U.S. daily newspapers in 1924. The data includes circulation by town and delivery channel for each newspaper. The sample consists of all audited daily newspapers from that year, collected by Michael Sinkinson of Booth University College.
Kaggle hosts a dataset of movie metadata intended for data analysis and machine learning. The specific source, size, and temporal coverage are not detailed in the provided description. Its primary purpose is to support analytical and modeling tasks related to film content.
Over 8,500 movie records are included in this collection. The dataset contains titles, overviews, and release information for the films. It was sourced from Kaggle, but the author, organization, and last update date are unknown.
A dataset about movies, published on Kaggle. The specific contents, size, and creation details are not provided in the available metadata. The dataset's scope and features require verification after download.
Github Codereview is a large-scale dataset containing high-quality human-written code reviews sourced from top GitHub repositories. It captures the interaction between inline reviewer comments on pull requests and the subsequent code modifications made by authors. The dataset is designed to provide a natural signal for training models to understand code quality and the iterative review process.
TMDB Top Rated Movies Dataset 2026 is a collection of movie data from The Movie Database. The dataset likely contains information about films that received high user or critic ratings. It was published on Kaggle, but details about its size, specific columns, and creation date are unknown.
Kaggle hosts a dataset titled 'Quan_tag_review'. The title suggests it contains reviews that have been quantitatively tagged. The dataset's author, organization, size, and specific content are unknown.
Phuong_tag_review is a dataset hosted on Kaggle. The title suggests it contains data related to tagging or review processes. No further metadata, such as column descriptions, sample data, or size, is available for verification.
A dataset sourced from Kaggle, likely containing posts or comments from the Reddit platform related to stress. The specific volume, time range, and collection method are not detailed in the available metadata. Its content and structure must be verified after download.
A dataset of product reviews from Vietnamese e-commerce platforms, likely containing examples flagged as suspicious. The dataset is hosted on Kaggle, but its specific size, creation date, and authorship are unknown. Columns and data format are unspecified, requiring verification after download.
A historical analysis by Jeremy Kuzmarov examines U.S. police training programs as a tool of foreign policy and nation-building. The work covers interventions from the early 20th century, including the Philippines and Haiti, through the Cold War and the post-9/11 wars in Iraq and Afghanistan. It argues these programs were used to suppress radical movements and create social control, often resulting in blowback against U.S. interests.
An article by Hazel Rose Markus of Stanford Medicine reviews psychological and cultural research on the meaning of choice. The work contrasts Western, particularly American, perspectives with non-Western and working-class Western views. It examines the relationship between choice, freedom, autonomy, and well-being.
151 quality indicators for blogs and podcasts were identified and refined through a rigorous research process. The resulting Quality Checklists are designed to assist with quality appraisal of medical blogs and podcasts. The dataset was created by Isabelle N Colmers at the University of Alberta and is available under an Open Access (diamond) license.
Emily G. Hervey's study investigates correlations between childhood transition patterns and college adjustment success for Missionary Kids, a subgroup of Third Culture Kids. The research tests hypotheses about the impact of negative transition experiences, interaction with Western peers, and support systems. The dataset likely contains survey or assessment data from the study, which was published on the paperswithcode platform.