News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,012 datasets
Hand-curated data on mobile phones includes specifications, user reviews, and revenue information for the period 2019 to 2020. The dataset was sourced from Kaggle, but the original author and organization are unknown. The total number of rows and specific file formats are not provided.
Replication files for a forthcoming study in the Review of Economics and Statistics, authored by Nicolas Robert Ziebarth. The study uses three German public datasets (DRV, SOEP, SAVE) to analyze fundamental reforms to the Disability Insurance system. The raw data is not included due to Data Use Agreements, but the provided files support replication of the analysis.
Approximately 360,000 unique patient reviews scraped from WebMD, updated through March 2020. The data includes user ratings, comments on specific drugs, related conditions, side effects, and demographic information such as age and sex. It is intended for analyzing patient satisfaction and predicting drug ratings.
All wine reviews from winemag.com for the years 2017 through 2020. The data was updated via a scraper, with duplicates cleared and vintage information parsed from the title. NV in the data indicates non-vintage wines.
Imdb-data is a collection of data on various movies. The description indicates that it covers movie release year, description, and revenue information. The dataset is shared under a CC0-1.0 license on the OpenML platform.
A collection of tweets scraped from the Twitter account @dril in January 2020. The dataset includes the date, text, and engagement statistics such as likes and retweets for each post. It was created using the GetOldTweets scraper and is shared under a CC0 1.0 license.
A dataset from the OpenML platform with the identifier 'colleges_usnews'. No information is available on its contents, size, or structure.
Newset is a dataset published on the Kaggle platform. The dataset's specific content, size, and structure are not described in the provided metadata. Further details about the data's origin, collection method, and temporal coverage are unavailable.
A data-driven journey through anime culture and entertainment. The dataset is hosted on Kaggle, but its author, organization, and specific creation date are unknown. Its size, row count, and column-level details are not specified.
Kaggle hosts a dataset titled 'fake_news_bayes'. The dataset likely contains text data for analyzing misinformation, as suggested by its title. Its author, organization, and specific collection details are unknown.
Review Shoppe is a dataset of product reviews published on Kaggle. The dataset's specific content, size, and origin are not detailed in the available metadata. Its actual scope and quality require verification after download.
Fake News Detection is a dataset hosted on Kaggle for training models to identify misinformation. The dataset likely contains text articles or social media posts labeled for veracity. Metadata is minimal; actual content requires verification after download.
Yale Richmond describes the history of cultural exchanges between the United States and the Soviet Union from 1958 to 1986. The description covers areas such as the performing arts, popular media, academia, public diplomacy, and science and technology. The data relates to the U.S.-USSR Cultural Agreement signed at the Geneva summit in 1985.
Sir Percy Cradock's personal account details his experiences as a British diplomat in China from 1962 to 1992. The text chronicles major historical events, including the Cultural Revolution, Deng Xiaoping's reforms, and the negotiations over Hong Kong. It is presented as a memoir offering an insider's perspective on Sino-British dealings during a turbulent period.
The dataset likely contains textual and historical analysis related to the U.S. Army newspaper 'Neue Zeitung', published for the German population from 1945 to 1955. It examines the newspaper's role in cultural transmission, propaganda, and Cold War dynamics, focusing on editor biographies and conflicts with U.S. authorities. The data is sourced from a scholarly chapter in the edited volume 'Cold War Constructions'.
A historical study analyzing American cultural influence in East and West Germany during the early Cold War. The description indicates it is based on an array of sources including films, newspapers, sociological studies, and archival materials from Germany and the U.S. It examines how authorities used gender and racial norms to contain youth cultures and wage ideological battle.
This football analytics dataset, updated in March 2026 by dcaribou, contains structured data on professional players, clubs, and transfers extracted from Transfermarkt. It employs dbt for data transformation to provide analytics-ready tables for soccer-related research and modeling.
review-chekpoints--2026-05-18--13257-13257 is a dataset hosted on Kaggle. Its title suggests it likely contains evaluation data or metrics for machine learning model checkpoints. The specific content and structure require verification after download.
Survey data measuring maladaptive daydreaming, depression, anxiety, and stress levels within a study population. The dataset was contributed by Maryam Falih and is hosted on Harvard Dataverse. It was last updated in March 2026.
A dataset about movies, sourced from Kaggle. The specific contents, size, and creation details are unknown from the provided metadata.