News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,028 datasets
Personality-traits-of-Twitter-users-(celebrities) is a dataset from OpenML for finding similarities between public figures based on their Twitter activity. It likely contains personality scores for users across five traits, along with word counts and professional categories. The dataset is published under a CC0 1.0 license.
Google Play Store reviews for the Snapchat Android application. The dataset is intended for sentiment analysis to assess user adoption and perceptions of recent app improvements. It is published under a CC0 1.0 license on the OpenML platform.
Netflix movies data sourced via API for building recommendation systems. The data was cleaned, explored through exploratory data analysis, and visualized. It is licensed under CC0-1.0.
A dataset for classifying facial expressions, published on Kaggle. The specific number of images, collection method, and time range are unknown. The author and organization are also unspecified.
Karen S. Carvalho compiled critical appraisal checklists used to assess risk of bias in studies reviewed for a systematic review. The checklists are based on the tool recommended by The Joanna Briggs Institute for cross-sectional studies. The dataset's size, specific time range, and column details are not provided.
A crawl of Moltbook, a Reddit-style social media platform populated by AI agents built on the OpenClaw framework. The dataset captures the platform's early growth phase, providing a window into AI agent collective behavior. It was created by giordano-dm and last updated on February 9, 2026.
SportsCom is a large-scale, multi-attribute, multi-category, and multi-sport sports commentary dataset for sports video understanding. The dataset was created by Yumq123 and is hosted on Hugging Face; it was last updated on March 17, 2026. The code repository for the project is available on GitHub.
A merged collection of fake news datasets sourced from PubHealth, LIAR2, and MultiFC. The dataset is hosted on Kaggle, but its specific size, structure, and creation details are not provided in the available metadata. The content likely contains text articles or statements with associated veracity labels.
A historical text collection analyzing the mutual influence between German and American cultures from 1800 to 2000. The dataset was authored by Elliott Shore and is hosted on Papers with Code. It is structured into three parts covering German aspects of American history, American aspects of German history, and modern transatlantic relations.
Norman A. Graebner's work examines the task of diplomacy in achieving communication among nations with divergent cultures. This volume is the first in a trilogy focusing on American diplomacy from the post-Civil War era through World War II. The content likely contains analysis of diplomatic principles, historical events, and international relations.
A historical text analysis of British-American financial and purchasing relations from 1914 to 1918. The work, authored by Kathleen Burk, examines topics such as neutrality, munitions procurement, and Treasury missions. The dataset likely contains structured chapters detailing specific missions and financial crises.
A historical text by Frank Costigliola analyzing U.S. political, economic, and cultural relations with Europe from 1919 to 1933, after the First World War. The work explores themes such as peace treaty revisions, Western economic recovery, and modernization. It discusses American power through the lens of European fascination with U.S. technology, trade, and culture.
A dataset titled 'review-chekpoints--2026-05-04--13243-13243' was published on Kaggle. The title suggests it may contain data related to checkpoints for reviewing or evaluating models. Metadata is minimal; the actual content, scale, and purpose require verification after download.
PRESSURE - WATER data originates from the NOAA National Centers for Environmental Information under accession number 8600075. The dataset likely contains measurements of water pressure and related oceanographic variables. Its specific temporal and spatial coverage, column details, and volume are not described in the available metadata.
A dataset from the World Health Organization (WHO) containing scores related to compliance with bans on the appearance of tobacco brands in television and films. The specific metrics, geographic coverage, and time period are not detailed in the available metadata. The data likely provides quantitative measures for monitoring policy adherence in media.
Global data on the percentage of people living with HIV who are receiving treatment and have achieved suppressed viral loads. The dataset is published by the World Health Organization (WHO) via the GHO platform, focusing on a key indicator for monitoring HIV program effectiveness. The specific temporal coverage and geographic scope are not detailed in the available metadata.
WHO Global Health Observatory data on the percentage of people living with HIV who have suppressed viral loads. The dataset likely contains country-level or regional indicators for monitoring treatment success. It is published by the World Health Organization.
A dataset from the World Health Organization (WHO) containing scores measuring compliance with bans on the appearance of tobacco products in television and/or films. The data likely contains quantitative or categorical assessments of media content against specific regulatory standards. The exact number of records, time coverage, and specific scoring methodology are not detailed in the available metadata.
Retreatment cases for pulmonary tuberculosis where patients were smear and/or culture positive after defaulting on their initial treatment regimen. The dataset originates from the World Health Organization's Global Health Observatory, indicating a focus on global health monitoring. Specific details on the number of records, time period, and contributing countries are not provided in the available metadata.
Retreatment cases: treatment after failure (pulmonary smear and/or culture positive) is a dataset from the World Health Organization (WHO) Global Health Observatory (GHO). It likely contains records of tuberculosis patients requiring a second course of treatment after an initial failure. The number of rows and columns and the temporal coverage are not given in the provided metadata.