News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,018 datasets
The official transcript of President Trump's speech from a joint press conference with Prime Minister Netanyahu on February 2, 2025. The collection includes verified audio recordings, official written records, and contextual annotations curated for academic analysis.
Twitter Prince12575 2026.02.28 2027576258188742902 Mhuxg7Taqs6Xdmmh Part1 is an image dataset published on Hugging Face by user daaxila. The platform tags indicate that it contains images, likely sourced from social media. The dataset was last updated on March 31, 2026.
A ranking of the 10,000 highest-rated films worldwide, based on user votes. The dataset appears to include information through the year 2026. It is hosted on Kaggle, but the original author and specific collection methodology are not detailed.
Replication data from a study on social network evolution, covering follower counts and timings for retweeted accounts and attention brokers. The dataset supports causal analysis of how content amplification drives triad transitivity in directed social networks. It contains hashed user identifiers for privacy protection.
Kaggle hosts this dataset titled 'review-chekpoints--2026-05-12--13251-13251'. The title suggests it relates to checkpoints, likely for model review or evaluation in machine learning. Its specific content, size, and origin are not detailed in the provided metadata.
Spanning 2006 to early 2022, this collection contains at least 350,000 tweets that reference or remix the iconic lines from T.S. Eliot's poem 'The Hollow Men'. Researcher Melanie Walsh compiled the dataset to study the political usage of the phrase, particularly in discussions of democracy and culture wars, as it evolved into a linguistic snowclone.
A dataset from Kaggle likely containing information for classifying movies by genre. The specific number of records, features, and creation details are unknown. Metadata is minimal; the actual content and structure require verification after download.
Culture_emotion_AI is a dataset hosted on Kaggle, focusing on the intersection of cultural context and emotional expression. Its specific content, size, and authorship are not detailed in the available metadata. Users must download the dataset to verify its exact composition and potential for training or evaluating AI models.
A dataset for building and evaluating recommendation systems, sourced from Kaggle. The specific content, such as user-item interactions or product metadata, must be verified after download. Metadata is minimal; the exact number of records, features, and data collection methodology are unknown.
A metadata-only dataset derived from GDELT Article List records. The data focuses on news coverage of a hypothetical US-Iran war during a four-month period in early 2026. The dataset's author, organization, and specific scale are unknown.
Between 10,000 and 100,000 Russian-language popular science articles from the N + 1 media outlet, curated by rustemgareev. Each record includes the article text metadata, topical categorization, and a publisher-assigned difficulty rating. The collection was last updated in March 2026.
10,971 French movie reviews from the Allociné corpus, annotated with fine-grained emotion labels by romainm4 in February 2026. The annotations were produced with Claude Haiku 4.5, which assigned seven primary emotion classes, nuanced sub-emotions, and confidence scores to each review.
A customer reviews dataset that likely contains textual feedback and ratings. It is hosted on Kaggle, but its specific content, size, and origin are unconfirmed. Metadata is minimal; the actual data requires verification after download.
Twitter Ptf4567 is a dataset of social media images uploaded by user 'daaxila' to the Hugging Face platform. The dataset appears to contain images, as suggested by platform tags, and was last updated on March 31, 2026. The specific content, volume, and collection method are not detailed in the available metadata.
Movie V36 is a dataset uploaded to Hugging Face by Yen0606. The dataset was last updated on April 8, 2026. Its specific content and scale are not detailed in the available metadata.
A dataset about movies, likely containing information on titles, genres, ratings, or other film-related attributes. It is hosted on the Kaggle platform. The specific origin, size, and collection date are unknown.
Test movie Data is a dataset hosted on the Kaggle platform. The dataset's specific contents, such as the number of rows, columns, and features, are not described in the available metadata. Its origin, creation date, and detailed scope require verification after download.
Kaggle hosts a dataset titled 'Review Tokopedia'. The data likely contains user-generated reviews from the Indonesian e-commerce platform Tokopedia. The author, organization, and specific data volume are unknown.
Helpfulness review data published on Kaggle. The dataset likely contains user-generated reviews or feedback with associated helpfulness ratings. Specific details on volume, authorship, and timeliness are unavailable from the provided metadata.
A historical-empirical inquiry into Japan's financial imperialism and 'yen diplomacy' from 1895 to 1937. The monograph analyzes monetary reforms and lending schemes in Taiwan, Korea, China, and Manchuria, focusing on capital shortage paradoxes and policy-making polarities. Authored by Michael Schiltz, the project was funded by the European Research Council under grant agreement 240854.