News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,035 datasets
Twitter_Data_30K is a dataset of social media posts sourced from the Twitter platform. The dataset likely contains 30,000 text entries, though the specific content, time range, and collection method are not detailed in the provided metadata. It was published on Kaggle, but the author, organization, and license information are unknown.
MN_DS News is a dataset published on Kaggle. Its title suggests it contains news-related content, likely text articles. The dataset's specific scope, size, and collection details are not provided in the available metadata.
Hotel reviews from a leading travel site, containing user-provided text and metadata. The dataset includes columns for a unique User_ID, the review Description, Browser_Used, Device_Used, and a target variable Is_Response. It is published under a CC0-1.0 license on the OpenML platform.
A dataset of car reviews from Edmunds. No information is available regarding its size, features, or creation details.
A collection of educational materials covering 15 core topics in sports and entertainment marketing. The content likely includes chapters on marketing-information management, promotional planning, legal issues, and career paths. Its origin and specific data format are not detailed in the provided metadata.
TikTok dataset published on Kaggle. The dataset's specific content, size, and creation details are unknown. Metadata is minimal; actual content requires verification after download.
ReviewBooks2018 is a dataset of book reviews, likely collected in 2018. The dataset is hosted on Kaggle, but its specific source and collection method are not detailed. The total number of reviews and the specific features included are unknown.
CMU Movie Summary is a dataset of movie plot summaries and associated metadata, likely sourced from Carnegie Mellon University. The dataset is published on the Kaggle platform. The specific number of records, columns, and creation date are unknown from the provided input.
Parametric Batik Creative Design Dataset contains data on cultural motifs and design attributes. It was published on Kaggle, though the specific author, organization, and data volume are not provided. The dataset's last update date is unknown.
A collection of traditional motifs and design patterns, published on Kaggle. The dataset likely contains visual assets for cultural and creative product design. Specific details on the number of items, source, and creation date are not provided in the available metadata.
A collection of newspaper articles from 1820-1890, compiled by John M. Coward of Rutgers University for analyzing representations of Native American identity. The dataset likely contains textual excerpts or articles from historical U.S. newspapers. Its specific size, format, and column structure are unknown.
Charles E. Clark authored this dataset, which is hosted on Papers with Code. It likely contains historical data related to newspapers in Anglo-American culture. The specific volume, format, and internal structure of the data are not detailed in the provided metadata.
Kaggle hosts a dataset titled 'review-chekpoints--2026-05-01--13240-13240'. The title suggests it may contain review text data, possibly organized by checkpoints. The dataset's author, organization, and specific content details are not provided in the available metadata.
PureDocBench v2 Reviewer Full is a dataset published on Kaggle. The dataset likely contains documents and associated reviewer assessments, suggesting a focus on document quality evaluation. Its specific content, size, and creation details require verification after download.
Fake news detection preprocessed dataset published on Kaggle. The dataset likely contains text data that has been prepared for machine learning tasks. Metadata is minimal; the specific source, size, and preprocessing steps require verification after download.
Amazon_review_sentiment likely contains text data from the Amazon marketplace for analyzing customer opinions. The dataset is hosted on Kaggle, a platform for data science competitions and projects. Specific details on volume, time range, and annotation methodology are not provided in the minimal metadata.
DatasetCrawlTiktok is a dataset sourced from the Kaggle platform. The title suggests it contains data crawled or collected from the social media platform TikTok. The dataset's specific content, size, and collection methodology are not detailed in the available metadata.
MedEvoEval is a dataset of anonymous reviews, likely related to medical evaluations, published on Kaggle. The dataset's specific content, size, and creation details are not provided in the available metadata. Further verification is required to confirm the exact nature and scope of the data.
ReviewToys2018 likely contains consumer-generated text reviews of toys from 2018, as its title suggests. The dataset is published on Kaggle; its specific size, author, and collection method are unknown.
Indian social media posts discussing depression, intended for sentiment and social footprint analysis. The dataset is hosted on Kaggle, but its author, size, and specific collection period are unknown. Its license and last update date are also unspecified.