News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation
11,001 datasets
Review checkpoints likely contain text data for analysis, such as user feedback or product evaluations. The dataset is hosted on Kaggle, a platform for data science and machine learning projects. Specific details about the data volume, source, and creation date are not provided in the available metadata.
Movie information sourced from the IMDb platform. The data was collected and merged in October 2025. The dataset is hosted on Kaggle, but specific details about its size, structure, and contents are not provided in the metadata.
E-Commerce Product & Web-Scraped Datasets is a collection of product feeds, pricing metrics, and consumer review data aggregated from multiple domains. The dataset appears to be sourced from web scraping activities, though the specific sources and collection methodology are not detailed. The number of records, file formats, and last update date are unknown.
A dataset of restaurant food reviews with sentiment labels, sourced from Kaggle. The dataset likely contains text reviews and corresponding sentiment classifications. Metadata such as row count, column details, and creation date are not provided in the available metadata.
TMDB Movies Dataset originates from Kaggle, a platform for data science competitions. Its content likely relates to movies, based on the title, but specific details like size, columns, and creation date are unconfirmed. Users should verify the actual data content after download.
Kaggle hosts a dataset titled 'nature and culture', though its specific contents are not detailed. The title suggests it may contain information linking environmental and societal factors. Metadata is minimal; actual data quality and scope require verification after download.
ProtoSSMK3vecMultiSeedFilmGated is a dataset published on Kaggle. The title suggests it may contain prototype or experimental data related to multi-seed or film-gated processes, likely for machine learning. The dataset's specific content, size, and origin are not detailed in the provided metadata.
A 2019 dataset from Politecnico di Milano includes news articles from major US news outlets and the associated Twitter sharing activity. It covers tweet content and user details and was designed to study news dissemination behaviors. The dataset was used for political stance classification and is described in the ICWSM'19 paper by Brena et al.
A systematic review of fibromyalgia research literature from 2020. The dataset, authored by Alessandra Alciati of Humanitas University, compiles all articles on fibromyalgia indexed in PubMed between 1 January 2020 and 31 December 2020. The review focuses on diagnostic, pathogenetic, and therapeutic aspects of the syndrome.
Ultraviolet images of Earth's aurora captured by the Dynamics Explorer 1 satellite's Spin-Scan Auroral Imager. The data includes quasi-logarithmically compressed intensity counts and spacecraft position, velocity, and spin axis vectors. NASA provides this data in CDF format via CDAWeb.
An AI-powered skincare reviews and recommendation dataset. The dataset's author, organization, and specific scale are unknown. It was sourced from the Kaggle platform, but its last update date is not provided.
A collection of movie reviews from critics, sourced from the Rotten Tomatoes platform. The dataset is hosted on Kaggle, but its specific size, date range, and collection methodology are not detailed in the provided metadata. The content likely contains textual reviews and associated critic ratings.
Kaggle hosts a dataset tracking digital behavior and mental health indicators. The data likely contains measures related to anxiety, depression, and dopamine addiction. The author, organization, and specific data volume are unknown.
Bangla language text data related to fake news detection. The dataset is published on Kaggle, but its size, creation date, and author are unknown. Metadata is minimal; actual content requires verification after download.
A dataset listing top-rated movies, sourced from the Kaggle platform. The specific number of movies, rating criteria, and data collection method are not detailed in the available metadata. The dataset's content and structure require verification after download.
'Kurdish movies datasetss' is a dataset hosted on Kaggle. The dataset likely contains information about films from Kurdish cinema. Metadata is minimal; actual content and scale require verification after download.
Replication code published in 2026 supports a study analyzing reproductive outcomes and son preference. The package contains Stata do-files for reproducing tables and figures from a Review of Development Economics article. It was authored by Khilola Dushamova and hosted by Harvard Dataverse.
The SESAME (Satellite Experiments Simultaneous with Antarctic Measurements) dataset contains key parameters from a ground-based experiment. It provides omni-directional intensities in decibels (dB) measured in two narrow passband filters centered on 1 kHz and 3 kHz. The data was produced by NASA and documented in a 1995 Space Science Reviews publication.
MovieLens data contains over 32 million movie ratings and tagging activities contributed by users since 1995. The dataset was uploaded to Kaggle in October 2023. The specific author, organization, and license details are not provided in the available metadata.
Greater London Authority ad hoc housing analysis results, including figures referenced in Mayoral press releases. The data is produced by the Greater London Authority and was last updated on March 25, 2026. The specific scope, time range, and granularity of the underlying analysis are not detailed in the provided metadata.