DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

📺

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

11,037 datasets

Replication Data for Quantitative Trade and Spatial Models with Measurement Error Analysis

Replication data for a forthcoming article in the Review of Economics and Statistics, authored by Bas Sanders. The dataset likely contains variables used to analyze measurement error and counterfactual scenarios in quantitative trade and spatial economic models. It was last updated on March 18, 2026.

Tabular, Counterfactuals, Social Sciences, Measurement Error, Economics, Trade Models, Spatial Models, +1 more

DopplerWild-Preview: Wildlife Audio Recordings

DopplerWild-Preview is a dataset hosted on Kaggle. The title suggests it likely contains audio recordings related to wildlife, possibly using Doppler-based sensing. The dataset's specific content, size, and origin are not detailed in the available metadata.

Audio, Bioacoustics, Doppler, Wildlife, +1 more

Restaurant Reviews for Sentiment Analysis

Restaurant_Reviews is a dataset hosted on Kaggle. The dataset likely contains textual feedback from customers, potentially with associated ratings or labels. Its specific size, origin, and update history are not detailed in the available metadata.

Text, Tabular, Restaurant Reviews, Sentiment Analysis, Customer Feedback, +1 more

Politician and Celebrity Tweets from India and the United States

Over 42,600 public figure accounts from India and the United States are represented in this collection of tweets. The dataset includes politicians, celebrities, news media, and influencers, compiled by author Anmol Panda. It provides text data for analyzing political and social discourse across two major democracies.

Twitter, Politicians, Social Media, Celebrities, +1 more

Fictional Couples Database from Film, TV, Anime, and Literature

A collection of over 600 romantic and platonic couples sourced from fictional media. The dataset includes characters from Film, Television, Anime, and Literature. Its specific author, license, and update history are not provided in the metadata.

Tabular, Media Analysis, Fictional Couples, Pop Culture, Relationship Dynamics, +1 more

Depression Dataset from Kaggle

A dataset titled 'depression_dataset' published on the Kaggle platform. The dataset's specific content, size, and origin are not detailed in the provided metadata. Its title suggests it contains information related to depression, likely for analysis or modeling purposes.

Tabular, Mental Health, Depression, +1 more

IMDb Multi-Movie Review Dataset with 114,000 User Ratings

Approximately 114,000 user reviews collected from over 150 movies on IMDb. Each movie's reviews are stored in a separate JSON file identified by its IMDb ID. The dataset was created by chaziee and last updated on 2026-01-30.

Text, Source Datasets: original, Language: en, User Generated Content, Size Categories: 100K<n<1M, License: CC BY-SA 4.0, Sentiment Analysis, Annotations Creators: crowdsourced, Text Classification, Movie Reviews, Region: us, Task Categories: text classification, Multilinguality: monolingual, Task IDs: sentiment classification, +1 more
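
The one-file-per-movie layout described above can be read with a few lines of Python. This is a minimal sketch, not the dataset author's loader: it assumes the JSON files sit in a single directory and that each file name (without the `.json` extension) is the IMDb ID. The function name `load_reviews_by_movie` and the directory name in the usage note are hypothetical, and no assumption is made about the fields inside each file.

```python
import json
from pathlib import Path


def load_reviews_by_movie(data_dir):
    """Map each IMDb ID (assumed to be the file name stem) to its parsed JSON.

    The internal structure of each file is not documented in the listing,
    so the parsed content is returned unchanged.
    """
    reviews = {}
    # Only *.json files are read; anything else in the directory is ignored.
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            reviews[path.stem] = json.load(handle)
    return reviews
```

Calling `load_reviews_by_movie("imdb_reviews")` on a hypothetical download directory would return a dictionary keyed by IMDb ID, which makes it straightforward to inspect one movie's reviews before committing to a schema.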

VQS-4k Random Sample: NeurIPS Review Data

VQS-4k Random Sample is a dataset posted on Kaggle. The title suggests it is a random sample of data related to NeurIPS conference reviews. The dataset's specific content, size, and structure require verification after download.

Tabular, NeurIPS, Machine Learning, Review, Research, Sample, +1 more

PureDocBench v2 Reviewer Sample: A Benchmark for Document Quality

PureDocBench v2 Reviewer Sample is a dataset published on Kaggle. The title suggests it is a sample from a benchmark designed for evaluating document quality, likely containing text data for assessment tasks. Metadata is minimal; actual content requires verification after download.

Text, Benchmarking, AI Assessment, Document Evaluation, +1 more

FakeNewsNet: A Collection of News Articles for Misinformation Analysis

A dataset named 'fakenewsnetPython' is hosted on Kaggle. Its title suggests it likely contains news articles or related metadata for the study of misinformation. The dataset's specific content, size, and origin require verification after download.

Text, Text Analysis, Fake News, Media Content, Python, +1 more

Calderdale Library Reviews with Visits and Costs 2014-2017

The libraries review data covers three fiscal years, 2014/15 to 2016/17, for each library in Calderdale. It includes metrics on visits, building costs, and running costs, compiled by the Calderdale Metropolitan Borough Council.

Libraries, Library, +1 more

CineGraph: Movies, TV Shows, and People from TMDB

A collection of entertainment industry data from The Movie Database (TMDB). The dataset includes over 22,000 movies, 16,000 TV shows, 58,000 people, and 25,000 reviews. The original author, organization, and license are unknown.

Tabular, Reviews, Movies, TV Shows, Entertainment, +1 more

Water Pressure and Ocean Data from New Horizon Ship, October 1988

Data collected by the NOAA ship New Horizon in October 1988 likely contains measurements of water pressure and other oceanographic properties. The dataset's columns suggest it is a time-series of in-situ observations. Metadata is minimal; actual content requires verification after download.

Tabular, Time Series, Oceanography, Pressure, Ship Based Data, Water Properties, +1 more

Movie Metadata for Data Cleaning and Exploration Practice

Dummy Movies Dataset For Practice is a collection of movie metadata intended for data cleaning and exploration practice. The dataset is hosted on Kaggle, but its author, organization, and specific creation details are unknown. The number of rows, file formats, and license information are also unspecified.

Tabular, Exploration, Movies, Data Cleaning, +1 more

SHU GUANG 06: Ocean Pressure Measurements from 1980

NOAA_NCEI provides pressure and water data collected from the vessel SHU GUANG 06 over a three-day period in July 1980. The column names suggest the dataset contains oceanographic measurements, potentially including depth or salinity readings. Its presence on NASA EarthData indicates it is part of a broader environmental data archive.

Tabular, Time Series, Oceanography, Ship Data, NOAA NCEI, Water Pressure, +1 more

MovieLens1M: Movie Metadata Including Genres, Cast, and Overviews

Metadata for the movies in MovieLens1M, including genres, cast, and overviews. The dataset is hosted on Kaggle, but the number of rows and columns and the file formats are not provided. The original author, organization, and last update date are unknown.

Tabular, Overviews, Movies, Cast, Genres, +1 more

Comma Video Compression PR Archive: Deduplicated Scored Pull Request Corpus

Kaggle hosts a deduplicated corpus of public pull requests related to video compression. The raw description indicates the data has been scored, suggesting it may contain metrics or labels for analysis. The dataset's origin, size, and specific content require verification after download.

Text, Code Review, Pull Requests, Natural Language Processing, Software Development, Video Compression, +1 more

Review Checkpoints: Model Evaluation Data from Kaggle

A dataset titled 'review-chekpoints--2026-04-30--13239-13239' was published on Kaggle. The title suggests it may contain evaluation data or checkpoints related to model reviews. No further metadata, such as column descriptions, sample data, or author information, is available.

Tabular, Machine Learning, Model Evaluation, Review Checkpoints, +1 more

News Recommendation Data with User Location and Engagement Signals

A dataset for news optimization containing user behavior, location signals, and engagement metrics. The dataset was sourced from Kaggle, but the author, organization, and specific collection details are unknown. The last update date and temporal coverage are also unspecified.

Tabular, Engagement Metrics, User Behavior, Location Signals, +1 more

Multi-Topic Twitter Posts for Engagement and Trend Analysis

Twitter Engagement Dataset is a collection of multi-topic Twitter posts intended for engagement and trend analysis. The dataset was sourced from Kaggle, but its author, size, and last update date are unknown.

Text, Twitter, Trend Analysis, Social Media, Engagement Analysis, +1 more
