DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

📺

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

11,035 datasets

Xiph.Org Test Media: Uncompressed Video for Compression Research

Xiph.Org Test Media is a collection of uncompressed video files hosted on AWS Open Data. The dataset is provided by Xiph.Org, a non-profit organization supporting open multimedia standards. It is intended for research on video compression and video processing algorithms.

Video, Imaging, Movies, Computer Vision, Media, Multimedia, Image Processing, +1

Replication Data for Conspiracy Thinking and Partisan Beliefs

This dataset contains replication data for the study 'Conspiracy Thinking and Belief in Partisan Conspiracy Theories: A Moderating Effect of Partisan Congruence?'. It was authored by Omer Yair and submitted for review. The dataset is hosted on Harvard Dataverse, was last updated on March 17, 2026, and is tagged within the Social Sciences domain.

Tabular, Conspiracy Theories, Social Sciences, Psychology, Political Science, Replication Data, +1

train_news: Russian News Articles for Topic Prediction Competition

A collection of news articles for the Russian-language competition 'Prediction of News Topics [AI 25/26]'. The dataset is hosted on Kaggle and appears to be designed for a supervised text classification task. The author, organization, and specific data volume are unknown.

Text, Machine Learning, News Classification, Competition, Text Data, +1

New York City Film Permits for Public Property Use

New York City film permits detail authorized exclusive use of public property such as sidewalks, streets, and parks. The dataset is provided by the City of New York's Mayor's Office of Media and Entertainment (MOME) and was last updated in March 2026. Row and column counts are not specified.

Permits, MOME, Film, +1

MovieLens 32M: Movie Ratings for Recommender Systems

MovieLens_32M is a dataset hosted on Kaggle, likely containing user ratings for movies. The title suggests it contains 32 million data points, which is a substantial scale for training models. Its specific contents, such as user and movie identifiers, require verification after download.

Tabular, Recommender Systems, Movie Ratings, Collaborative Filtering, +1

Top Rated Movies from TDBM, 1896-2026

A collection of top-rated movies, likely sourced from the TDBM platform. The raw description indicates the dataset spans films from 1896 to 2026. It is hosted on Kaggle, but other metadata such as author, license, and specific column details are not provided.

Tabular, Film Ratings, Movies, Entertainment, +1

Deep IoT Cultural Heritage Restoration Data: Chinese Artifact Digital Conservation Records

Deep IoT Cultural Heritage Restoration Data is a dataset from Kaggle concerning the digital conservation of cultural artifacts. The raw description indicates it contains records related to Chinese artifact digital conservation, but specific details on size, format, and structure are unavailable. The dataset's author, organization, and last update date are unknown.

Multimodal, Artifact Restoration, IoT, Cultural Heritage, Digital Conservation, +1

Netflix Movies Catalog

Kaggle hosts a dataset titled 'Netflix movies'. The dataset likely contains information about movies available on the Netflix streaming platform. Metadata is minimal; actual content requires verification after download.

Tabular, Movies, Media, Entertainment, +1

Agilex Cobot Magic Robot Teleoperation Data

This dataset features raw teleoperation data collected with the Agilex Cobot Magic robot, formatted for the Lerobot2.1 framework. It is part of the Great March 100 (GM-100) Project and is authored by rhos-ai.

Parquet, HDF5, Library: polars, Library: dask, Size Categories: 10M<n<100M, Modality: timeseries, Modality: tabular, Library: mlcroissant, Library: datasets, arXiv: 2601.11421, Imitation Learning, Modality: video, Region: us, License: mit, +1

Cross-Country Agricultural Wage and Education Disparities

This dataset documents wage and human capital differences between agriculture and other sectors across 13 countries, including Canada, the U.S., India, and Indonesia. It contains data on average wages, worker education levels, and Mincer returns to education. The data is used to derive implied barriers to labor reallocation out of agriculture.

PSID, +1

Cross-Country Analysis of Financial Development Determinants

This dataset supports a cross-sectional study investigating the determinants of financial development across countries. The analysis uses an instrumental variables approach to examine the roles of cultural values, institutional quality, and trade openness.

U.S. Railroad Bailouts and Economic Effects 1932-1939

This dataset analyzes over $1.1 billion in loans provided to 50 U.S. railroads by the Reconstruction Finance Corporation and Public Works Administration between 1932 and 1939. It examines the bailouts' effects on employment, wages, firm debt, bond default, and spillover benefits to nearby manufacturing firms.

U.S. Railroad Bailout Loans from 1932 to 1939

This dataset covers over $1.1 billion in loans provided by the Reconstruction Finance Corporation and Public Works Administration to 50 U.S. railroads between 1932 and 1939. It was created by Gertjan Verdickt to analyze the effects of these bailouts on employment, wages, firm debt, and bond default.

Instagram Political Ad Experiment Participants in 2020 U.S. Election Study

Meta Platforms, Inc. and academic researchers collected data on participant treatment assignment and engagement with civic content on Instagram during the 2020 U.S. election. The dataset focuses on a political ads holdout experiment, measuring exposure to social issues, elections, and political ads for control group participants.

Social Media, Elections, Political Attitudes, Political Behavior, +1

U.S. 2020 Facebook Political Ad Holdout Experiment Participants

U.S. 2020 Facebook and Instagram Election Study data contains information on participants in a political advertising holdout experiment. It includes treatment assignments, engagement with civic content, and for control group participants, exposure to social issues, elections, and political ads. The study was conducted by Meta Platforms, Inc. in partnership with academic researchers.

Social Media, Elections, Political Attitudes, Political Behavior, +1

Firm and Establishment Data from the Great Depression Manufacturing Census

This dataset comprises establishment-level data from the U.S. Census of Manufactures, used to study how multi-plant firms allocated resources in response to local economic shocks during the Great Depression. It was authored by Nicolas Ziebarth and last updated in February 2026.

Great Depression 1929, Businesses, Other, establishments, Manufacturing Industry, +1

Firm and Establishment Data from the Great Depression Manufacturing Census

This dataset is a collection of establishment-level data from the U.S. Census of Manufactures, used to study how multi-plant firms allocated resources during the Great Depression. It was created by Nicolas Ziebarth to analyze the geographic propagation of local economic shocks through firm networks. Row count, column count, and file size are not specified.

Great Depression 1929, Businesses, Other, establishments, Manufacturing Industry, +1

Telegraph Network Growth and Voter Turnout in America 1840-1852

This dataset features digitized data on the expansion of the electric telegraph network in America from 1840 to 1852, used to study its impact on national elections and news coverage. It was created by Tianyi Wang for research analyzing how telegraph access influenced voter turnout and newspaper content.

Information Technology, News Media, Elections, Economic History, +1

Telegraph Network Expansion and Voter Turnout in America 1840-1852

This dataset provides replication data for a study on the impact of the electric telegraph on national elections in America from 1840 to 1852. It contains newly digitized data on the telegraph network's growth, used in a difference-in-differences analysis to measure effects on voter turnout and newspaper content.

Election, Information Technology, News Media, Economic History, +1
