DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

📺

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

10,964 datasets

Trending Movies with Popularity Scores from TMDB

A curated collection of trending movies sourced from The Movie Database (TMDB). The description indicates the data includes popularity scores for movies, suggesting a focus on measuring public interest over time. The dataset's author, organization, and specific temporal coverage are unknown.

Tabular, Movie Trends, Popularity Scores, TMDB, Entertainment Data, +1

0 views

Media & Communication

TikTok App Reviews from 2026

Tiktok APP Review 2026 is a dataset published on Kaggle. The title suggests it contains user reviews for the TikTok application from the year 2026. The dataset's specific content, size, and origin require verification after download.

Text, App Reviews, Social Media, User Feedback, TikTok, +1

0 views

Media & Communication

Residential Roofing Integrated Photovoltaic Installation Time Observations in California

A collection of 21 field observations of residential roofing integrated photovoltaic installations in California, with 2 observations for re-roofing projects and 19 for new construction projects. It documents detailed time and motion data for installation activities collected between July 2021 and June 2022 by the Department of Energy.

Building Energy, Building Roof Area, Photovoltaic, Roof, Rooftop PV, Building Integrated PV, Buildings, Rooftop Solar, Roofing Integrated PV, RIPV, BIPV, +1

0 views

Media & Communication

Extended Shine-Dalgarno Motifs in Staphylococcus aureus

The source data (290,100,190 bytes) comprises unprocessed toeprinting films, raw luciferase measurements from reporter assays, polysome profiling data, and raw biofilm analysis data. The dataset supports research into how extended Shine-Dalgarno motifs govern translation initiation in the bacterium Staphylococcus aureus. All files are raw experimental outputs from multiple biochemical and genetic assays.

Image, Tabular, Biofilm Analysis, Staphylococcus Aureus, Translation Initiation, Toeprinting, Shine Dalgarno, +1

0 views

Media & Communication

Cosmos WTS Compress Prompt: Text Compression Dataset for AI Training

Cosmos WTS Compress Prompt is a dataset hosted on Kaggle. Its title suggests it contains text prompts related to compression tasks. The dataset's author, organization, and specific content details are unknown.

Text, Prompt Engineering, Text Compression, AI Training, Cosmos, +1

0 views

Media & Communication

Synthetic E-Commerce Data for Recommendation and Graph ML

Synthetic data designed for building and testing recommendation systems and graph machine learning models. The dataset is hosted on Kaggle, but the author, organization, and specific data volume are unknown. Its last update date and licensing information are also not provided.

Graph, E-Commerce, Recommendation Systems, Graph Machine Learning, Synthetic Data, Synthetic, +1

0 views

Media & Communication

U.S. Transportation System Review: A 1995 Snapshot of Trends and Performance

A 31-page report summarizing the U.S. transportation system for the Bureau of Transportation Statistics. The publication, 'Transportation in the United States: A Review,' provides a snapshot highlighting physical characteristics and trends in passenger travel and freight movement. It examines the economic performance, safety record, and environmental impact of the system, which served 260 million people and 6 million businesses at the time.

Tabular, Economic Performance, Freight Movement, Transportation Statistics, Finance, Large Scale, US Transportation, Safety Record, +1

0 views

Media & Communication

Synthetic Netflix-Style Content Catalog with 50,250 Records

50,250 synthetic records emulate a catalog of movies and TV series similar to Netflix. The dataset is hosted on Kaggle, but its author, license, and update history are not specified. Its synthetic nature suggests it was generated for modeling or analysis rather than sourced from a real service.

Tabular, Movies, Content Catalog, TV Series, Synthetic, Entertainment, +1

0 views

Media & Communication

MMDocIRT2ITRetrieval: Long Document Benchmark for Multimodal Retrieval

MMDocIRT2ITRetrieval is an evaluation dataset from the Massive Text Embedding Benchmark (MTEB). It contains 313 long documents averaging 65.1 pages, categorized into ten domains including research reports, academic papers, and government documents. The dataset features a multimodal distribution, with text comprising 60.4% of the content.

Multimodal, OPTIMIZED-PARQUET, Parquet, Size Categories: 10K<n<100K, Task Categories: image-text-to-text, Source Datasets: MMDocIR/MMDocIR_Evaluation_Dataset, Library: polars, Task Categories: image-to-text, Library: dask, Document Retrieval, Task Categories: text-to-image, Modality: text, Library: mlcroissant, Modality: image, Task Categories: visual-document-retrieval, Library: datasets, Benchmark, Long Documents, Computer Vision, Language: eng, Finance, Large Scale, arXiv:2501.08828, Multilinguality: monolingual, arXiv:2502.13595, License: apache-2.0, Annotations Creators: derived, +1

0 views

Media & Communication

Facebook Dataset from Kaggle

A dataset related to the Facebook platform, sourced from Kaggle. The specific content, size, and creation details are not provided in the available metadata. Users must download the dataset to inspect its actual structure and contents.

Tabular, User Data, Social Media, Facebook, +1

0 views

Media & Communication

VN-FB News: Vietnamese Social Media News Classification Dataset

ViSocialNews is a dataset for classifying Vietnamese social media news. The dataset likely contains text posts from social media platforms, annotated for news classification tasks. Its author, organization, and specific size are unknown.

Text, News Classification, Social Media, Text Data, Vietnamese Language, +1

0 views

Media & Communication

Amazon Fake Review Labels

Amazon Fake Review Labled is a dataset hosted on Kaggle. The title suggests it contains Amazon product reviews with labels indicating authenticity. The dataset's author, organization, and specific details are unknown.

Text, E-Commerce, Fake Review Detection, Text Classification, +1

0 views

Media & Communication

Code Reviews Dataset from Kaggle

Code_reviews is a dataset hosted on the Kaggle platform. The dataset's title suggests it contains records related to the software code review process. No further descriptive metadata, sample data, or column definitions are available for verification.

Text, Software Engineering, Developer Tools, Code Review, +1

0 views

Media & Communication

Code Review Data for Machine Learning Applications

Code review data likely contains records of software code changes and associated review comments. The dataset is hosted on Kaggle, but its specific size, origin, and creation date are unknown. Columns and sample data are unavailable for verification.

Text, Machine Learning, Software Engineering, Code Review, +1

0 views

Media & Communication

Atlantic Ocean CTD Profiles from the MEDDY Experiment

Temperature and salinity profiles from CTD casts in the Atlantic Ocean, collected aboard the R/V Oceanus from September 1984 to November 1985. This dataset was created by the University of Rhode Island's Graduate School of Oceanography for the Mediterranean Eddy (MEDDY) experiment. It represents a focused oceanographic campaign to study a specific mesoscale feature.

Tabular, Time Series, Oceanography, Atlantic Ocean, CTD Profiles, Mediterranean Eddy, +1

0 views

Media & Communication

Atlantic Ocean Bathythermograph Data from the 1986 WOCE Experiment

Four ships collected water depth and temperature profiles across the North and South Atlantic Ocean between August 14 and December 4, 1986. The dataset originates from the World Ocean Circulation Experiment (WOCE) and was submitted by Dr. Reiner Onken of the University of Kiel, Germany. Data is available in the NODC C125 Bathythermograph-XBT-Selected Depths file format.

Tabular, Time Series, Oceanography, Ocean Depth, Atlantic Ocean, Bathythermograph, +1

0 views

Media & Communication

Gulf of Mexico Oceanographic Data from SYNOP Project (1988-1990)

Over two years of pressure, temperature, and current velocity data were collected from the R/V ENDEAVOR and OCEANUS research vessels. The dataset was submitted by Thomas Shay of the University of North Carolina at Chapel Hill as part of the SYNoptic Ocean Prediction project. Measurements were taken via speed meter casts in the Gulf of Mexico.

Tabular, Time Series, Oceanography, Gulf of Mexico, Physical Oceanography, SYNOP Project, +1

0 views

Media & Communication

Atlantic Ocean XBT Temperature Profiles from 1988-1995

Atlantic Ocean water temperature and pressure profiles collected from 1988 to 1995 via the BSH Ship-of-Opportunity Programme. The dataset contributes to the World Ocean Circulation Experiment, with principal investigation led by Dr. Alexander Sy of the Bundesamt für Seeschiffahrt und Hydrographie. It represents a multi-year collection of expendable bathythermograph (XBT) data.

Time Series, Geospatial, Oceanography, Ship of Opportunity, XBT Profiles, Atlantic Ocean, +1

0 views

Media & Communication

Coastal California CTD Oceanographic Data from 1987 Cruise

CTD vertical cast measurements from the Northeast Pacific Ocean, collected off the California coast during R/V New Horizon cruise CaBS7. The dataset captures seawater pressure, temperature, and salinity from October 16 to 23, 1987. Dr. Barbara Hickey of the University of Washington led the collection for the Southern California Bight Basin Study.

Tabular, Time Series, Oceanography, Coastal Ocean, CTD Data, California Bight, +1

0 views

Media & Communication

Movie Dataset for Recommendation Systems and Analysis

A movie dataset intended for building recommendation systems and performing data analysis. It originates from the Kaggle platform, but details on its creator, size, and specific contents are unspecified. The last update date is unknown.

Tabular, Recommendation Systems, Movies, Entertainment, +1

0 views
