Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

11,013 datasets

TikTok Dataset from Kaggle

TikTok data published on Kaggle, a popular platform for data science and machine learning projects. The dataset's specific content, size, and collection method are not detailed in the available metadata. Users must download the data to verify its structure and potential for analysis.

Tabular, Social Media, Video Content, TikTok

AI Futures 2025: 3,706 Labeled AI Industry Forecasting Questions

LightningRodLabs compiled 3,706 forecasting questions on AI industry developments, model releases, and agentic deployments. The collection spans January 2025 to January 2026 and features binary outcomes verified through web search using a 'Future-as-Label' methodology.

Parquet, Size Categories: 1K<n<10K, Task Categories: text-generation, Library: polars, Language: en, Modality: text, Library: mlcroissant, Library: datasets, Library: pandas, Artificial Intelligence, Region: us, Forecasting, Future As Label, arXiv: 2601.06336, License: MIT, Prediction

BRCA METABRIC: Clinical and Gene Expression Z-Score Data

A dataset from Kaggle containing clinical and gene expression information for breast cancer patients. The title suggests it likely contains z-score normalized gene expression values alongside clinical variables. The specific source, time range, and collection method are not detailed in the provided metadata.

Tabular, Gene Expression, Oncology, Biomarkers, Breast Cancer, Clinical Data

E-Commerce Product Reviews Dataset

A collection of e-commerce product reviews, likely containing customer feedback text and associated metadata. The dataset is hosted on Kaggle, but its specific origin, size, and creation date are not detailed in the available metadata. Columns and sample data are unknown, limiting immediate assessment of its content and structure.

Text, Tabular, E-Commerce, Sentiment Analysis, Customer Feedback, Product Reviews

U.S. Voter News Discernment Surveys and Political Story Analysis

This dataset combines 11 monthly surveys with 15,000 total participants to investigate news discernment patterns in the U.S. It measures how informed voters are about political news, finding that 47% of subjects confidently choose a true story over a fake one, while 3% choose the fake. The analysis links discernment to socioeconomic differences and partisan congruence.

British Geological Survey Offshore Cultural Data Theme

Data from the British Geological Survey's GeoIndex Offshore cultural data theme, updated on 2026-03-19. The dataset provides marine geology and digital map information for the UK and other global areas via a free web service interface.

Geology, Nerc Ddc, Marine Geology, Digital Maps

Copthall Sports Hub Draft Masterplan Documents from Barnet Council

A collection of draft masterplan documents for the Copthall Sports Hub and Mill Hill Open Spaces in London. The documents include full and summary versions referenced in Environment Committee reports from March and September 2018. The consultation report contains embedded PDF responses from the public.

Movie Data Collection from Kaggle

A dataset titled 'Movies' published on the Kaggle platform. The dataset likely contains information related to films, but specific details such as columns, size, and origin are unknown. Users must inspect the actual content after download to verify its scope and utility.

Tabular, Movies, Film, Entertainment

Reddit Weight Loss Narratives Annotated with the COM-B Behavioral Framework

A large-scale corpus of weight loss barriers constructed from Reddit narratives. The data is annotated using the COM-B behavioral framework, which categorizes barriers related to capability, opportunity, and motivation. The dataset was sourced from Kaggle, but the author, organization, and specific size details are unknown.

Text, Weight Loss, Large Scale, Natural Language Processing, Behavioral Science, Reddit Narratives, COM-B Framework

Popular Movies Dataset from Kaggle

A dataset listing popular movies from around the world, sourced from Kaggle. The specific number of records, features, and time period covered are not detailed in the available metadata. Users should verify the actual content and scope after download.

Hatred Tweets During the MeToo Movement on Twitter with User Metrics

Hatred-on-Twitter-During-MeToo-Movement is a dataset of tweets from the MeToo movement, labeled for hatred and non-hatred content. The data includes tweet text, timestamps, and user engagement metrics such as retweet and favorite counts. It was sourced from OpenML under a CC-BY-NC-SA-4.0 license.

Text, Tabular, Twitter, MeToo Movement, Social Media, Social Movements, Text Classification, Hate Speech

Movie V7: Film Metadata and Attributes

Movie V7 is a dataset uploaded to Hugging Face by author vanduc11. The dataset's specific content and scale are unknown, but the title suggests it contains information related to films. It was last updated on April 1, 2026.

Tabular, Movies, Region: us, Media, Entertainment

OME-Zarr Scientific Visualization Datasets for Tool Development

A collection of scientific visualization datasets converted to a chunked, multi-scale OME-Zarr format and hosted on AWS S3. The project is provided by NumFOCUS under a CC-BY-4.0 license through the AWS Open Data Program. It aims to serve as a web-based resource for the scientific visualization community.

Image, Multimodal, Scientific Visualization, OME-Zarr, Zarr, Volumetric Imaging, Magnetic Resonance Imaging, Biology, Imaging, Life Sciences, Neuroimaging, Neuroscience, Computed Tomography

Fake Job Postings News Dataset

A dataset of job postings, likely containing both real and fraudulent listings. It was published on Kaggle, but the specific collection date, author, and data volume are unknown. The dataset's primary purpose appears to be for training models to identify deceptive employment advertisements.

Text, Tabular, Fake News, Text Classification, Fraud Detection, Job Postings

Wearable Art Therapy Depression Dataset

Multisensory mental health monitoring data, likely collected from wearable devices during art therapy sessions. Published on Kaggle, the dataset's specific size, collection date, and author are unknown. It appears to combine physiological signals with therapeutic activity data.

Multimodal, Mental Health, Art Therapy, Wearable Sensors, Healthcare, Multisensory Data

Twitter Tianxinkitten 2025.04.23: Image Collection Part 1

A collection of images uploaded to Twitter by user 'Tianxinkitten' on April 23, 2025. The dataset, authored by 'daaxila', was last updated on the Hugging Face platform in April 2026. The exact number of images and their content are not specified in the available metadata.

Image, IMAGEFOLDER, Twitter, Size Categories: n<1K, Library: mlcroissant, Social Media, User Content, Modality: image, Library: datasets, Region: us

Northern Hemisphere Sea-Level Pressure Grids from 1880 to 1979

Daily sea-level pressure data for the Northern Hemisphere from 1880 to 1979, provided on a 10-degree by 5-degree latitude/longitude grid. The dataset was produced by the organization SCIOPS and is hosted on the NASA Earthdata platform.

Time Series, Geospatial, Historical Weather, Gridded Data, Sea Level Pressure, Climate Data, Northern Hemisphere

Chukchi Sea Oceanographic CTD Measurements from September 1996

Measurements of temperature, salinity, conductivity, pressure, and transmissivity collected in September 1996 using a CTD instrument from the R/V Alpha Helix in the Chukchi Sea. The data is provided by NOAA NCEI under accession number 0061042. The dataset represents a snapshot of Arctic oceanographic conditions during a late summer cruise.

Tabular, Time Series, CTD Measurements, Oceanography, Salinity, Temperature, Chukchi Sea

spacespress_ADL_2026: Activities of Daily Living Data

spacespress_ADL_2026 is a dataset published on Kaggle. Its title suggests a focus on Activities of Daily Living, which are tasks related to personal care and routine. The dataset's specific content, size, and collection details are not provided in the available metadata.

Tabular, Activity Recognition, ADL, Sensor Data

IMDB Movie Reviews from 2019

Imdb 2019 Movie Reviews is a text dataset published on the Hugging Face platform by author gosaeng101. The dataset likely contains user reviews for movies from the IMDb database, focusing on the year 2019. The last recorded update to the dataset listing was on 2026-04-06 17:21:05.

Text, Sentiment Analysis, Text Classification, Movie Reviews, IMDb
