Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

10,995 datasets

DL_ohlcv_newssources: Financial Time Series with News Data

DL_ohlcv_newssources is a dataset from Kaggle. Its title suggests it combines financial market data, likely OHLCV (Open, High, Low, Close, Volume) metrics, with news sources. The dataset's actual content, scale, and origin require verification after download.

Tabular, Time Series, OHLCV, Financial Data, News Sources

Raw Forward Returns with OHLCV and News Source Data

Raw_fwd_return_DL_ohlcv_newssource combines financial market returns with price and news data. The dataset is hosted on Kaggle, but its specific origin, size, and creation date are not detailed in the available metadata. Columns likely contain forward-looking returns, open-high-low-close-volume (OHLCV) data, and indicators of news sources.

Tabular, Time Series, News Sentiment, OHLCV, Financial Markets

Historical TikTok Gaming Dataset

Historical TikTok Gaming Dataset is a dataset published on Kaggle. The title suggests it contains data related to gaming content on the TikTok platform over a past period. The dataset's specific content, size, and collection details are unknown from the provided metadata.

Tabular, Social Media, Gaming, Historical Data, TikTok

Monitoring TikTok Real-time Dataset

A dataset containing real-time information related to the TikTok platform. It was published on Kaggle, but the specific collection date and author are unknown. The dataset's content, scale, and specific variables require verification after download.

Tabular, Social Media, Real Time, Monitoring, TikTok

TikTok Search Gaming Dataset

TikTok Search Gaming Dataset is a collection of data related to search behavior on the TikTok platform, specifically within the gaming domain. The dataset is hosted on Kaggle, but its specific size, authorship, and update history are unknown. Columns likely contain information about search queries, user interactions, or content related to gaming topics.

Tabular, Search Behavior, Social Media, Gaming

News Publishing, Digital PR and SEO Backlink Data

Backlink data likely related to news publishing, digital public relations, and search engine optimization. The dataset is hosted on Kaggle, but its specific contents, size, and origin are not detailed in the provided metadata. The author, organization, and last update date are unknown.

Tabular, Digital PR, SEO Backlinks, News Publishing

TikTok and Instagram Engagement Prediction Data

Prediksi Engagement TikTok & Instagram is a dataset hosted on Kaggle. The dataset likely contains metrics for predicting user engagement on the TikTok and Instagram social media platforms. Its specific contents, size, and origin are not detailed in the available metadata.

Tabular, Social Media, Engagement Prediction, Instagram, TikTok

The Simpsons Season 1 Episode Dialogue in CPT JSON Format

The source text consists of 13 episodes from the first season of the animated series The Simpsons. The dataset is formatted in a native CPT JSON structure and was uploaded by SicariusSicariiStuff. The record was last updated on March 11, 2026.

Text, Season 1, Simpsons, Dialogue, TV Scripts, Entertainment

Southern Ocean CTD Data from Aurora Australis 1993 Cruise

CTD (conductivity, temperature, depth) data from 62 casts conducted during the Aurora Australis KROCK cruise from January to March 1993. The data, collected by the Australian Antarctic Data Centre (AU_AADC), supplements a krill and geology research program in the Prydz Bay region. Measurements include Pressure, Temperature, Salinity, and Sigma-T.

Tabular, Time Series, Oceanography, Marine Science, Southern Ocean, CTD Data

BBC Topic-Feature Dataset from News RSS Feeds

A dataset derived from BBC News RSS feeds, likely containing text features for topic modeling. The description mentions that non-negative matrix factorization (NMF) was applied to the data. The author, organization, and specific scale are unknown.

Text, Text Analysis, News, Topic Modeling, RSS Feeds

Fake News Dataset with Real and Dramatized Articles

Real news articles scraped from various sources are paired with dramatized versions labeled as fake news. The dataset's author, size, and specific sources are not detailed in the provided metadata. Its creation method suggests it is intended for binary classification tasks in media analysis.

Text, News Classification, Fake News, Media Content, Text Data

News Dataset from Kaggle

News Dataset is a text corpus hosted on Kaggle. The dataset's specific content, size, and collection methodology are not detailed in the available metadata. Its source, author, and temporal coverage are unknown.

Text, News, Media

CLV Sportsbook Closing-Line Teaser Odds and Probabilities

Closing-line odds and no-vig probabilities for sportsbook teasers are available for preview at an external site. The dataset appears to contain betting market data, likely from August 2026. Its specific structure and volume are not detailed in the provided description.

Tabular, Sportsbook, Sports Betting, Probability, Closing Line Odds

Helicopter Collision Investigation Report from Quebec 2021

The Transportation Safety Board of Canada provides an official investigation report for a 2021 helicopter accident. The report details a collision between a sling load and the tail rotor of an Airbus AS350 B2 operated by Héli-Express Inc. near Les Escoumins, Quebec, on May 11, 2021.

PS-Battles: 102,028 Images for Manipulation Detection

102,028 images are grouped into 11,142 subsets, each containing an original image and manipulated derivatives. The dataset was created by Silvan Heller of the University of Basel for research on media derivation and tampering detection. It was sourced from a large community of image manipulation enthusiasts.

Image, Computer Science, Pattern Recognition, Pattern Recognition Psychology, Computer Vision, Image Mathematics, Artificial Intelligence, Image Manipulation, Digital Forensics

Leosphere Windcube Lidar Data from Humboldt Buoy

Filtered and averaged lidar data from a buoy-mounted Leosphere Windcube 866 instrument, standardized into NetCDF format. The dataset is provided by Raghavendra Krishnamurthy of the Pacific Northwest National Laboratory. It includes parameters from various instruments on the buoy, with details on measurement frequency available in an attached data dictionary.

Time Series, Geospatial, Point Cloud, Geography, Wind Energy, Ocean Buoy

Mean Birds: Twitter Aggression and Bullying Detection with 1.6M Tweets

Despoina Chatzakou from Aristotle University of Thessaloniki presents a dataset for detecting bullying and aggressive behavior on Twitter. The corpus contains 1.6 million tweets posted over a 3-month period. The research proposes a methodology extracting text, user, and network-based attributes to distinguish bullies and aggressors from regular users.

Text, Tabular, Cyberbullying, Twitter, Behavioral Analysis, Psychology, Social Media, Aggression, Natural Language Processing, Social Psychology, Aggression Detection

Review Checkpoints: Model Evaluation Data from Kaggle

Review-checkpoints--2026-06-01--13271-13271 is a dataset hosted on Kaggle. The title suggests it likely contains evaluation metrics or saved states from a machine learning model training process. No further metadata, such as author, size, or columns, is provided.

Tabular, Machine Learning, Model Evaluation, Review Checkpoints

Movies Dataset from Kaggle

A movie dataset published on the Kaggle platform. Its specific content, size, and origin are unknown from the provided metadata.

Tabular, Movies, Film, Entertainment

Ross Sea CTD Measurements from the AnSlope Program (2004)

Oceanographic data collected in 2004 for the AnSlope program in the Ross Sea, aboard the research vessel Nathaniel B. Palmer. The dataset contains measurements including temperature, salinity, dissolved oxygen, and pressure, gathered to study dense water transfer and poleward flow. It is managed by NOAA NCEI and appears on multiple government data platforms.

Tabular, Time Series, CTD Measurements, Oceanography, Physical Oceanography, Ross Sea
