DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

10,956 datasets

Trending Movies Data from Kaggle

Trending movies data published on the Kaggle platform. The dataset's specific content, such as movie titles, ratings, or release dates, is not detailed in the available metadata. Its size, structure, and collection methodology are unknown.

Tabular, Trending, Movies, Entertainment, +1

CoderForge-Preview: Open Test-Verified Coding Agent Dataset

CoderForge-Preview is a test-verified coding agent dataset containing between 100,000 and 1,000,000 records, released by togethercomputer in February 2026. It provides trajectories for training software engineering agents and has demonstrated a performance increase from 23.0% to 59.4% pass@1 on the SWE-Bench Verified benchmark when used for fine-tuning.

OPTIMIZED-PARQUET, Parquet, Library: polars, Library: dask, Modality: text, Size Categories: 100K<n<1M, Library: mlcroissant, Library: datasets, Region: us, +1

Deep Ocean Electric Field and Conductivity Measurements from Atlantis II

Woods Hole Oceanographic Institution collected electric field, temperature, and conductivity data from the ATLANTIS II research vessel from November 14 to 21, 1971. Electric field was recorded every half-second, while temperature and conductivity were recorded every second using a freely-falling and rising rotating vehicle in the deep ocean. The data were processed using QDEM and PROFEQ computer programs to derive east and north components of water velocity and create an equally spaced pressure series.

Time Series, Oceanography, Marine Sensors, Physical Oceanography, Ocean Currents, +1

SUPERChem: A Multimodal Reasoning Benchmark in Chemistry

SUPERChem is a multimodal reasoning benchmark dataset for chemistry. The dataset was created by ZehuaZhao and was last updated on March 31.

Multimodal, AI Evaluation, Benchmark, Chemistry, Multimodal Reasoning, +1

Biological Reconnaissance of Bunger Hills from 1977 Expedition

A 1977 Australian National Antarctic Research Expedition visited the Bunger Hills on March 2nd to collect biological and geological samples. The dataset is a scanned report summarizing the findings and reviewing earlier scientific work in the region by other nations. It was created by R.J. Barker and sourced from the Australian Antarctic Data Centre.

Text, Biological samples, Expedition Report, Antarctic Research, Geological Samples, +1

Permit Review Time Data

CSV, XML, JSON, +1

Cinatomy: Movie Ratings Across 25+ Viewer Criteria

Kaggle hosts a dataset of 1,000 films scored across more than 25 criteria. The criteria are described as factors that viewers actually care about. The author, organization, and specific column definitions are not provided.

Tabular, Film Ratings, User Criteria, Movie Analysis, +1

ChatGPT and Gemini App Reviews in Indonesian for Sentiment Analysis

User reviews in Indonesian collected from the Google Play Store. The dataset is intended for sentiment analysis research and focuses on reviews for the ChatGPT and Gemini applications. The dataset was sourced from Kaggle, but the author, organization, and specific collection date are unknown.

Text, Gemini, App Reviews, ChatGPT, Sentiment Analysis, Indonesian Language, Natural Language Processing, +1

Real Estate Agent Industry Expenditure Percentages for Canada

Canadian data from Statistics Canada details industry expenses as a percentage of total operating costs for real estate agents and brokers (NAICS 53121). The dataset provides annual data covering two years.

GSE45827: Breast Cancer Gene Expression with 6 Subtypes, 151 Samples

54,676 gene expression measurements from 151 breast cancer tissue samples, curated into 6 cancer subtype classes. The data originates from the CuMiDa repository, which handpicked and preprocessed this dataset from the GEO database for machine learning benchmarking. CuMiDa's curation process involved steps such as sample quality control, background correction, and normalization to create a reliable source.

Tabular, Gene Expression, Oncology, Benchmark, Bioinformatics, Genomics, Breast cancer, Microarray, +1

GSE50161: Brain Cancer Gene Expression with 5 Subtypes, 130 Samples

A gene expression dataset for brain cancer containing 130 samples and measurements for 54,676 genes, curated into 5 classes. It originates from the CuMiDa repository, which provides handpicked and biologically preprocessed microarray datasets from the Gene Expression Omnibus (GEO) for machine learning. The dataset is referenced in computational biology publications from 2019.

Tabular, Cancer Research, Gene Expression, Brain Cancer, Benchmark, Bioinformatics, Medical Research, Microarray, +1

Town of Cary Development and Subdivision Plans with Interactive Map

Town of Cary, North Carolina provides a dataset of site and subdivision plans. The data includes projects under review, recently approved, or actively being constructed and is updated as needed. The dataset is associated with an interactive development map on the town's website.

Tabular, Geospatial, CSV, JSON, Urban Development, Sites, Development, Land Use, Construction, Site Plans, Subdivisions, +1

Cultural Intelligence Agents With Specialized Identities

3,154 autonomous cultural intelligence agents are provided, each representing a deployable specialist. The collection was created by author 'joannaslh' and was last updated in April 2026. Each agent contains structured identity files, vocabulary, and cultural reference lists.

Text, Graph, Audio, Autonomous Agents, Cultural Intelligence, Creative Commerce, +1

Autonomous Cultural Intelligence Agents With Specialized Identities

3,154 autonomous cultural intelligence agents are available, each designed as a deployable specialist. The collection was created by joannaslh and was last updated in April 2026. Each agent contains a defined identity, vocabulary, cultural references, source monitoring, territory, and LIGO commerce routing information.

Text, Audio, Digital Identity, Open Source Commerce, Autonomous Agents, Cultural Intelligence, +1

Nepal Tourism Reviews for Sentiment Analysis

Nepal Tourism Reviews is a dataset hosted on Kaggle. The dataset likely contains user-generated reviews related to tourism in Nepal. The specific content, size, and collection details are not provided in the available metadata.

Text, Tourism, Reviews, Sentiment Analysis, Nepal, +1

IMDB Movies List

IMDB_Movies_list is a dataset published on the Kaggle platform. The title suggests it contains a list of movies, likely sourced from the Internet Movie Database. The specific contents, scale, and creation details are not provided in the available metadata.

Tabular, Movies, Film Industry, Imdb, Entertainment, +1

Urdu Education and Cultural Reasoning Dataset

Urdu language data likely related to educational or cultural reasoning tasks. The dataset is published on Kaggle, but its specific contents, size, and creation details are not provided in the metadata. Users must download the dataset to verify its exact nature and scope.

Text, Cultural Reasoning, Urdu Language, Education, Natural Language Processing, +1

Mental Health Prediction Dataset for Anxiety, Depression, and Burnout

Mental Health Prediction Dataset is hosted on Kaggle. It is designed to predict anxiety, depression, and burnout from lifestyle factors. The dataset's author, organization, size, and last update date are unknown.

Tabular, Mental Health, Anxiety Prediction, Healthcare, Burnout Risk, Lifestyle Factors, +1

Water Corporation Drainage Pump Station Locations

Dataset from the Water Corporation detailing drainage pump station locations, last updated on 2026-03-24. It includes pressure adjustment points for water management infrastructure. Specific row counts and column details are unavailable.

Pump, Drainage, Water Corporation, Drain, +1

Water Corporation Pressure Adjustment Point Locations

Water Corporation pressure adjustment point data, published by Asset Registration. The dataset includes geographic features for pump stations and related infrastructure. It was last updated in March 2026.

Pump, Water, Water Corporation, +1
