DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

11,012 datasets

News and Entertainment Content Dataset

A dataset of news and entertainment content published on Hugging Face by author Sachin21112004. The dataset was last updated on 2026-04-05. The volume, source, and temporal coverage of the content are not detailed in the available metadata.

Text, News, Media Content, Entertainment, +1

Persona AF Elicitation: 450 Conversations Testing Alignment Faking in Gemma 3 27B-it

450 conversations designed to test whether persona framing gates the expression of alignment faking (AF) in the Gemma 3 27B-it language model. The dataset was created by author vincentoh and last updated on March 6, 2026. It includes 15 roles, 10 AF elicitation prompts, and 3 experimental conditions, with responses judged by Claude Opus.

Text, JSON, Safety, Library: polars, Alignment, Language: en, Size Categories: n<1K, Modality: text, Modality: tabular, Library: mlcroissant, Library: datasets, arXiv: 2601.10387, Library: pandas, Text Classification, Alignment Faking, LLM Safety, Region: us, Task Categories: text-classification, License: mit, Persona Elicitation, Elicitation, +1

Review Checkpoints: Text Data for Analysis

A dataset titled 'review-chekpoints--2026-05-20--13259-13259' was published on Kaggle. The title suggests it may contain review data, potentially for analysis or model training. The specific content, scale, and origin are unconfirmed due to minimal metadata.

Text, Review Analysis, Checkpoints, Text Data, +1

Ocean-Floor Volcanism Evidence from Wyalong, New South Wales

Geoscience Australia Research Newsletter 28 presents new evidence on ocean-floor volcanism in the Lachlan Fold Belt, focusing on the Wyalong area in New South Wales. The dataset consists of a scientific journal paper published by Geoscience Australia, available in PDF and HTML formats.

External Publication, Earth sciences, AU-NSW, AU-VIC, Scientific Journal Paper, Marine, Published External, +1

UK and Denmark Political Agenda-Setting Data with 6.6M Records

This dataset comprises over 6.6 million records including 5.5 million tweets, 750,000 news articles, and 419,000 parliamentary questions from the UK and Denmark. Collected by Daniel Sandvej Eriksen for the American Political Science Review, the data spans 2015 to 2022 to track how political parties initiate and elevate agendas. It provides a multi-channel view of political discourse across social media, mainstream news, and official government proceedings.

Social Sciences, +1

newstt: News Text Data Collection

newstt is a dataset hosted on Kaggle. The dataset's title suggests it contains news-related text data. No further descriptive metadata, column information, or sample data is available for verification.

Text, News, Media Content, Text Data, +1

News Articles in Prachalit Script from Newa Sources

A collection of news content written in the Prachalit script, which is used for the Nepal Bhasa (Newar) language. The dataset is hosted on Kaggle, but its specific source, size, and collection date are not detailed in the provided metadata. The content likely contains articles from Newa news outlets, though the exact scope and volume require verification after download.

Text, News Articles, Nepali Language, Prachalit Script, Text Corpus, +1

Trending Movies Data from Kaggle

Kaggle hosts a dataset titled 'Trending Movies'. The dataset likely contains information on films that are currently popular or gaining attention. Specific details on its contents, size, and origin are not provided in the available metadata.

Fake News Bangla Dataset KHR: Bengali Language News Articles

A Kaggle dataset titled 'Fake_news_bangla_dataset_KHR' likely contains text data in the Bengali (Bangla) language related to news articles. The dataset's content and structure are inferred from its title, as no detailed metadata is provided. Its author, size, and specific creation details are unknown.

Text, Media Analysis, Fake News, Text Classification, Bangla, +1

Tamil Movie Box Office Collection Data for Machine Learning

A dataset related to Tamil cinema box office collections. It is published on Kaggle and is intended for machine learning applications. The specific source, collection method, and temporal coverage are not detailed in the provided metadata.

Tabular, Box Office, Tamil Cinema, Entertainment Data, Movie Performance, +1

The Office TV Series Data from IMDb, 188 Episodes

Sourced from IMDb, this dataset contains 188 rows of information about the American mockumentary sitcom 'The Office'. The series depicts the everyday lives of office employees at the fictional Dunder Mifflin Paper Company in Scranton, Pennsylvania. The dataset is released under a CC0 1.0 license.

Tabular, Character Interaction, Dialogue, TV Series, Entertainment, +1

SSA Unified Measurement System (SUMS) Continuing Disability Review - Operation Data Store

Operational data from the Social Security Administration's Unified Measurement System (SUMS) concerning Continuing Disability Reviews (CDRs). The dataset stores information related to the process of reviewing individuals' eligibility for disability benefits. It was last updated on March 10, 2026.

Tabular, Social Security, Disability Benefits, CDR, Disability, Continuing Disability Review, +1

Global Ocean Surface CO2 Partial Pressure Measurements from 1968-2008

Approximately 4.5 million measurements of surface water partial pressure of CO2 collected over the global oceans between 1968 and 2008. The data, assembled by the Lamont-Doherty Earth Observatory (LDEO), includes open ocean and coastal measurements from equilibrator-CO2 analyzer systems and has undergone quality control. It is available as a numeric data package from the Carbon Dioxide Information Analysis Center (CDIAC).

Tabular, Time Series, Environmental science, Oceanography, Computer Science, Partial Pressure, Carbon dioxide, Surface Water, Database, Chemistry, Environmental Engineering, Large Scale, +1

Electronic Records Express: Management Information Volume Reporting

This Social Security Administration dataset stores information for reporting on the number of electronic records processed through the Electronic Records Express website and at each Front End Capture System. The dataset was last updated on March 10, 2026. It likely contains operational metrics for tracking digital intake volumes across different capture points.

Tabular, Management Information, Reporting, Electronic Records, +1

Disability Quality Review (DQR): Management Information for Case Reviews

The Social Security Administration's Disability Quality Review (DQR) dataset stores information about the review process associated with disability cases. The dataset was last updated on March 10, 2026. It is published on the Data.gov platform under an unspecified license.

Tabular, Social Security, Case, DQR, Review Process, Disability Quality Review, Administrative Process, Case Management, Disability, Quality Assurance, +1

Adespatial: Multivariate Multiscale Spatial Analysis Tools

Adespatial provides tools for the multiscale spatial analysis of multivariate data. The methods are based on a spatial weighting matrix and its eigenvector decomposition, known as Moran's Eigenvector Maps (MEM). The approach is described in the review by Stéphane Dray et al. (2012).

Tabular, Spatial Analysis, Machine Learning, Computer Science, Multivariate Statistics, Multivariate Analysis, Statistics, Multiscale Analysis, +1

Google Play Store App Reviews with Sentiment Labels

playstore-reviews-sentiment is a dataset from Kaggle. It likely contains user reviews for mobile applications from the Google Play Store, annotated with sentiment labels. The dataset's specific size, author, and collection period are not provided in the available metadata.

Text, Tabular, App Reviews, Sentiment Analysis, Natural Language Processing, Google Play Store, +1

Trending Movies Over Years

A dataset listing movies that gained popularity over time. It is hosted on Kaggle, but the specific time range, data source, and collection method are not provided in the metadata. The dataset's author, organization, and last update date are also unknown.

Tabular, Time Series, Movies, Trends, Entertainment, +1

Fake News Dataset 1 for Content Verification

A dataset concerning fake news, published on Kaggle. The specific content, size, and collection methodology are unknown. The dataset's author, organization, and last update date are not provided.

Text, Media Analysis, Fake News, Text Classification, +1

Entertainment Tax Growth Values for Sioux Falls, South Dakota

Authoritative entertainment tax growth values for Sioux Falls, South Dakota. The dataset is published by the City of Sioux Falls and was last updated on March 22, 2026. It is available in multiple formats including CSV and GeoJSON.

Tabular, Geospatial, ZIP, CSV, Sioux Falls, Entertainment Tax, Tax Revenue, Municipal Finance, United States, +1
