DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

📺

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

10,957 datasets

Fact-Checking Dataset for African Languages

AfrIFact is a dataset for cultural information retrieval, evidence extraction, and fact-checking in African languages. It was created by Masakhane and last updated on April 2, 2026. The dataset is designed to assess the veracity of online claims, particularly those concerning healthcare and culture in low-resource linguistic contexts.

Text, Parquet, Size Categories: 1K<n<10K, Library: polars, Modality: text, Library: mlcroissant, Fact Checking, Library: datasets, Library: pandas, License: CC BY 4.0, Region: US, African Languages, Multilingual Text, Information Retrieval, arXiv:2604.00706

PMAG Database: Peer-Reviewed Paleomagnetic and Geomagnetic Data

The PaleoMagnetic Archival Group (PMAG) Database contains peer-reviewed paleomagnetic, rock, and geomagnetic data from the nasa_earthdata platform. It includes measurements from magnetometers and model calculations. The database is a prototype and is still under construction.

Tabular, Geospatial, Earth sciences, Paleomagnetism, Geomagnetic Data

TMDB Movie Metadata with Ratings and Popularity, 1957-2026

TMDB Movie Metadata includes ratings, popularity scores, and release dates for films from 1957 to 2026. The dataset is sourced from The Movie Database (TMDB) and is hosted on Kaggle. It is intended for analysis of film industry trends over a 70-year period.

Tabular, Ratings, Movie Metadata, Popularity, Release Date, Movies, TMDB

TMDB Movie Metadata with Ratings and Popularity, 2000-2026

TMDB movie metadata includes ratings, popularity scores, and release dates. The dataset covers films released between 2000 and 2026, sourced from The Movie Database platform. Its author, organization, and specific size are unknown.

Tabular, Ratings, Release Dates, Popularity, Movies, TMDB

VVC Encoding QP 32: All Intra Mode Video Compression Data

Video compression data generated using the VVC (Versatile Video Coding) standard's VTM 24.0 reference software. The data was created using All Intra mode at a Quantization Parameter (QP) of 32. The dataset is intended for research in video compression algorithms and codec development.

Video, All Intra, VVC, Encoding, Video Compression

Brazilian Portuguese LGBTQIA+ Hate Speech Comments from Podcast Social Media

A dataset for detecting hate speech against LGBTQIA+ people in Brazilian Portuguese. It contains comments collected from three social media platforms related to the 'Entre Amigues' podcast. The dataset was created by Veronyka and was last updated on March 23, 2026.

Text, Audio, Hate Speech Detection, Social Media Comments, Portuguese Language, LGBTQIA+, Brazilian Portuguese

Amazon Product Reviews with Sentiment Polarity, 35 Million Records

Amazon reviews data from the Stanford Network Analysis Project (SNAP) includes 34,686,770 reviews from 6,643,669 users on 2,441,053 products spanning 18 years up to March 2013. The provided subset contains 1,800,000 training and 200,000 testing samples labeled with polarity. Authors Xiang Zhang, Junbo Zhao, and Yann LeCun published related research in 2015.

Text, Tabular, E-Commerce, Sentiment Analysis, Text Mining, Large Scale, Natural Language Processing, Product Reviews

THUCNews: Chinese News Articles for Text Classification

THUCNews is a text classification dataset published on Kaggle. The title suggests it likely contains Chinese news articles categorized for machine learning tasks. The dataset's author, organization, and specific details are not provided in the available metadata.

Text, News Articles, Text Classification, Chinese Language, Natural Language Processing

BBC News Articles Scraped from the Web

BBC News content collected via web scraping and published on Kaggle. The dataset likely contains news articles and headlines, though the specific volume, time period, and exact content are unconfirmed from the provided metadata.

Text, Web Scraping, News Articles, Media Content, Text Data

IMDb Top 250 Movies Dataset

A list of 250 top-rated movies from the Internet Movie Database (IMDb). The dataset is published on Kaggle, though its specific creation date and update frequency are unknown. It likely contains information such as titles, ratings, and votes for each film.

Tabular, Ratings, Movies, Top 250, IMDb

Samsung Doc Classifier: Movie Posters and Magazine Covers in Phone Scenes

Movie posters and magazine covers composited into realistic phone-like scenes. The dataset appears designed for computer vision tasks involving document classification within a synthetic environment. Its author, organization, and specific scale are unknown.

Image, Movie Posters, Computer Vision, Magazine Covers, Synthetic Data

WFS INSPIRE BPL Recompression Stumpenhof: Urban Development Plan for Plochingen

A WFS service provides the urban development plan 'Nachverdichte Stumpenhof - Nördlicher Teil' for the city of Plochingen, Germany. The data is transformed according to the INSPIRE directive and is based on an XPlanung dataset in version 5.4. The dataset is maintained by the Bundesamt für Kartographie und Geodäsie and was last updated on March 30, 2026.

Geospatial, 🇩🇪 Germany, INSPIRE, Urban Planning, XPlanung

Movies Dataset from Kaggle

Kaggle hosts a dataset titled 'movies'. The dataset's specific content, size, and origin are not detailed in the provided metadata. Metadata is minimal; actual content requires verification after download.

Tabular, Movies, Media, Entertainment

Movies Dataset from Kaggle

MOVIES is a dataset hosted on the Kaggle platform. Its specific content, size, and origin are not detailed in the provided metadata. The dataset likely contains information related to films, which could include titles, genres, ratings, or cast details.

Tabular, Movies, Film, Entertainment

Latin American LLM Benchmark For Factual And Cultural Knowledge

Trueque is a human-reviewed benchmark dataset for evaluating large language models on Latin American knowledge and cultural appropriateness. It is an initial beta release (version 0.1) created by latam-gpt. The dataset was last updated on April 1, 2026.

Text, CSV, Task Categories: text-generation, Library: polars, Task Categories: question-answering, Cultural Accuracy, Size Categories: n<1K, Modality: text, Library: mlcroissant, Evaluation, Library: datasets, Benchmark, Latin America, Library: pandas, LLM Evaluation, arXiv:2410.02677, Factual Knowledge, arXiv:2406.09948, Region: US, Language: es, arXiv:2511.21140, License: Apache 2.0, arXiv:2502.20936

IPV Type-Two Polio Seroconversion Rates by Age and Dose Schedule, 1985-2022

Binomial regression model results for IPV-induced type-two polio seroconversion across different ages and dosing schedules. The model was fitted to a review of 19 seroconversion studies conducted between 1985 and 2022. The dataset was authored by Elizabeth J. Gray and published on figshare.

Tabular, Excel, 39 Week Schedule, Inactivated Polio Vaccine, 2 IPV Seroconversion, Despite Lower Immunogenicity, Seroconversion, Serotype 2 Poliomyelitis, Countries Using OPV, 39 Week Schedules, Specific Schedules Using, Estimating Population Immunity, Combined Published Data, Earlier Schedule May, Dose Schedule, 2 OPV Use, 2025 Among Children, Vaccine Schedule, Oral Poliovirus Vaccine, Immunogenicity, Inactivated Poliovirus Vaccine, Dose Introduction Dates, Polio

Multivariable Linear Regression Analysis of Intracompartmental Pressure Factors

A multivariable linear regression analysis investigating factors associated with intracompartmental pressure. The dataset, created by Heng Zhang and shared under a CC-BY-4.0 license, was last updated on March 25, 2026. It is stored in an XLS file with a size of 5.5 KB.

Tabular, Novel Morphometric Surrogate, Acute Compartment Syndrome, Invasive ICP Served, Compartment Syndrome, Orthopedic Trauma, Noninvasive ICP Estimation, Energy Fractures, Lack Clinical Practicality, Tertiary Trauma Center, High Compartment Pressure, Medical Research, Tfa 4, Clinical Regression, Significant Linear Correlation, Intracompartmental Pressure, Multivariable Linear Regression, Noninvasive Assessment, Tibial Plateau Fractures

Business Metrics by Reviewer Profile Sorted by Quality

A dataset profiling reviewer behavior on online retail platforms, created by Luisa Stracqualursi. It was last updated on March 25, 2026. The data is stored in an XLS file and is 5.5 KB in size.

Tabular, Online Retail Platforms, Business Metrics, Extreme Ratings, Two Complementary Indices, Potentially Associated, Practical Value, Rating Systems, Reviewer Polarity Index, Monitor Reviewer Behavior, Reviewer Behavior, World Context, Reviewer Extremeness Index, Profiling Reviewer Behavior, Historical Extreme Behaviors, Reviewer Onto, Nuanced Interpretation, Negative Extremes, Online Retail, Scalable Tool, Provide Deeper Insight, Quality Analysis

Reviewer Profiling Framework for Online Retail Platforms

A framework for profiling reviewer behavior on online retail platforms, comparing approaches to balance scalability and interpretability. The dataset was authored by Luisa Stracqualursi and last updated on March 25, 2026. It is a 5.5 KB XLS file shared under a CC-BY-4.0 license.

Tabular, Behavior Analysis, Reviewer Profiling, Online Retail Platforms, Extreme Ratings, Two Complementary Indices, Potentially Associated, Practical Value, Rating Systems, Reviewer Polarity Index, Monitor Reviewer Behavior, Reviewer Behavior, World Context, Reviewer Extremeness Index, Profiling Reviewer Behavior, Historical Extreme Behaviors, Reviewer Onto, Nuanced Interpretation, Negative Extremes, Online Retail, Scalable Tool, Provide Deeper Insight

Numbers in TAU (n = 46) and CFT (n = 45) conditions in each level of depression categor

Mary Hynes published a dataset on figshare detailing participant counts in a clinical trial for depression. The data includes numbers for TAU (n=46) and CFT (n=45) conditions across three time points: baseline, post-treatment, and three-month follow-up. The dataset is 5.5 KB in size and was last updated in March 2026.

Tabular, Randomized Controlled Trial, Clinically Significant Improvements, Social Comparison, Specifically Designed, Submissive Behaviour, Therapy Outcomes, Benchmark, Possible Prescription, Month Follow, 91 Participants Allocated, Often Accompanied, Submissive Behavior, Alleviate Psychological Distress, Three Time Points, Compassion Focused Therapy, Effective Psychological Intervention, Psychological Outcomes, Follow Up Study, Clinical Psychology, Based CFT Intervention, Depression
