DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

11,013 datasets

LSIP: London Sector Skills Analysis for 2025–26 Refresh

Slide decks and analysis produced for the 2025–26 refresh of London's Local Skills Improvement Plan (LSIP). These resources summarise sector-level evidence shared at stakeholder events and reflect data available at the time. The Greater London Authority created these materials, with additional slide decks added in December 2025.

Tags: Text, Skills, Jobs, London, Sector Analysis, Employment, +1 more

Nepali News Articles with Category Labels

Over 2.76 million Nepali news articles scraped from Baahrakhari and other sources, with cleaned category labels. The dataset was created by spandyie and was last updated in February 2026. It is provided in Parquet format compressed with Snappy.

Tags: Text, Parquet, Media Analysis, License: other, Library: polars, News Classification, Library: dask, Size Categories: 1M<n<10M, Language: ne, Modality: text, News, Nepali Language, Library: mlcroissant, Library: datasets, Region: us, Large Scale, Task Categories: text-classification, Nepali, Text Corpus, +1 more

Dresden Accessibility Data for Leisure, Tourism, and Cultural Entrances

Dresden's Infoportal Accessibility provides detailed information on public facilities regarding barrier-free access, toilets, technical aids, and special services for people with disabilities. The data is available in three languages (German, Czech, and English) and is published via a WFS service. Most records were collected through an INTERREG-funded project focused on the Bohemian-Saxon border area.

Tags: Geospatial, Multilingual, Tourism, Accessibility, Public Facilities, +1 more

Manju Bhai e-Gadgets Sales: Product-Level Sales Metrics Review

Manju Bhai e-Gadgets Sales likely contains product-level sales metrics for electronic gadgets. The dataset appears to be a review of sales performance metrics, potentially from an e-commerce platform. Its author, organization, and specific temporal coverage are unknown.

Tags: Tabular, Product Performance, E-Commerce, Business Analytics, Sales Metrics, +1 more

Trending Movies According to Votes, Up to 2026

Trending movies according to votes, sourced from Kaggle. The dataset likely contains movie titles and associated vote counts or popularity metrics. Metadata is minimal; the specific columns, time range, and data collection method are unknown.

Great Smoky Mountains National Park Equestrian Facilities and Points of Interest

National Park Service authoritative data defines the location of physical and cultural features within Great Smoky Mountains National Park. The database holds Federally recognized names, geographic coordinates, and attributes like feature classification and historical information. These data are published by the Department of the Interior and were last updated in March 2026.

Tags: Geospatial, Public Lands, Great Smoky Mountains National Park, Points Of Interest, National Park, Facilities, Equestrian Facilities, GRSM, +1 more

Tweet Paraphrase Pairs for Semantic Similarity Benchmarking

This dataset contains paraphrase pairs of tweets for text-to-text semantic similarity classification. It is part of the Massive Text Embedding Benchmark (MTEB) and is intended for evaluating embedding models. The row count and column details are not specified in the available metadata.

Tags: Text, License: unknown, Size Categories: n<1K, Modality: text, Task Ids: semantic-similarity-classification, MTEB, Language: eng, Region: us, Task Categories: text-classification, Multilinguality: monolingual, arXiv: 2502.13595, Annotations Creators: derived, arXiv: 2210.07316, +1 more

Travel Services Share Of Service Exports By Country

World Bank data measures travel services as a percentage of total service exports for national economies. The indicator quantifies the economic contribution of nonresident and resident travel expenditures within the Balance of Payments framework. It is compiled by the World Bank's World Development Indicators team.

Tags: Tabular, Time Series, Trade, Services Exports, Tourism Economics, International Trade, Balance Of Payments, Economy Growth, +1 more

Insurance and Financial Services as a Share of Service Exports

Insurance and financial services data measures the share of these services in total service exports for national economies. The dataset is part of the World Bank's World Development Indicators, a collection of global development data. It covers transactions between residents and non-residents for insurance, financial intermediary, and auxiliary services.

Tags: Tabular, Time Series, Private Sector, Trade, Economic Indicators, Financial Services, International Trade, Finance, Balance Of Payments, Economy Growth, +1 more

Travel Services Share of Service Imports

This World Development Indicators metric measures travel services as a percentage of total service imports in national balance of payments accounts. It quantifies the economic weight of travel expenditures within a country's service import portfolio. The World Bank compiles this data for global economic monitoring.

Tags: Tabular, Time Series, Trade, Economic Indicators, Tourism Economics, International Trade, Balance Of Payments, Economy Growth, +1 more

Insurance and Financial Services as a Share of Total Service Imports

World Development Indicators data measures insurance and financial services imports as a percentage of total service imports for countries. The dataset quantifies the share of cross-border financial and insurance transactions within a nation's broader service import economy. It is compiled by the World Bank's World Development Indicators team.

Tags: Tabular, Time Series, Private Sector, Trade, Economic Indicators, Financial Services, International Trade, Finance, Economy Growth, +1 more

Khmer Language News Articles for OCR and NLP Research

A collection of Khmer-language news articles scraped from multiple online news websites for academic and research purposes related to Khmer OCR and natural language processing. It was created by Thareah and last updated in February 2026. The dataset consists of extracted textual content without images or structured metadata.

Tags: License: other, Modality: text, Region: us, +1 more

Top 30% Active Restaurant Review Whales HI 118291: Restaurant Market Data Sample

A free sample of restaurant market data from BeamStation, focusing on a subset of highly active users. The dataset likely contains information related to restaurant reviews and market performance. Its specific size, features, and collection date are not detailed in the provided metadata.

Tags: Tabular, Food Service, Business Intelligence, Restaurant Reviews, Market Data, +1 more

Top-Rated Movies from TMDB

The TMDB Top-Rated Movies dataset contains movie ratings. It is sourced from The Movie Database (TMDB) platform and hosted on Kaggle. Its last update date and size are unknown.

Tags: Tabular, Ratings, Movies, TMDB, +1 more

Movie Dataset from Kaggle

Movie Dataset is a collection of film-related data published on the Kaggle platform. The dataset's specific contents, such as titles, genres, ratings, or cast information, are not detailed in the available metadata. Its size, structure, and creation details are unknown and require verification after download.

Tags: Tabular, Movies, Film, Entertainment, +1 more

Fake Review Dataset Clean V2: Online Product Reviews

Kaggle hosts a dataset titled 'fake-review-dataset-clean-v2'. The dataset likely contains text data related to online product reviews, potentially with labels indicating authenticity. The author, organization, and specific collection details are not provided in the available metadata.

Tags: Text, Tabular, Online Reviews, Sentiment Analysis, Text Classification, Fake Reviews, +1 more

IMDb Movies and TV Shows Dataset, Over 1900 Titles

Kaggle hosts a dataset of over 1900 movies and TV shows sourced from IMDb. The specific collection date, author, and detailed column information are not provided in the available metadata. Its content likely includes titles and associated metadata typical of IMDb listings.

Tags: Tabular, Movies, TV Shows, IMDb, Entertainment, +1 more

TikTok Post Data

TikTok post data is a dataset sourced from the social media platform TikTok. It was published on Kaggle, but the author, organization, and specific collection details are unknown. The dataset's size, row count, and specific column structure are not provided in the available metadata.

Tags: Tabular, User Generated Content, Social Media, Video Content, TikTok, +1 more

Indonesian Film Reviews

Ulasan Film Indonesia is a dataset of Indonesian-language film reviews published on HuggingFace by Faisaljabir. The dataset's content likely contains user-generated text about movies. Its last update was recorded on 2026-04-05.

Tags: Text, Film Reviews, Sentiment Analysis, +1 more

Fss Written Artifacts: Synthetic Binary Blobs for Write Latency Testing

Synthetic binary blobs for measuring sequential write throughput latency, released by micmicmicmicmicchan in March 2026. The data is formatted in Parquet without Snappy compression to facilitate infrastructure verification and performance benchmarking.

Tags: License: openrail, Region: us, +1 more
