DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Media & Communication Datasets | DataSalon

Media & Communication

News corpora, social media analysis, movie/music metadata, sports data, cultural datasets, misinformation

10,908 datasets

Unihand Preview: A Subset of Pretraining Data for Being-H0.5

A subset of the pretraining data for the Being-H0.5 model, which focuses on scaling human-centric robot learning for cross-embodiment generalization. The dataset was created by the author group BeingBeyond and was last updated on the platform in April 2026. The full description and details are available on the original dataset page.

Tags: Text, Multimodal, WEBDATASET, Size Categories 10K<n<100K, Library webdataset, Cross Embodiment, Modality text, Us Region, Library mlcroissant, Robot Learning, Library datasets, Region us, Human Centric, Mlcroissant, Pretraining Data

Dz-Emotion: 6,000 Algerian Arabic Social Media Comments for Emotion Detection

6,000 manually annotated social media comments collected from YouTube, Facebook, and Instagram. The dataset, Dz-Emotion, is the first large-scale resource for emotion detection in the Algerian Arabic dialect (Darija), labeled according to Ekman's six basic emotions. It was created by Houdna-khilouf and last updated on HuggingFace in April 2026.

Tags: Text, Multilingual, Emotion Detection, Social Media Text, Social Media, Large Scale, Natural Language Processing, Multilingual Nlp, Algerian Arabic

MARS EXPRESS MRS: Radio Science Occultation Data from 2015

Mars Express Radio Science data collected during the extended mission phase from 2015-01-01 to 2016-12-31. It is an occultation measurement covering a specific observation window on 2015-08-30. The dataset originates from the National Aeronautics and Space Administration (NASA).

Tags: Time Series, Radio Science, Occultation Measurement, Space Missions, Planetary Science, Mars Express, Mars

Historical XBT Temperature Profiles from Three Vessels in the Bering Sea

Three vessels collected temperature-depth profiles in the Bering Sea and North Pacific Ocean from January 30 to March 3, 1982. The National Oceanographic Data Center processed the data into the standard C116/C118 format, which records temperature at non-uniform inflection points to define the profile curve. This dataset represents a specific historical snapshot of ocean conditions.

Tags: Tabular, Time Series, North Pacific, Oceanography, Bering Sea

Gulf of Mexico 1973: High-Resolution Oceanographic Profiles

From February 2 to March 10, 1973, oceanographic data were collected on R/V J.M. Gilliss and R/V C. Iselin cruises I7304 and S7303 in the Gulf of Mexico. The dataset contains high-resolution CTD/STD profiles processed to the NODC F022 standard, likely reporting temperature, salinity, density, and possibly dissolved oxygen at depth intervals as fine as 1 meter. Cruise information, position, date, time, and environmental conditions are reported for each station.

Tags: Time Series, Geospatial, Oceanography, Ctd Std, Continental Shelf, Gulf Of Mexico, Physical Oceanography

NCEI 0161868: Surface Ocean CO2 Measurements from M/V Equinox

NCEI Accession 0161868 contains surface underway measurements of carbon dioxide and related oceanographic variables. Data were collected aboard the SOOP M/V Equinox in the Caribbean Sea, Gulf of Mexico, and North Atlantic Ocean from January 2, 2017, to January 6, 2018. The dataset includes mole fractions of CO2 in air and seawater, sea surface temperature, salinity, and calculated fugacity values.

Tags: Tabular, Time Series, Surface Underway, Eqnx 20170414, Eqnx 20170515, Oceanography, Eqnx 20170604, Eqnx 20170526, Atlantic Ocean, Eqnx 20170611, Carbon dioxide, North Atlantic, Eqnx 20170313, Eqnx 20170120, Eqnx 20170403, Eqnx 20170210, Eqnx 20170505, Eqnx 20170424, Eqnx 20170109, Gulf Of Mexico, Eqnx 20170130, Eqnx 20170303, Eqnx 20170102, Caribbean Sea, Ocean Carbon, Eqnx 20170618, Eqnx 20170220, Eqnx 20170324

Fitzroy River Palaeochannel Bathymetry on the Southern Great Barrier Reef

This Geoscience Australia dataset reports the bathymetric expression of the Fitzroy River palaeochannel on the continental shelf of the southern Great Barrier Reef. The dataset, last updated on 2026-03-25, provides data on a major sediment transport pathway that, unlike the previously discovered Burdekin palaeochannel, is not buried. It offers insights into the response of major rivers to sea level change in a mixed siliciclastic-carbonate sedimentary province.

Tags: Geospatial, 🇦🇺 Australia, Sediment Transport, Great Barrier Reef, Palaeochannel, Bathymetry

Plasmodium falciparum Differentially Expressed Genes at Ring and Schizont Stages

Differentially expressed genes in Plasmodium falciparum 3D7 parasites at the ring and schizont stages. The dataset is a 27.7 KB Excel file authored by Jing Wu and last updated in April 2026.

Tags: Tabular, Excel, Plasmodium falciparum, Differential Expression, Malaria Research, Parasite Genomics

Differentially Expressed Genes in Plasmodium falciparum RALP1-Knockdown Parasites

RALP1-knockdown parasites show differential gene expression at the ring and schizont stages; RALP1 appears essential for schizont maturation and erythrocyte invasion. The dataset, authored by Jing Wu, is a 45.3 KB Excel file containing Table S3 from the related study. It was last updated on figshare in April 2026.

Tags: Tabular, Excel, Gene Expression, Parasite Biology, Differential Expression, Malaria Research

Rat Brainstem c-FOS Expression Data for Migraine Pathophysiology

A 14.4 KB dataset quantifying c-Fos expression in the trigeminal nucleus of rat brainstems to validate nociceptive activation in migraine-like states. The data was authored by Pelin Kocdor and last updated on April 9, 2026. It is shared under a CC-BY-4.0 license on figshare.

Tags: Tabular, Rat Brainstem, Migraine Research, C Fos Expression, Neuroscience

VdPBP1 Protein Interactions and Terbinafine Resistance Data

49.8 KB of tabular data in XLSX format from a study on the cytochrome b5-like protein VdPBP1 in Verticillium dahliae. The dataset supports findings on how VdPBP1 mediates electron transfer in ergosterol biosynthesis to confer resistance to the fungicide terbinafine. It was authored by Huan Li and last updated in March 2026.

Tags: Tabular, Excel, Verticillium Dahliae, Fungal Biology, Ergosterol Biosynthesis, Protein Interaction, Terbinafine Resistance

North Atlantic Ocean CO2 Surface Measurements from 29 Cruises, 2008-2009

29 cruise data sets collected from the SOOP M/V Nuka Arctica lines in the North Atlantic Ocean from 2008-01-08 to 2009-01-07. The data include measurements of mole fraction of CO2 in the equilibrator headspace, barometric pressure, sea surface temperature, and fugacity of CO2 in sea water. These data were collected by researchers from the University of Bergen, Bjerknes Centre for Climate Research, and the University of Gothenburg.

Tags: Tabular, Time Series, Surface Underway, 26na20080813, 26na20080108, Oceanography, Climate Research, 26na20080218, 26na20080801, 26na20080117, 26na20080522, Carbon dioxide, 26na20080702, North Atlantic, 26na20080501, 26na20080531, 26na20080229, 26na20080513, 26na20080724, 26na20080128, 26na20080612, 26na20080711, 26na20080903, 26na20080207, 26na20080913, 26na20080310, 26na20080823

High-Pressure Cerium Hydride X-ray Diffraction Maps for Phase Identification

A dataset of spatially resolved X-ray diffraction patterns from a high-pressure cerium hydride sample, produced using 4th generation synchrotron facilities. The data, uploaded by Lucas H. Francisco in March 2026, is 99.2 MB in size and is intended to challenge traditional analysis methods for identifying elusive crystal phases in colossal datasets.

Tags: Tabular, ZIP, Synthesis Inhomogeneities Among, Produce Colossal Datasets, Unsupervised Clustering Algorithms, Unsupervised Clustering, Informed Similarity Measure, Diffraction Spatial Map, Prominent Superconductor Class, Art Materials Techniques, Direct Human Inspection, X Ray Diffraction, Highly Challenging Even, High Pressure Phases, Reflections Unfeasibly Challenging, Crystal Structure, Analysis Approach Based, Curves Whose Features, Pressure Elusive Phases, Pressure Cerium Hydride, Map Resolution, Crystal Phases Within, Ray Mapping Enhanced, Structural Phases, Materials Science, Synchrotron Data, Material Distribution Across

High-Pressure Cerium Hydride X-ray Diffraction Maps for Phase Identification

9.4 MB of spatially resolved X-ray diffraction data from a high-pressure cerium hydride sample, produced using 4th generation synchrotron facilities. Lucas H. Francisco published this dataset on figshare in March 2026 to demonstrate a physics-informed clustering method for identifying elusive structural phases. The analysis framework is designed for colossal datasets where traditional methods and direct human inspection become unfeasible.

Rodent Behavioral Data from Photochemical Nanomotor Anxiety Study

Bin Chen's experimental data supporting findings on photochemical nanomotors reversing anxiety- and depressive-related behaviors in rodents. The 12.6 MB dataset is available in XLSX format and was last updated on April 9, 2026. Source data are provided with the associated paper.

Tags: Tabular, Excel, Rodent Behavior, Neuroscience, Experimental Data, Nanomotor Photochemistry

OpenVerification1: Large-Scale Dataset for LLM Output Verification

ReexpressAI created OpenVerification1, the first large-scale, open-source dataset for research on LLM output verification and uncertainty quantification. The dataset, last updated on 2026-04-25, is designed for binary classification of whether a model's response correctly addresses a given prompt or question.

Tags: Text, Binary Classification, Llm Verification, Large Scale, Instruction Following

PII Masking Health Data Preview

AI4Privacy's PII-Masking-2M European release provides a preview of 50 sample entries from a dataset for masking Personal Health Information. The full dataset contains 200,000 entries, focusing on European coverage. This preview was last updated in April 2026.

Tags: Text, European Data, Healthcare, Text Masking, Personal Health Information, Anonymization, Data Privacy

Database from the manuscript entitled "Psychological Uses of Artificial Intelligence in Ad

Database from a manuscript on psychological uses of AI in adolescence. The dataset likely contains survey responses for scale development and cross-cultural invariance testing. It was authored by GALINDO DOMINGUEZ and hosted on Harvard Dataverse, with a last update recorded on 2026-05-21.

Tags: Tabular, Scale Development, Psychology, Artificial Intelligence, Adolescence, Cross Cultural, Synthetic

European Work Information with Redacted Personal Identifiers

50 sample entries provide a preview of the PII-Masking-2M corpus, focusing on European work and HR information where personally identifiable information has been redacted. The full dataset contains 200,000 records, created by AI4Privacy and released in April 2026. This preview demonstrates the data structure and label distribution without exposing the original sensitive text.

Tags: Text, European Data, Privacy Preserving Ai, Natural Language Processing, Synthetic Data
