Genomics & Bioinformatics Datasets | DataSalon

Genomics & Bioinformatics

DNA/RNA sequences, gene expression, protein structures, metagenomics, single-cell sequencing

23,538 datasets

Integrative Analysis of Adiponectin-Related Genes in Melanoma with Immune Subtypes

A genomic dataset integrating SNP and transcriptomic data from the IEU openGWAS and TCGA-SKCM cohorts to explore adiponectin-related genes in melanoma. The study identifies five prognostic SNPs, four molecular subtypes, and 702 differentially expressed genes enriched in immune pathways. The dataset was authored by Yong Yin and last updated in May 2026.

Tabular, Gene Expression, Melanoma Genomics, Healthcare, Immune Subtypes, Prognostic Biomarkers, Cancer Bioinformatics, +1

Public Assistance Caseloads and Expenditures: New York State Monthly Data

Public Assistance (PA) Caseloads and Expenditures: Beginning 2002 provides monthly listings of cases, recipients, adults, children, and expenditures for New York State's Family Assistance and Safety Net Assistance programs. The dataset likely contains detailed breakdowns of federally participating and non-participating components, as well as maintenance-of-effort distinctions. It is hosted by the State of New York and appears on multiple data platforms.

Tabular, Time Series, CSV, XML, JSON, Social Welfare, Government Expenditure, Public Assistance, Welfare, Social Services, +1

Ndengereko Vocabulary List with English and Swahili Translations

A 399.4 KB spreadsheet of Ndengereko vocabulary collected during fieldwork. Marie-Annick Moreau compiled this working list, which includes English and Swahili translations. Spellings are as transcribed by KST and require further verification, indicating the data is a preliminary research artifact.

Tabular, Translation, Language Documentation, Ndengereko, Vocabulary, Linguistics, +1

National Navy Community Support Events with Geographic and Temporal Details

Records of National Navy community support and development events across various regions. The dataset likely contains details on event types, descriptions, participating military units, and locations. It is provided by datos.gov.co and was last updated on 2026-05-18.

Tabular, Geospatial, CSV, XML, JSON, Geospatial Events, Public Service Records, Military Civil Engagement, Community Development, +1

Mpendu's Explanation of Joking and Work Songs from Fishing Communities

Mauridi Omari Mpendu explains the context and cultural specificity of work songs and joking songs sung by fishermen. The 30.6 KB PDF document, authored by Marie-Annick Moreau and last updated in June 2026, records this oral history. It highlights songs considered unsuitable for European audiences, focusing on their role in maintaining energy and humor.

Text, Audio, Oral Tradition, Oral History, Audio Archives, Ethnomusicology, Work Songs, CULTURAL STUDIES, +1

Joking and Work Songs from Fishing Communities, Audio Recordings

Audio recordings of joking songs and work songs from a fishing community, as explained by Mauridi Omari Mpendu. The dataset includes 8.9 MB of WAV files, created by Marie-Annick Moreau and last updated on June 3, 2026. The songs are described as unsuitable for a European context, which underscores their cultural specificity.

Audio, Oral Tradition, Oral History, Audio Archives, Ethnomusicology, Work Songs, Joking Songs, CULTURAL STUDIES, +1

Machine Learning Predictions for Single-Cell Protein Expression

624.8 KB of correlation values comparing machine learning predictions of protein expression across datasets. The data, published by Josephine Fisher under a CC-BY-4.0 license, was last updated on June 4, 2026. It serves as supplementary material for a study on using predictions as a proxy for single-cell protein measurements.

Tabular, CSV, Machine Learning, Protein Expression, Single Cell, Bioinformatics, Prediction Correlation, +1

Protein Expression Prediction Correlation Values for Four RNA-Based Methods

Additional file 2 contains Table S1, a dataset of prediction correlation values for four methods of approximating protein expression from RNA data. The dataset was authored by Josephine Fisher and last updated on June 4, 2026. It is a small CSV file of 11.7 KB, published under a CC-BY-4.0 license on figshare.

Tabular, CSV, Machine Learning, Protein Expression, Bioinformatics, Rna Correlation, +1

PoLoCo Supporting Dataset: Draft Genome Assembly and Pool-seq Files for Entomobrya nivalis

1.0 GB of supporting data for the PoLoCo workflow manuscript includes the final draft genome assembly of Entomobrya nivalis, validation outputs, and Pool-seq files for reference comparison. The dataset contains files in TXT, CSV, TSV, FA, PNG, and PDF formats, generated by author Mohammad Jamil Shuvo. It was last updated on 2026-04-14.

Text, Tabular, CSV, TSV, Genome Assembly, Computer Vision, SNP discovery, Pool-Seq, Bioinformatics Workflow, +1

Maternal Serum Ferritin and SGA Risk from Shanghai Cohort, 2018-2020

A longitudinal cohort study of 17,451 pregnant women delivering at Shanghai First Maternity and Infant Hospital between 2018 and 2020. It investigates the association between maternal serum ferritin levels measured at two gestational windows and the risk of small-for-gestational-age births. The dataset was authored by Nana Guo and is available under a CC-BY-4.0 license.

Tabular, Pregnancy, Maternal Health, Serum Ferritin, Healthcare, Small For Gestational Age, Clinical Study, +1

Morpho-Physiological Responses of Ligustrum Species to Drought and Salt Stress

Giulia Daniele's dataset documents a controlled experiment evaluating drought and salt stress responses in three Ligustrum species. The study involved 270 plants grown under controlled conditions from March 2024 to June 2024, subjected to varying NaCl concentrations and irrigation levels. It measures impacts on leaf area, digital biomass, chlorophyll content, and dry biomass of shoots and roots.

Tabular, Salt Tolerance, Urban Greening, Drought Response, Plant Stress, Ornamental Species, +1

IN2023_V07: Antarctic Circumpolar Current Acoustic Doppler Profiles, Nov-Dec 2023

Acoustic Doppler Current Profiler (ADCP) data from the RV Investigator voyage IN2023_V07, focusing on smaller scales of the Antarctic Circumpolar Current south of Tasmania. The voyage took place between November 15 and December 20, 2023, departing from and returning to Hobart. Data were collected using the University of Hawaii Data Acquisition System (UHDAS) and post-processed by CSIRO's National Collections and Marine Infrastructure Information and Data Centre.

Audio, Time Series, Geospatial, Oceanography, Current Profiles, Acoustic Doppler, Marine Research, Antarctic Circumpolar Current, +1

EU Demographic and External Imbalance Projections Through 2050

Replication Data for 'Demographic Persistence and Compositional Drift in EU External Imbalances Through 2050' by Brian Peters (2026). The dataset likely contains a processed panel for analyzing demographic trajectories, compositional drift, and pension system impacts on EU27 current account balances from 2024 to 2050. It uses Eurostat nonfinancial sectoral accounts and UN World Population Prospects demographic projections.

Tabular, Eu Economy, Sectoral Accounts, External Imbalances, Pension Systems, Demographics, +1

TCOF1 Gene Expression and Proteomics Data in Renal Cancer Cell Lines

Supplementary data from a study on the TCOF1 gene's role in renal cancer angiogenesis. The dataset includes qPCR primer sequences, statistical analysis of protein expression changes across 169 normal kidney samples and 219 ccRCC tumor samples, microarray results from 3 biological experiments, and proteomics results from 4 biological experiments. Data was deposited in NCBI GEO (GSE299580) and ProteomeXchange (PXD027601) by author Małgorzata Grzanka.

Tabular, Time Series, ZIP, Gene Expression, Renal Cancer, Microarray Analysis, Cell Culture, Proteomics, +1

Risk Transcription Factors for Breast and Prostate Cancer with Prioritized Variants

Qing Li's dataset lists risk transcription factors for breast and prostate cancer, derived from a transfer learning framework applied to the Enformer model. The dataset includes tissue-specific cis-regulatory activity scores for millions of single-nucleotide variants prioritized from GWAS datasets. It was last updated on 2026-05-06 and is shared under a CC-BY-4.0 license.

Tabular, Excel, Cancer genomics, Healthcare, Regulatory Variants, Large Scale, Precision Medicine, +1

Circular RNAs in Brassica Napus Phloem Sap with 1,734 Identified circRNAs

Kim Lara Lühmann's dataset characterizes circular RNA content in the phloem sap of the crop plant Brassica napus. The analysis identified 1,734 distinct circRNAs, with ten validated by PCR and Sanger sequencing, and investigated full-length sequences and potential miRNA interactions. The data was last updated on 2026-04-29.

Tabular, Excel, Phloem Sap, Rna Sequencing, Plant Biology, Brassica Napus, Circular Rna, +1

Eye Movement and Spatial Localization Data for Individual and Ensemble Visual Tasks

Experimental data from a study on individual and ensemble visual perception in naturalistic contexts. The dataset, authored by Yanina E. Tena Garcia and last updated in May 2026, contains performance metrics and eye movement recordings from participants tasked with locating single objects or their average position under varying scene contexts and presentation times.

Tabular, Excel, Psychology, Eye Tracking, Visual Perception, +1

Eye Movement and Spatial Localization Data for Individual and Ensemble Object Perception

5.5 KB of experimental data from a study on how scene context and presentation time affect visual perception. The dataset, authored by Yanina E. Tena Garcia and last updated in May 2026, contains mouse-click and eye movement measurements from participants tasked with locating single objects or their average ensemble position. Results indicate that accuracy and saccade patterns differ significantly between naturalistic and texturized backgrounds.

Tabular, Excel, Psychology, Eye Tracking, Visual Perception, +1

Analysis-Ready Clinical Data for Platelet Decline and Cox Regression

This 133.0 KB file, "Excel version of SPSS data for Tables 1–4.xls", was prepared by Jianchun Wei for replicating statistical analyses from a published PLOS ONE study. It contains analysis-ready data formatted for SPSS to generate descriptive statistics and Cox regression models. It was last updated on 2026-05-06.

Tabular, Excel, Platelet Analysis, Spss Ready, Cox Regression, Clinical Data, +1

HTLV-1 Antigenicity Data for mRNA Vaccine Design

5.5 KB of data on the antigenicity of Human T-lymphotropic virus type 1 (HTLV-1) proteins, used in an AI-driven reverse vaccinology study. The dataset, authored by Nadia Seifi and last updated in May 2026, contains results from immunoinformatics tools analyzing the two most antigenic viral proteins to identify T- and B-cell epitopes for a candidate mRNA vaccine.

Tabular, Excel, Retrovirus, Htlv 1, Vaccine Design, Immunoinformatics, Antigenicity, Synthetic, +1
