Sign in to view source links and access this dataset
Description
BioMed-VITAL Instructions is a dataset for tuning multimodal AI models on biomedical visual tasks with clinician preference alignment. It contains multiple files ranging from 60,000 to 210,000 instruction samples, with file sizes from 127 MB to 463 MB. The dataset was created by authors including Hejie Cui, Lingjun Mao, and Carl Yang, and was last updated on August 17, 2024.
Use Cases
Fine-tuning large language models for biomedical visual question answering based on instruction-response pairs.
Aligning model outputs with clinician preferences for medical image interpretation tasks.
Training multimodal AI assistants for clinical decision support using visual instructions.
Strengths
Dataset includes multiple scales, with sample sizes up to 210,000 instruction pairs.
Explicitly designed for clinician preference alignment, a specific and relevant goal for medical AI.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown for individual files, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.