EMOVA-Alignment-7M is a dataset curated for omni-modal pre-training, covering both vision-language and speech-language alignment. Emova-ollm built it from open-source image-text pre-training datasets, OCR datasets, and 2,000 hours of ASR and TTS data. The dataset page was last updated on 2025-03-14.
The license is unknown; verify the terms of use before using the dataset.