Name: Mobile-O Pre-Train: 9 Million Text-Image Pairs for Cross-Modal Alignment
Creator: Amshaker
Published: 2026-02-06T20:14:20
Keywords: Text-Image Pairs, Task Category: Image-to-Text, Library: webdataset, Size Category: 10M < n < 100M, Task Category: Text-to-Image, Modality: Text, Library: mlcroissant, Modality: Image, WebDataset, Library: datasets, Pretraining, On-Device AI, Computer Vision, arXiv: 2602.20161, License: CC BY-NC 4.0, Cross-Modal Alignment, Region: US, Large Scale, Multimodal, Mobile-O

Description

Amshaker's dataset provides 9 million text-image pairs for the first-stage pre-training of the Mobile-O multimodal model. The data is intended to align a diffusion decoder and conditioning projector with a frozen vision-language backbone. The dataset was last updated on Hugging Face in February 2026.

Use Cases

Pre-training vision-language models intended for on-device use, with cross-modal alignment as the first training stage.
Aligning a diffusion decoder with a frozen vision-language backbone using large-scale text-image pairs.
Training conditioning projectors for multimodal generation.

Strengths

Contains 9 million text-image pairs for large-scale training.
Designed for a specific, documented pre-training stage (Stage 1: Cross-Modal Alignment).

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
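The exact row count is not confirmed: the title cites 9 million pairs while the platform size tag reads 10M to 100M, so the count should be verified before assessing suitability.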
Description metadata is limited; actual data quality requires manual inspection after download.
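
Because field semantics must be inferred after download, one way to start is to list which file extensions appear for each sample in a shard. The following is a minimal sketch, assuming the shards follow the usual WebDataset layout (tar archives whose members are named `<key>.<ext>`, with the key ending at the first dot of the basename). The helper names `summarize_shard` and `make_demo_shard`, and the `jpg`/`txt` field names in the demo, are hypothetical and are not taken from the dataset's documentation.

```python
import io
import tarfile
from collections import defaultdict


def summarize_shard(fileobj, max_samples=3):
    """Group tar members by sample key and record each field's size in bytes."""
    samples = defaultdict(dict)
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumed WebDataset convention: "<key>.<ext>", split at the
            # first dot of the basename (so "a.seg.png" has ext "seg.png").
            dirname, _, basename = member.name.rpartition("/")
            stem, _, ext = basename.partition(".")
            key = f"{dirname}/{stem}" if dirname else stem
            # Stop once enough distinct samples have been seen.
            if key not in samples and len(samples) >= max_samples:
                break
            samples[key][ext] = member.size
    return dict(samples)


def make_demo_shard():
    """Build a tiny in-memory shard with hypothetical fields (jpg, txt)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, caption in [("000000", b"a red bicycle"), ("000001", b"two cats")]:
            for ext, payload in [("jpg", b"\xff\xd8fake-image-bytes"), ("txt", caption)]:
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


# Demo on synthetic data; replace with open("<local shard path>", "rb").
summary = summarize_shard(make_demo_shard())
for key, fields in summary.items():
    print(key, fields)
```

Running the same function on the first few samples of a real shard shows which extensions (and therefore which fields) are present before committing to a full download or a training pipeline.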

Provenance

Source: Hugging Face user Amshaker.
Collection Method: Not documented. The data was likely aggregated from multiple sources for pre-training, but this is an inference from the platform tags and is unconfirmed.
Time Range: Not specified.
Freshness: Last updated 2026-02-24 06:16:41; check the platform for later revisions.
Geography: Not specified.

License is listed as 'cc-by-nc-4.0' on the platform, indicating a Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license, which does not permit commercial use.
