En Vdr Hn is a multimodal retrieval training set for fine-tuning visual-document embedding models on English document pages. The dataset, created by whybe-choi and last updated on 2026-04-26, pairs query text with page images; each row contains one positive page and seven mined hard negatives. The hard negatives were mined with the Qwen/Qwen3-VL-Embedding-8B model, separately within each source dataset.
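The description gives only the embedding model and the per-source scope of the mining, so the following is a minimal sketch of how such rows could be assembled, not the author's actual pipeline. It assumes cosine similarity as the ranking score, uses toy two-dimensional vectors in place of real Qwen/Qwen3-VL-Embedding-8B embeddings, and every field name (`source`, `page_id`, `positive_id`, `embedding`, `text`) is hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mine_hard_negatives(queries, pages, k=7):
    """Build rows of one positive plus the k most similar non-positive pages.

    Candidates are restricted to pages from the same source dataset as the
    query, mirroring the "within each source dataset" scope described above.
    """
    # Group pages by source so each query only sees its own source's pages.
    pages_by_source = defaultdict(list)
    for page in pages:
        pages_by_source[page["source"]].append(page)

    rows = []
    for query in queries:
        candidates = [
            page
            for page in pages_by_source[query["source"]]
            if page["page_id"] != query["positive_id"]
        ]
        # Highest similarity first: the hardest negatives look most like a match.
        candidates.sort(
            key=lambda page: cosine(query["embedding"], page["embedding"]),
            reverse=True,
        )
        rows.append(
            {
                "query": query["text"],
                "positive": query["positive_id"],
                "negatives": [page["page_id"] for page in candidates[:k]],
            }
        )
    return rows


# Toy data (hypothetical): four pages in source "a", one page in source "b".
toy_pages = [
    {"source": "a", "page_id": "p0", "embedding": [1.0, 0.0]},
    {"source": "a", "page_id": "p1", "embedding": [0.9, 0.1]},
    {"source": "a", "page_id": "p2", "embedding": [0.0, 1.0]},
    {"source": "a", "page_id": "p3", "embedding": [0.5, 0.5]},
    {"source": "b", "page_id": "p4", "embedding": [1.0, 0.0]},
]
toy_queries = [
    {"source": "a", "text": "example query", "positive_id": "p0", "embedding": [1.0, 0.0]},
]

toy_rows = mine_hard_negatives(toy_queries, toy_pages, k=2)
```

With `k=7` and real embeddings, each resulting row would have the shape described above: one query, one positive, and seven negatives. A real pipeline might also filter out candidates that score too close to the positive, since those can be unlabeled true matches; the source does not say whether such filtering was applied.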
The license is unknown; verify any usage restrictions before use.