Available on 1 platform
AudioX-IFcaps contains over 7 million audio samples with instruction-following captions, developed by HKUSTAudio for ICLR 2026. The dataset provides structured annotations for audio and music generation, focusing on sound event categories, counts, and temporal ordering.
The dataset is distributed in WebDataset format. Use is governed by the CC BY-NC-ND 4.0 license, which prohibits commercial use and the distribution of derivative works.
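Because WebDataset shards are ordinary tar archives in which files sharing a basename form one sample, a shard can be inspected with the Python standard library alone. The sketch below is a minimal illustration under assumptions, not the dataset's documented layout: the file extensions (`flac`, `json`), the `caption` field, and the helper names `split_key`, `iter_samples`, and `build_demo_shard` are all hypothetical, and the demo shard is built in memory rather than downloaded.

```python
import io
import json
import tarfile


def split_key(name):
    """Split 'dir/000001.caption.json' into ('dir/000001', 'caption.json')."""
    directory, _, base = name.rpartition("/")
    stem, _, ext = base.partition(".")
    key = f"{directory}/{stem}" if directory else stem
    return key, ext


def iter_samples(fileobj):
    """Yield one dict per sample, grouping consecutive tar members by key."""
    current_key, sample = None, {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, ext = split_key(member.name)
            if key != current_key and sample:
                yield sample
                sample = {}
            current_key = key
            sample.setdefault("__key__", key)
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample


def build_demo_shard():
    """Build a tiny in-memory shard; names and fields are assumptions."""
    captions = {
        "000001": "a door slams once",
        "000002": "two dog barks, then a car horn",
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, caption in captions.items():
            files = {
                "flac": b"\x00placeholder-audio-bytes",
                "json": json.dumps({"caption": caption}).encode("utf-8"),
            }
            for ext, payload in files.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for sample in iter_samples(build_demo_shard()):
        caption = json.loads(sample["json"])["caption"]
        print(sample["__key__"], len(sample["flac"]), caption)
```

To read a real shard, an opened file can be passed in place of the demo buffer, for example `iter_samples(open(path, "rb"))`; the actual extensions and annotation fields should be checked against the dataset's own documentation before relying on them.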