VibraVox contains between 10,000 and 100,000 French speech recordings captured with body-conduction transducers. Developed by Cnam-LMSSC and documented in arXiv:2407.11828, the dataset is a specialized audio-text corpus for speech processing research. It includes expert-generated and crowdsourced annotations for audio-centric machine learning tasks.
The dataset is provided in Parquet format. For technical specifications of the transducers used during collection, refer to the arXiv paper (2407.11828).
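Because the files are Parquet, a first look at a downloaded shard can be sketched roughly as follows. This is a minimal illustration, not the dataset's official loading procedure: the shard file name and the `text` column name are hypothetical, and it assumes pandas with a Parquet engine such as pyarrow is installed. Check the actual schema before relying on any column name.

```python
from pathlib import Path


def load_parquet_records(path):
    """Read one Parquet shard into a list of row dictionaries.

    pandas is imported lazily so the summary helper below can be used
    without it. Assumption: pandas and a Parquet engine are installed.
    """
    import pandas as pd

    return pd.read_parquet(Path(path)).to_dict(orient="records")


def summarize(records, text_column="text"):
    """Count all rows and the rows that carry a non-empty transcript.

    `text_column` is a hypothetical column name used for illustration.
    """
    rows_with_text = sum(
        1 for row in records if str(row.get(text_column) or "").strip()
    )
    return {"rows": len(records), "rows_with_text": rows_with_text}
```

With a local shard (hypothetical name), usage would look like `summarize(load_parquet_records("shard-00000.parquet"))`. Converting to row dictionaries is convenient for inspection but copies the data, so for full-size shards it may be preferable to work on the DataFrame directly.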