Asr Dummy: Sample Dataset for SUPERB Speech Benchmark

Name: Asr Dummy: Sample Dataset for SUPERB Speech Benchmark
Creator: Narsil
Published: 2022-03-02T23:29:22
Keywords: Regionus

by Narsil, updated 1y ago

Available on 1 platform

Description

This placeholder dataset contains a small collection of .flac audio files prepared for the Speech processing Universal PERformance Benchmark (SUPERB). It exposes a file column holding the local path to each recording, which supports the development of speech processing pipelines and the extraction of self-supervised learning (SSL) representations.

Use Cases

Develop preprocessing scripts to transform the file column into a speech array for model training.
Verify the compatibility of lightweight prediction heads with frozen SSL representations.
Debug audio loading and mapping functions within the SUPERB benchmark framework.
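
The first use case, turning the file column into a speech array, can be sketched as a small mapping function. This is a minimal illustration, not the benchmark's own preprocessing code: the function name map_to_array, the output keys speech and sampling_rate, and the optional reader parameter are all assumptions made for this example. Only the file column and the use of the soundfile library come from the description.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# An audio reader takes a path and returns (samples, sample_rate).
AudioReader = Callable[[str], Tuple[Any, int]]


def _soundfile_reader(path: str) -> Tuple[Any, int]:
    """Decode an audio file to float32 samples with the soundfile library."""
    import soundfile as sf  # imported lazily so this module loads without it

    return sf.read(path, dtype="float32")


def map_to_array(
    example: Dict[str, Any], reader: Optional[AudioReader] = None
) -> Dict[str, Any]:
    """Return a copy of `example` with the decoded audio added.

    `speech` and `sampling_rate` are hypothetical output keys chosen
    for this sketch; `example["file"]` is the dataset's file column.
    """
    read = reader or _soundfile_reader
    speech, sampling_rate = read(example["file"])
    return {**example, "speech": speech, "sampling_rate": sampling_rate}
```

Passing a custom reader makes the function easy to exercise without real audio on disk, which is convenient when debugging the mapping step on its own. If the data is loaded with the Hugging Face datasets library, a function of this shape can typically be passed to Dataset.map.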

Strengths

Audio is stored as .flac, a lossless compressed format, which reduces disk usage.
Features a file column containing the local paths to audio recordings.
Compatible with the SUPERB benchmark toolkit for evaluating a shared model across speech processing tasks.
Supports conversion to float32 arrays for model input using the soundfile library.
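
Because the file column holds local paths, a quick sanity check before decoding can catch most loading problems early. The helper below is a sketch under stated assumptions: find_bad_rows is a hypothetical name, and rows are assumed to be plain dictionaries with a file key. It uses only the Python standard library and does not decode any audio.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_bad_rows(rows: Iterable[Dict[str, str]]) -> List[str]:
    """Return human-readable problems found in the `file` column."""
    problems = []
    for index, row in enumerate(rows):
        path = row.get("file")
        if not path:
            # The row has no usable path at all.
            problems.append(f"row {index}: missing 'file' value")
        elif Path(path).suffix.lower() != ".flac":
            # The dataset is described as .flac only.
            problems.append(f"row {index}: not a .flac file: {path}")
        elif not Path(path).is_file():
            # The path is well-formed but nothing exists there.
            problems.append(f"row {index}: file not found: {path}")
    return problems
```

An empty result means every row points at an existing .flac file. It does not guarantee that the file decodes cleanly.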

Quality Score

Overall: D (38)
Description: 41
Source: 41
Reputation: 36
Access: 22

Community

22.5K downloads

0 views

Dataset Info

Author: Narsil
Created: Mar 2, 2022
Updated: Aug 14, 2024
Last synced: Apr 29, 2026
