A collection of 64,727 one-second .wav audio files covering 30 to 35 distinct spoken English words plus background noise. It includes ten core directional and action commands, a set of auxiliary words, and a dedicated _silence_ class that represents background noise.
Use Cases
- Train a keyword spotting model on the audio and the core command labels to recognize directional commands such as 'Up', 'Down', 'Left', and 'Right'
- Develop an out-of-vocabulary detection system by using the is_unknown flag to separate auxiliary words from the core commands
- Build a noise-resilient speech classifier by incorporating the _silence_ class recordings into the training pipeline
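The three use cases above share one preprocessing step: turning each example's label into a training target. The sketch below is one possible way to do that, not the dataset's own method. It assumes each example exposes a text label and the is_unknown flag, that labels can be compared in lowercase, and that _silence_ examples have is_unknown set to False. The `_unknown_` target name, the class order, and the `to_target` helper are hypothetical.

```python
# The ten core command labels named in the dataset description
# (lowercased here by assumption).
CORE_COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
SILENCE = "_silence_"
UNKNOWN = "_unknown_"  # hypothetical catch-all target, not a label from the dataset

# Class order is an arbitrary choice for this sketch.
CLASSES = CORE_COMMANDS + [SILENCE, UNKNOWN]
CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASSES)}


def to_target(label: str, is_unknown: bool) -> int:
    """Map one example to a target index.

    Auxiliary words (is_unknown=True) collapse into a single catch-all class;
    core commands and _silence_ keep their own class.
    """
    if is_unknown:
        return CLASS_TO_INDEX[UNKNOWN]
    name = label.lower()
    if name not in CLASS_TO_INDEX:
        raise ValueError(f"unexpected label without is_unknown set: {label!r}")
    return CLASS_TO_INDEX[name]


# Made-up (label, is_unknown) pairs standing in for real dataset rows.
examples = [("Up", False), ("Marvin", True), ("_silence_", False)]
targets = [to_target(label, flag) for label, flag in examples]
print(targets)
```

Collapsing every auxiliary word into one class keeps the classifier at twelve outputs, so an out-of-vocabulary word is handled as an ordinary prediction of the catch-all class rather than by a separate rejection step.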
Strengths
- 64,727 audio files in .wav format, each exactly one second long
- Includes 10 core command labels: 'Yes', 'No', 'Up', 'Down', 'Left', 'Right', 'On', 'Off', 'Stop', and 'Go'
- Features an is_unknown boolean flag to differentiate between primary commands and auxiliary words like 'Bed', 'Bird', or 'Marvin'
- Contains a _silence_ class made up of environmental recordings and mathematically generated noise
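To make the _silence_ class concrete, the sketch below shows two ways one-second noise clips of this kind could be produced: generating noise mathematically, and cutting a random one-second window out of a longer recording. It is an illustration under stated assumptions, not the procedure used to build the dataset. The 16 kHz sample rate, the amplitude, and the helper names `white_noise_clip` and `random_window` are assumptions.

```python
import random

SAMPLE_RATE = 16_000  # assumed sample rate; check the actual .wav headers
CLIP_SECONDS = 1      # clips are one second long, per the description
CLIP_SAMPLES = SAMPLE_RATE * CLIP_SECONDS


def white_noise_clip(rng: random.Random, amplitude: float = 0.1) -> list[float]:
    """Generate one second of uniform white noise in [-amplitude, amplitude]."""
    return [rng.uniform(-amplitude, amplitude) for _ in range(CLIP_SAMPLES)]


def random_window(recording: list[float], rng: random.Random) -> list[float]:
    """Cut a random one-second window out of a longer recording."""
    if len(recording) < CLIP_SAMPLES:
        raise ValueError("recording is shorter than one clip")
    start = rng.randrange(len(recording) - CLIP_SAMPLES + 1)
    return recording[start:start + CLIP_SAMPLES]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stand-in for a longer environmental recording: three seconds of generated noise.
long_recording = [sample for _ in range(3) for sample in white_noise_clip(rng)]

silence_clips = [white_noise_clip(rng), random_window(long_recording, rng)]
print([len(clip) for clip in silence_clips])
```

Sampling a fresh window on each pass, instead of storing a fixed set of cuts, gives the model different noise segments across epochs at no extra storage cost.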