This dataset contains approximately 5 hours of Turkish audio with text transcripts, sourced from more than 40 Creative Commons-licensed YouTube videos. The collection features more than 100 distinct speakers, with audio resampled to 16 kHz and segmented into clips of up to 25 seconds. It is designed for training and evaluating Turkish speech-to-text models.
Use Cases
- Train Turkish speech-to-text (STT) models using the audio clips and their corresponding transcriptions.
- Develop speaker recognition or diarization systems leveraging the 100+ unique voices present in the recordings.
- Fine-tune acoustic models for Turkish language processing using the 16 kHz resampled audio files.
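
When the transcripts are used to evaluate an STT model, a common metric is word error rate (WER): the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch, not part of the dataset itself. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; a real evaluation would likely normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Example with made-up sentences (not taken from the dataset):
# one of four reference words is missing from the hypothesis.
print(word_error_rate("bugün hava çok güzel", "bugün hava güzel"))
```

Because WER counts whole-word mismatches, it can be harsh on an agglutinative language like Turkish, where a single wrong suffix makes the entire word count as an error; reporting character error rate alongside it is a common complement.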
Strengths
- Contains approximately 5 hours of Turkish speech data.
- Features audio from more than 100 different speakers, providing vocal variety.
- Audio files are standardized at a 16 kHz sampling rate.
- Data is segmented into manageable chunks with a maximum duration of 25 seconds.
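
To make the 16 kHz and 25-second figures concrete: a 25-second clip at 16 kHz holds at most 16,000 × 25 = 400,000 samples. The sketch below shows one way a longer recording could be cut into clips under that limit. Fixed-length splitting is an assumption chosen for illustration; the description does not say how the clips were actually segmented (for example, on pauses or sentence boundaries), and the names `clip_boundaries`, `SAMPLE_RATE_HZ`, and `MAX_CLIP_SECONDS` are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000   # sampling rate stated in the dataset description
MAX_CLIP_SECONDS = 25     # maximum clip duration stated in the description


def clip_boundaries(total_samples: int,
                    sample_rate: int = SAMPLE_RATE_HZ,
                    max_seconds: int = MAX_CLIP_SECONDS) -> list[tuple[int, int]]:
    """Return (start, end) sample indices of consecutive clips.

    Every clip is at most max_seconds long; the last one may be shorter.
    This is naive fixed-length splitting, not the dataset's own method.
    """
    max_len = sample_rate * max_seconds  # 400,000 samples with the defaults
    return [
        (start, min(start + max_len, total_samples))
        for start in range(0, total_samples, max_len)
    ]


# A hypothetical 60-second recording yields two full clips and one 10-second clip.
for start, end in clip_boundaries(60 * SAMPLE_RATE_HZ):
    print(start, end, (end - start) / SAMPLE_RATE_HZ)
```

The same bound is useful as a sanity check when loading the data: any clip with more than 400,000 samples at 16 kHz would exceed the stated 25-second maximum.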