Name: TIE_Shorts: Technical Indian English Speech Clips from Lectures
Creator: raianand
Published: 2024-11-01T06:19:26
Keywords: Lectures, Audio, Indian English, Large Scale, Speech Recognition, Technical Education

Description

TIE_Shorts is a derived version of the Technical Indian English (TIE) dataset. The original TIE dataset contains approximately 8,000 hours of speech from around 9,800 technical lectures in English. The lectures were sourced from the NPTEL platform, average 50 minutes each, and are delivered by instructors from various regions across India. The dataset was created by author 'raianand' and was last updated on the Hugging Face platform in November 2024.
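As a quick consistency check on the figures above, the lecture count and average length can be multiplied out and compared with the stated total. This is only a back-of-the-envelope sketch; both inputs are the approximate values quoted in the description, not measured ones.

```python
# Approximate figures quoted in the description of the original TIE dataset.
num_lectures = 9_800
avg_minutes_per_lecture = 50

# Total duration implied by those two figures, converted to hours.
estimated_hours = num_lectures * avg_minutes_per_lecture / 60
print(f"Estimated total: {estimated_hours:,.0f} hours")  # Estimated total: 8,167 hours
```

The result, roughly 8,167 hours, agrees with the stated figure of approximately 8,000 hours.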

Use Cases

Train automatic speech recognition (ASR) models on technical lecture audio.
Develop accent and dialect models for Indian English using speech from instructors across diverse regions.
Create educational tools for technical subjects using segmented lecture audio clips.
Conduct linguistic analysis of technical terminology usage in spoken English.

Strengths

The original TIE dataset is large, at approximately 8,000 hours of speech.
The source material comprises around 9,800 lectures, providing substantial volume.
Lectures cover a wide range of technical subjects, offering domain-specific speech data.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
The derivation method and the exact contents of the 'shorts' version are not documented in the available description.
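Because column-level documentation is absent, a first inspection step after download could be to list each field and the value types it holds. The sketch below assumes the records have already been loaded as a list of dictionaries. The helper `summarize_columns` and the field names `clip_id`, `duration_s` and `text` are hypothetical, chosen for illustration only; they are not the dataset's actual schema.

```python
from collections import defaultdict


def summarize_columns(records):
    """Map each field name to the sorted Python type names observed in it."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical rows: the real TIE_Shorts field names are not documented.
sample_rows = [
    {"clip_id": "a1", "duration_s": 7.5, "text": "gradient descent"},
    {"clip_id": "a2", "duration_s": 9, "text": None},
]

for column, types in summarize_columns(sample_rows).items():
    print(column, types)
```

Fields that report more than one type, or that include `NoneType`, are good candidates for closer manual inspection.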

Provenance

Source: NPTEL platform.
Collection Method: Sourced from technical lecture recordings.
Freshness: Last updated 2024-11-16 07:43:44; freshness should be verified.
Geography: India

License is unknown; users must verify terms of use before downloading.
