Name: Voice Annotation Data V2: 18,632 Audio Samples Across 58 Voice Dimensions
Creator: TTS-AGI
Published: 2026-04-15T12:52:36
Keywords: Text To Speech, Audio Classification, Speech Processing, Audio, Voice Annotation

Description

Voice Annotation Data v2 is a curated dataset of 18,632 audio samples (9,391 positive and 9,241 negative examples) spanning 58 voice dimensions. It was created by TTS-AGI and last updated on April 15, 2026. For each dimension, the dataset provides up to 25 positive examples of audio that fits the category and up to 25 negative examples that the Gemini 2.0 Flash model confirmed do not fit.

Use Cases

Training voice attribute classifiers based on the 58 annotated dimensions.
Evaluating the performance of TTS models against human- and AI-annotated positive/negative examples.
Developing data curation pipelines for audio datasets using the confirmed negative example methodology.
Studying voice characteristics and their perception across a multi-dimensional taxonomy.
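
For the classifier-training use case, one natural preparation step is to group samples by dimension so that each dimension gets its own positive and negative set. The sketch below assumes each record carries a dimension name, a boolean label, and an audio path. The field names `dimension`, `is_positive`, and `audio_path`, and the dimension names in the toy records, are hypothetical, since the dataset's column documentation is not available.

```python
from collections import defaultdict


def split_by_dimension(records):
    """Group audio paths into per-dimension positive and negative lists.

    The keys "dimension", "is_positive", and "audio_path" are assumed
    field names, not the dataset's documented schema.
    """
    groups = defaultdict(lambda: {"positive": [], "negative": []})
    for rec in records:
        bucket = "positive" if rec["is_positive"] else "negative"
        groups[rec["dimension"]][bucket].append(rec["audio_path"])
    return dict(groups)


# Toy records with made-up dimension names, for illustration only.
records = [
    {"dimension": "breathy", "is_positive": True, "audio_path": "a.wav"},
    {"dimension": "breathy", "is_positive": False, "audio_path": "b.wav"},
    {"dimension": "nasal", "is_positive": True, "audio_path": "c.wav"},
]

by_dimension = split_by_dimension(records)
for name, sets in by_dimension.items():
    print(name, len(sets["positive"]), len(sets["negative"]))
```

Each per-dimension pair of lists can then feed a separate binary classifier, once the real field names have been confirmed against the downloaded data.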

Strengths

Contains 18,632 total audio samples, providing a substantial collection for model training.
Includes nearly balanced positive and negative examples (9,391 and 9,241) across 58 distinct voice dimensions.
Each dimension includes up to 25 confirmed negative examples, which may improve classification robustness.
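
The balance claim can be checked directly from the counts quoted in this listing. The short sketch below uses only those figures; the function name `balance_summary` is illustrative.

```python
# Counts quoted in the dataset description.
POSITIVE = 9_391
NEGATIVE = 9_241
TOTAL = 18_632


def balance_summary(positive, negative):
    """Return the combined total and the share of positive examples."""
    total = positive + negative
    return {
        "total": total,
        "positive_share": positive / total,
    }


summary = balance_summary(POSITIVE, NEGATIVE)

# The two classes should add up to the stated sample count.
assert summary["total"] == TOTAL
print(f"positive share: {summary['positive_share']:.3f}")
```

The positive share comes out at about 50.4%, so the two classes differ by only 150 samples overall. Per-dimension balance cannot be confirmed from the listing alone.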

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported in the dataset metadata, so the 18,632-sample figure from the description should be verified after download.
Description metadata is limited; actual data quality requires manual inspection after download.
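
Because field semantics must be inferred after download, a first inspection pass might simply list which fields appear and what value types they hold. The sketch below works on plain dictionaries; the sample rows and their field names are hypothetical stand-ins, not the dataset's actual columns.

```python
from collections import defaultdict


def summarize_fields(rows):
    """Map each field name to the sorted list of value type names seen."""
    seen = defaultdict(set)
    for row in rows:
        for name, value in row.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical rows standing in for a few downloaded records.
sample_rows = [
    {"dimension": "breathy", "label": 1, "audio_path": "a.wav"},
    {"dimension": "nasal", "label": 0, "audio_path": None},
]

field_types = summarize_fields(sample_rows)
for name, types in field_types.items():
    print(name, types)
```

Fields that report more than one type, such as a path that is sometimes missing, are good candidates for closer manual inspection.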

Provenance

Source: TTS-AGI via Hugging Face
Collection Method: Curated audio samples annotated with positive/negative labels, with negatives confirmed by the Gemini 2.0 Flash model.
Time Range: Not specified
Freshness: Last updated 2026-04-15 12:52:50; freshness should be verified.
Geography: Not specified

License is unknown; terms of use must be verified before application.
