DataSalon


MF: Male/Female Phoneme-Aligned Speech Corpus | DataSalon

Machine Learning

Name: MF: Male/Female Phoneme-Aligned Speech Corpus
Creator: falabrasil
Published: 2026-06-15T13:06:57
Keywords: Ground Truth, Speech Corpus, Speech Processing, Audio, Phonetic Alignment

by falabrasil · Updated 17d ago

Available on 1 platform

Description

A two-speaker dataset manually aligned at the phoneme level, providing ground truth for phonetic alignment research. It contains 200 instances from each of two speakers, one male and one female. The dataset was created by falabrasil and has been used as ground truth in multiple academic papers from 2016 to 2022.

Use Cases

Training forced phonetic alignment models based on manually aligned phoneme-level ground truth.
Evaluating the performance of speech alignment algorithms on both male and female speakers.
Studying speaker-specific phonetic variations using the provided male and female speaker data.
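
As a rough illustration of the evaluation use case above, the sketch below scores predicted phoneme boundaries against manually aligned ones using two common measures: mean absolute boundary error and the share of boundaries within a tolerance. The function names, the 20 ms tolerance, and the sample values are all hypothetical; the dataset's actual field layout is not documented here, so boundaries are assumed to be already extracted as end times in integer milliseconds.

```python
from statistics import mean


def boundary_errors(reference_ms, hypothesis_ms):
    """Absolute differences (ms) between paired phoneme boundaries."""
    if len(reference_ms) != len(hypothesis_ms):
        raise ValueError("boundary lists must have the same length")
    return [abs(r - h) for r, h in zip(reference_ms, hypothesis_ms)]


def tolerance_accuracy(errors_ms, tolerance_ms=20):
    """Fraction of boundaries whose error is within the tolerance."""
    if not errors_ms:
        raise ValueError("no boundaries to score")
    return sum(e <= tolerance_ms for e in errors_ms) / len(errors_ms)


# Hypothetical phoneme end times in milliseconds (not taken from the dataset).
manual_ms = [100, 250, 400, 620]      # manually aligned ground truth
predicted_ms = [110, 280, 400, 600]   # output of some forced aligner

errors = boundary_errors(manual_ms, predicted_ms)
print(f"mean boundary error: {mean(errors):.1f} ms")
print(f"within 20 ms: {tolerance_accuracy(errors):.2f}")
```

Running the same computation separately on the male and female subsets would show whether an aligner performs differently across the two speakers.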

Strengths

Contains 200 phoneme-aligned instances from each of a male and a female speaker.
Manually aligned at the phoneme level, providing high-quality ground truth.
Has been used as ground truth in at least five published academic papers.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Row count is not reported in the listing metadata, which makes suitability harder to assess before download.
Column-level documentation is absent; field semantics must be inferred after download.

Provenance

Source: falabrasil on Hugging Face
Collection Method: Manually aligned at the phoneme level.
Freshness: Last updated 2026-06-15 13:24:52; freshness should be verified.


Quality Score

D37


Community

68 downloads

1 like

0 views

Dataset Info

Author: falabrasil
Created: Jun 15, 2026
Updated: Jun 15, 2026
Last synced: Jul 2, 2026
