FormulaSpeech Datasets for Scientific Formula Verbalization | DataSalon

Speech & Audio

Name: FormulaSpeech Datasets for Scientific Formula Verbalization
Creator: Stephen-Lee
Published: 2026-05-20T14:11:51
Keywords: Speech Synthesis, AI Tutors, Scientific Formulas, Computer Vision, Text, Audio, Accessible Learning

Available on 1 platform

Description

FormulaSpeech Datasets are designed to improve the verbalization of scientific formulas by large speech language models. The datasets support accessible learning scenarios, particularly for blind or low-vision learners relying on speech-enabled AI tutors. The repository is maintained by Stephen-Lee and was last updated on May 21, 2026.

Use Cases

Training speech models to read scientific formulas aloud accurately, using the dataset's verbalization examples
Evaluating model performance on formula reading tasks for accessible learning applications
Developing AI tutors that can assist blind or low-vision learners with STEM content
Benchmarking improvements in end-to-end large speech language models (LSLMs)

Strengths

Dataset is officially provided for the Formula-Speech framework
Focuses on a specific application: scientific formula verbalization for accessible learning

Limitations

Description metadata is limited; actual data quality requires manual inspection after download
Column-level documentation is absent; field semantics must be inferred after download
Row count is unknown, which makes it hard to assess suitability before download
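
Because the row count and field semantics are undocumented, a first inspection pass after download might look like the sketch below. It is only an illustration: summarize_records is a hypothetical helper, and the sample field names ("formula", "spoken") are assumptions, not the dataset's actual columns. It counts rows and records, for each field, how many rows contain it and which value types appear.

```python
from collections import Counter


def summarize_records(records):
    """Count rows and, per field, how often it appears and which value types occur."""
    row_count = 0
    presence = Counter()
    types = {}
    for row in records:
        row_count += 1
        for name, value in row.items():
            presence[name] += 1
            types.setdefault(name, set()).add(type(value).__name__)
    fields = {
        name: {"present_in": presence[name], "types": sorted(types[name])}
        for name in presence
    }
    return {"rows": row_count, "fields": fields}


# Hypothetical sample rows; the real field names are not documented.
sample = [
    {"formula": "E = mc^2", "spoken": "E equals m c squared"},
    {"formula": "a^2 + b^2 = c^2", "spoken": None},
]
summary = summarize_records(sample)
```

A field whose types include NoneType, or whose present_in count is below the row count, is a candidate for the manual inspection mentioned above.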

Provenance

Source: Stephen-Lee on Hugging Face
Freshness: Last updated 2026-05-21 15:27:33; freshness should be verified

Quality Score

D37

Community

57 downloads

1 like

0 views

Dataset Info

Author: Stephen-Lee
Created: May 20, 2026
Updated: May 21, 2026
Last synced: May 28, 2026