EDU-CHEMC is a dataset of handwritten chemical structure images paired with structure-specific markup annotations. It was released by the author ConstantHao and originally proposed in the paper 'Handwritten Chemical Structure Image to Structure-Specific Markup Using Random Conditional Guided Decoder'. It was last updated on Hugging Face on April 10, 2026.
Use Cases
- Training models for handwritten chemical structure recognition based on the provided images.
- Developing systems to convert chemical images to markup strings based on the annotated 'chemfig' data.
- Evaluating image-to-markup translation algorithms based on the paired image and JSON annotation structure.
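For the evaluation use case, one simple option is an exact-match rate between predicted and reference markup strings. The sketch below is a generic metric and is not taken from the paper. The function names `normalize` and `exact_match_rate` are hypothetical, and collapsing runs of whitespace before comparison is an assumption that may or may not suit a given markup convention.

```python
def normalize(markup):
    """Collapse runs of whitespace to single spaces and trim the ends."""
    return " ".join(markup.split())


def exact_match_rate(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0  # assumed convention for an empty evaluation set
    hits = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)
```

String equality is strict: two different markup strings that describe the same molecule count as a mismatch, so this rate is best read as a lower bound on structural correctness.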
Strengths
- Each image is paired with a corresponding JSON annotation file, ensuring direct alignment.
- Annotations include a human-verified 'chemfig' string that can be rendered with LaTeX.
- The dataset was created for and referenced in a specific, peer-reviewed academic paper.
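The image-to-annotation pairing described above can be sketched as follows. This is a minimal example under stated assumptions: that an image and its JSON file share a file stem, that images use the listed extensions, and that the markup is stored under a top-level key named `chemfig`. None of these details are documented for the dataset, so check them against the downloaded files. The function names are hypothetical.

```python
import json
from pathlib import Path

# Assumed image extensions; adjust after inspecting the download.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_images_with_annotations(root):
    """Return (image_path, annotation) pairs matched by file stem."""
    root = Path(root)
    # Index JSON files by stem, e.g. "0001.json" -> "0001".
    # If two files share a stem in different folders, the last one found wins.
    annotations = {p.stem: p for p in root.rglob("*.json")}
    pairs = []
    for image_path in sorted(root.rglob("*")):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        json_path = annotations.get(image_path.stem)
        if json_path is None:
            continue  # image without an annotation is skipped
        with json_path.open(encoding="utf-8") as fh:
            pairs.append((image_path, json.load(fh)))
    return pairs


def chemfig_strings(pairs, key="chemfig"):
    """Collect the markup string from each annotation that has the key."""
    return [ann[key] for _, ann in pairs if isinstance(ann, dict) and key in ann]
```

Calling `pair_images_with_annotations("path/to/download")` and passing the result to `chemfig_strings` would yield the markup strings in sorted image order, provided the assumptions above hold.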
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, which may limit suitability assessment.
- Description metadata is limited; actual data quality requires manual inspection after download.
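Since field semantics and row count are undocumented, a quick pass over the downloaded annotations can recover both. The sketch below assumes one JSON file per record with a top-level object; `summarize_annotations` is a hypothetical helper, not part of any dataset tooling.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_annotations(root):
    """Count annotation files and tally (key, value type) pairs across them."""
    count = 0
    fields = Counter()
    for json_path in Path(root).rglob("*.json"):
        with json_path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        count += 1
        # Only top-level objects contribute field names.
        if isinstance(record, dict):
            for key, value in record.items():
                fields[(key, type(value).__name__)] += 1
    return count, fields
```

A field whose tally is lower than the record count is missing from some files, which is worth knowing before training on it.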
Provenance
- Source
- Hugging Face dataset uploaded by ConstantHao.
- Collection Method
- Images and annotations were created for the research paper cited in the description.
- Time Range
- Not specified.
- Freshness
- Last updated 2026-04-10 14:14:08; freshness should be verified.
- Geography
- Not specified.