The dataset contains images of Markush chemical structures from patents, each paired with its CXSMILES string representation. It includes more than 54,000 training samples from the USPTO-MOL-M source, plus several benchmark subsets for evaluation. It was created by docling-project and last updated in March 2026.
The full description, including details of the IP5-markush subset, is available only on the external dataset page. No license information is given.
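
To make the CXSMILES pairing mentioned above more concrete, here is a minimal Python sketch of how such a string might be split into its plain SMILES core and its extension block. It assumes the usual CXSMILES layout: a SMILES string, a space, then extensions enclosed in `|...|`. The function name `split_cxsmiles` and the example string are hypothetical and are not taken from the dataset. The code does no chemical validation.

```python
from typing import NamedTuple, Optional


class CxSmiles(NamedTuple):
    smiles: str               # plain SMILES core
    extension: Optional[str]  # text between the '|' delimiters, or None


def split_cxsmiles(text: str) -> CxSmiles:
    """Split a CXSMILES string into SMILES core and extension block.

    Assumes the layout '<SMILES> |<extensions>|'. Hypothetical helper,
    not part of the dataset or of any library.
    """
    text = text.strip()
    # The SMILES core ends at the first space; the rest is the extension.
    core, _, rest = text.partition(" ")
    rest = rest.strip()
    if len(rest) >= 2 and rest.startswith("|") and rest.endswith("|"):
        return CxSmiles(core, rest[1:-1])
    return CxSmiles(core, None)


# Hypothetical example: a benzene ring with one variable attachment
# point labelled R1 (atom labels are given between '$' signs).
example = "*C1=CC=CC=C1 |$R1;;;;;;$|"
parsed = split_cxsmiles(example)
print(parsed.smiles)
print(parsed.extension)
```

For real use, a cheminformatics toolkit with CXSMILES support would be a better choice than string splitting, since the extension block can encode several kinds of features beyond atom labels.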