SAIR provides 1,048,857 unique protein-ligand pairs and 5.2 million 3D structures curated from ChEMBL for drug discovery research. Created by SandboxAQ in collaboration with Nvidia and updated in August 2025, it pairs binding potency measurements with structural data.
Data is provided in Parquet format under a CC BY 4.0 license. Working with the 3D molecular coordinates requires specialized bioinformatics tools.