Cross-Medicine Knowledge Graph for Type 2 Diabetes Drug Discovery
by Zekun Zhou · Updated 23d ago
5.5 KB · 1 file
Available on 1 platform
Description
A biomedical knowledge graph constructed by Zekun Zhou, integrating multi-source data from Hetionet, SymMap, TCMBank, STRING, and TTD. It contains 15 entity types (245,235 entities) and 52 relation types (7,155,373 triples), covering 709 core Type 2 Diabetes Mellitus genes. The dataset was last updated on May 13, 2026.
Use Cases
Prioritizing drug candidates based on graph embedding link predictions.
Explaining drug mechanisms via a unified path scoring framework incorporating rule-based reasoning.
Analyzing multi-target disease mechanisms such as insulin signaling sensitization and inflammatory regulation.
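To make the link-prediction use case concrete, here is a minimal sketch of how candidates could be ranked with a ComplEx-style scoring function (ComplEx is the model named under Strengths). It uses the standard ComplEx score, the real part of the sum over dimensions of head × relation × conjugate(tail). The embeddings, the compound names, and the `treats` relation are all hypothetical; the listing does not describe the actual embeddings, relation names, or ranking pipeline.

```python
# Toy illustration of ComplEx-style candidate ranking.
# All embeddings and names below are invented for the example.


def complex_score(head, relation, tail):
    """ComplEx score: real part of sum_k head_k * relation_k * conj(tail_k)."""
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real


def rank_candidates(candidates, relation, target):
    """Return candidate names ordered from highest to lowest score
    for the triple (candidate, relation, target)."""
    scored = [
        (complex_score(vector, relation, target), name)
        for name, vector in candidates.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical 2-dimensional complex embeddings.
treats = [1 + 0j, 0 + 1j]          # relation embedding (assumed name)
t2dm = [1 + 1j, 1 - 1j]            # disease embedding (assumed)
compounds = {
    "compound_a": [1 + 1j, 0 + 0j],
    "compound_b": [0 + 0j, 1 + 1j],
    "compound_c": [0 + 0j, 1 - 1j],
}

ranking = rank_candidates(compounds, treats, t2dm)
print(ranking)
```

With these toy vectors, `compound_a` scores highest. In practice the embeddings would come from a trained model rather than being written by hand.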
Strengths
Integrates data from five distinct biomedical sources (Hetionet, SymMap, TCMBank, STRING, TTD).
Contains 245,235 entities across 15 entity types and 7,155,373 triples across 52 relation types.
The ComplEx graph embedding model has a reported Hits@10 of 0.48 for link prediction; the listing does not state the evaluation split or ranking protocol.
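For readers unfamiliar with the metric, Hits@10 is the fraction of test triples for which the correct entity is ranked in the top 10 candidates. The sketch below shows the standard calculation. The example ranks are invented, and `hits_at_k` is an illustrative helper, not code from the dataset author.

```python
def hits_at_k(ranks, k=10):
    """Fraction of test triples whose true entity is ranked within the top k.

    `ranks` holds the 1-based rank of the correct entity for each test triple.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks for ten test triples (not taken from the dataset).
example_ranks = [1, 3, 12, 7, 250, 10, 48, 2, 95, 11]
print(hits_at_k(example_ranks, k=10))
```

Read this way, a Hits@10 of 0.48 means that the correct entity appeared in the top 10 for roughly 48% of the evaluated triples.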
Limitations
The file is only 5.5 KB, far too small to hold 7,155,373 triples, so it likely contains only metadata or a summary rather than the full graph.
Column-level documentation is absent; field semantics must be inferred after download.
The row count is not reported, which makes suitability harder to assess before download.
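The file-size concern can be checked with simple arithmetic using only the figures quoted in this listing. The sketch below assumes "5.5 KB" means decimal kilobytes (5,500 bytes); using 1,024-byte units would not change the conclusion.

```python
# Figures taken from the listing; the KB interpretation is an assumption.
FILE_SIZE_BYTES = 5.5 * 1000   # 5.5 KB, assuming 1 KB = 1,000 bytes
ENTITIES = 245_235
TRIPLES = 7_155_373

bytes_per_triple = FILE_SIZE_BYTES / TRIPLES
triples_per_entity = TRIPLES / ENTITIES

print(f"{bytes_per_triple:.5f} bytes available per triple")
print(f"{triples_per_entity:.1f} triples per entity on average")
```

Even a heavily compressed triple needs several bytes, so a budget well under one byte per triple indicates the file cannot contain the full graph.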
Provenance
Source
Integrated from Hetionet, SymMap, TCMBank, STRING, and TTD.
Collection Method
Entity alignment and relation consolidation using Jaccard and overlap-based fusion strategies.
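The two similarity measures named above have standard set-based definitions, sketched below. The listing does not say which attributes are compared (names, aliases, identifiers) or what thresholds are used, so the alias sets here are hypothetical and the functions are a generic illustration rather than the author's pipeline.

```python
def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 if both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a, b):
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|) (0.0 if either set is empty)."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical alias sets for the same entity as recorded in two sources.
source_a = {"metformin", "glucophage", "dimethylbiguanide"}
source_b = {"metformin", "glucophage"}

print(jaccard(source_a, source_b))
print(overlap_coefficient(source_a, source_b))
```

The overlap coefficient is more forgiving when one source lists far fewer aliases than the other, because it normalizes by the smaller set. That may be why both measures are combined in the fusion step.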
Freshness
Last updated 2026-05-13 17:33:16; freshness should be verified.
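The primary file format is XLS (Excel), which is poorly suited to large graph data: the legacy XLS format is limited to 65,536 rows per sheet, so the full set of triples could not fit in a single sheet. Specialized tools may be required for analysis.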