Description
RuleDepData provides six benchmark knowledge graph datasets used by the RuleDep-ICDE2027 research project. The collection includes KG20C, WN18RR, codex-m, FB15k-237, codex-l, and YAGO3-10, each containing original train/valid/test splits and preprocessed files. The dataset was authored by 'yesun' and last updated on Hugging Face in June 2026.
Use Cases
Benchmarking link prediction models on the provided train/valid/test splits.
Developing rule-based dependency learning algorithms using the preprocessed graph artifacts.
Evaluating knowledge graph completion methods across diverse domains such as scholarly (KG20C), lexical (WN18RR), and general-purpose (FB15k-237, YAGO3-10).
Reproducing experiments from the RuleDep-ICDE2027 paper with the included data artifacts.
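As a starting point for the benchmarking use case, the sketch below reads the three splits of one dataset and reports entities that appear only in the evaluation splits. It rests on assumptions the dataset card does not confirm: that each split is a text file named train.txt, valid.txt, or test.txt, and that every line holds one tab-separated (head, relation, tail) triple. The function names are hypothetical; adjust the file names and separator after inspecting the download.

```python
from pathlib import Path
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def read_triples(path, sep: str = "\t") -> List[Triple]:
    """Read one (head, relation, tail) triple per line.

    Assumption: three separator-delimited fields per line.
    Lines with any other number of fields are skipped.
    """
    triples: List[Triple] = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(sep)
            if len(parts) == 3:
                triples.append((parts[0], parts[1], parts[2]))
    return triples


def load_splits(dataset_dir, names=("train", "valid", "test"),
                suffix: str = ".txt") -> Dict[str, List[Triple]]:
    """Load each split from <dataset_dir>/<name><suffix> (file names are assumed)."""
    root = Path(dataset_dir)
    return {name: read_triples(root / f"{name}{suffix}") for name in names}


def unseen_entities(splits: Dict[str, List[Triple]]) -> Set[str]:
    """Return entities found in valid/test that never occur in train."""
    train_entities = {e for h, _, t in splits["train"] for e in (h, t)}
    eval_entities = {
        e
        for name in ("valid", "test")
        for h, _, t in splits[name]
        for e in (h, t)
    }
    return eval_entities - train_entities
```

Calling load_splits on a local dataset directory returns a dictionary keyed by split name. A non-empty result from unseen_entities signals that a transductive link prediction model cannot score some evaluation triples without extra handling.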
Strengths
Includes six established benchmark datasets (KG20C, WN18RR, codex-m, FB15k-237, codex-l, YAGO3-10) commonly used in knowledge graph research.
Provides both original splits and preprocessed files generated by a specific analysis script, which supports reproducibility for the associated paper.
Last updated on 2026-06-15, indicating recent maintenance on the platform.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count and total size are unknown, which may limit suitability assessment for large-scale experiments.
Hetionet, a dataset used in the paper, is not included due to size, requiring users to regenerate it from raw sources.
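Because the card lists neither row counts nor total size, both can be measured once a local copy exists. The following is a minimal sketch that walks a downloaded directory and records the byte size and line count of every file. The function name summarize_files is hypothetical, and a line count equals the row count only if each file stores one record per line, which is an assumption here.

```python
from pathlib import Path
from typing import Dict, Tuple


def summarize_files(root) -> Dict[str, Tuple[int, int]]:
    """Map each file under root to (size in bytes, number of lines).

    Files are read in binary mode so the count does not depend on text encoding.
    """
    base = Path(root)
    summary: Dict[str, Tuple[int, int]] = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            with open(path, "rb") as handle:
                line_count = sum(1 for _ in handle)
            key = path.relative_to(base).as_posix()
            summary[key] = (path.stat().st_size, line_count)
    return summary
```

Summing the first tuple element across the result gives the total size on disk, which helps decide whether a dataset fits the memory budget of a planned experiment.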
Provenance
Source
Hugging Face dataset repository by author 'yesun'.
Collection Method
Compiled as data artifacts for the RuleDep-ICDE2027 research project.
Freshness
Last updated 2026-06-15 09:56:00 (time zone not stated); check the repository for later revisions before relying on this date.
License is unknown; users should verify terms before use. Hetionet dataset must be regenerated separately using scripts from the RuleDep repository.