This dataset contains 3,312 scientific publications, each assigned to one of six classes and connected by a citation network of 4,732 links. Each publication is represented by a 3,703-dimensional binary word vector whose entries indicate the presence or absence of the corresponding dictionary term.
Use Cases
- Train node classification models to predict one of the six publication classes using the binary word vectors
- Perform link prediction tasks to identify missing citations based on the 4,732 existing network edges
- Benchmark graph embedding techniques that utilize both the 3,703-word dictionary features and the citation graph structure
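The use cases above can be illustrated with a minimal sketch on toy data. Everything in it is an assumption made for illustration: the six-node graph, the five-word dictionary, the class names "A" and "B", and the helper names `build_adjacency`, `predict_label`, and `jaccard` are hypothetical and not part of the dataset or any loader. It also treats citation links as undirected, which the description does not state. The sketch shows a neighbor-majority vote as a naive node classification baseline and Jaccard similarity between binary word vectors as a simple score for ranking candidate links.

```python
from collections import Counter, defaultdict

# Toy stand-in for the dataset (hypothetical values): 6 nodes and a
# 5-word dictionary instead of 3,312 nodes and 3,703 words.
features = {
    0: [1, 0, 1, 0, 0],
    1: [1, 1, 1, 0, 0],
    2: [1, 0, 1, 1, 0],
    3: [0, 0, 0, 1, 1],
    4: [0, 1, 0, 1, 1],
    5: [0, 0, 0, 0, 1],
}
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]    # citation links
labels = {0: "A", 2: "A", 3: "B", 5: "B"}   # nodes 1 and 4 are unlabeled


def build_adjacency(edge_list):
    """Build an adjacency map, treating links as undirected (an assumption)."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def predict_label(node, adj, known_labels):
    """Predict a class by majority vote over the node's labeled neighbors."""
    votes = Counter(known_labels[n] for n in adj[node] if n in known_labels)
    return votes.most_common(1)[0][0] if votes else None


def jaccard(a, b):
    """Jaccard similarity between two binary word vectors."""
    intersection = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return intersection / union if union else 0.0


adjacency = build_adjacency(edges)
predicted = {n: predict_label(n, adjacency, labels) for n in (1, 4)}

# Score a non-adjacent pair as a candidate missing citation.
candidate_score = jaccard(features[0], features[2])
```

A real benchmark would replace the vote with a trained classifier or a graph neural network and would hold out a subset of the 4,732 links for evaluating link prediction. The sketch only shows how the two inputs, word vectors and edges, fit together.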
Strengths
- 3,312 nodes classified into 6 distinct scientific categories
- 4,732 citation links forming the network's edge structure
- 3,703 unique words forming the binary feature vector for each publication
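To put the figures above in perspective, the following sketch derives a few size and sparsity quantities from them. It assumes the 4,732 links are undirected and contain no duplicates or self-loops, which the description does not confirm. The variable names are chosen here for illustration.

```python
# Counts taken from the dataset description.
NUM_NODES = 3312
NUM_EDGES = 4732
NUM_WORDS = 3703

# Size of the dense binary feature matrix: 12,264,336 entries.
feature_entries = NUM_NODES * NUM_WORDS

# Number of unordered node pairs, i.e. the maximum number of undirected links.
possible_pairs = NUM_NODES * (NUM_NODES - 1) // 2

# Fraction of possible links that are present (assumes undirected, unique links).
edge_density = NUM_EDGES / possible_pairs

# Average number of links per node under the same assumption.
mean_degree = 2 * NUM_EDGES / NUM_NODES

print(f"feature entries: {feature_entries:,}")
print(f"edge density:    {edge_density:.6f}")
print(f"mean degree:     {mean_degree:.2f}")
```

Under these assumptions the graph is very sparse, with fewer than three links per node on average, so sparse matrix or adjacency-list representations are a natural fit for the citation structure.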