Social network graphs, knowledge graphs, citation networks, molecular graphs for GNN, web link graphs
322 datasets
Open-source-based data on low-level agents that Russia has used for sabotage and influence operations in Europe since the 2022 invasion of Ukraine. The dataset includes information on agent characteristics and insights from social network analysis, with all information anonymized. It was authored by Bart Schuurman and last updated on 2026-04-14.
Code files that replicate the study 'Coordinated information attacks in social networks: The optimal number of bots and protective power of ranking algorithms'. The files were authored by Ivan Kozitsin and are hosted on Harvard Dataverse, with a last recorded update in April 2026.
Analysis code for habitat heterogeneity, complex network topology, and community detection, authored by Jun Ma. The 3.0 MB repository includes R, Python, and Jupyter Notebook files. It was last updated on March 17, 2026.
Wikitolica Knowledge Graph is a structured network of entities and concepts from the Catholic encyclopedia Wikitolica, provided in JSON-LD format. The dataset is designed for dynamic access to ensure data freshness and is managed for digital preservation. Its primary use is for tasks involving structured knowledge representation and semantic relationships within a religious domain.
Logbook records from the New Hampshire Department of Fish and Game's participation in the American Lobster managed species monitoring program. The data supports management under Amendment 3 to the Fishery Management Plan and related addenda, involving industry participation through Lobster Conservation Management Teams. The dataset is provided by the New Hampshire Fish and Game Department and the Atlantic States Marine Fisheries Commission.
Samuel and Audrey Media Network Dataset Directory is a structured public directory for datasets from the Samuel & Audrey Media Network. The dataset's size, format, and specific contents are unknown. It is hosted on Kaggle.
Graph-NuCLS is a node-level classification dataset derived from the NuCLS dataset. Each tissue patch is converted into a cell-graph where nodes represent detected cell nuclei and edges encode spatial proximity. The dataset was authored by ogutsevda and last updated on 2026-03-03.
Software Heritage is the largest existing public archive of software source code and development history. The dataset is a fully deduplicated Merkle DAG representation linking file content, directories, commits, and repository states from major forges, distributions, and package managers. Author and committer information is anonymized.
Justin Goldston published this digital humanities resource in March 2026 to map relationships between biblical passages and thematic classifications. It contains a machine-readable scripture corpus alongside a cross-reference network graph designed for computational theology and knowledge graph construction.
Lampung Selatan Road Network Dataset is a graph representation of the road network in the South Lampung region of Indonesia, derived from OpenStreetMap. The dataset is published on Kaggle and is intended for routing and pathfinding applications. The specific scale, update date, and author are not provided in the available metadata.
An image dataset likely containing pictures of cats and dogs. The dataset is hosted on Kaggle, but its specific size, source, and creation details are not provided. The author and organization are unknown.
An evaluation dataset probing 18 Knowledge Graph-style reasoning tasks on the Qwen/Qwen3.5-2B-Base model. It was created by chayma-rhaiem and last updated on March 8, 2026. The dataset tests the model in its raw base form across parametric memory, standard grounded reasoning, and advanced grounded reasoning tasks.
A dataset from Kaggle likely containing news articles structured for graph neural network applications. The specific source, collection method, and temporal coverage are unknown. It is intended for machine learning practitioners working with graph-based text data.
Graph-PanNuke is a node-level classification dataset derived from the PanNuke pan-cancer histology dataset, using all slides at 40× magnification. Each tissue patch is converted into a cell-graph where nodes represent detected cell nuclei, with the task of predicting cell type across 5 classes. Node features describe cell morphology and texture.
News GNN is a dataset likely containing news articles structured for graph neural network applications. It is hosted on the Kaggle platform. The specific source, size, and collection methodology are not detailed in the available metadata.
A dataset titled 'exp73_gnn_lstm_fusion' was published on Kaggle. Its specific content and scale are unknown from the provided metadata. The title suggests it relates to an experiment combining Graph Neural Networks and Long Short-Term Memory architectures.
GNNModule is a dataset published on Kaggle, likely related to Graph Neural Networks. The dataset's specific content, size, and origin are not detailed in the available metadata. Users must download the dataset to verify its exact structure and potential applications in network science.
An identity graph dataset published on Kaggle. The dataset's title suggests it contains network data linking identities, likely for Q1 of an unspecified year. Metadata is minimal; actual content requires verification after download.
gnn_pipeline_output_Ray is a dataset containing the output of a graph neural network processing pipeline. The dataset is hosted on Kaggle, but its specific contents, size, and creation details are not described. Its title suggests it is likely intended for machine learning practitioners working with graph-structured data.
ToolMind-Web-QA consists of 6,000 complex question-answer pairs synthesized from Wikipedia entity-relation knowledge graphs, published by Nanbeige in February 2026. It includes search trajectories averaging 100 turns per record to facilitate research on search-augmented and long-horizon agents.