Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples
1,561 datasets
UniVBench is a benchmark dataset for video editing tasks, organized into numbered folders representing specific test cases. The dataset was created by JianhuiWei and was last updated on March 26, 2026. The full structure and description are available on its Hugging Face dataset page.
The 2007 Cornwall Sea Fisheries Committee Fal Bay Underwater Camera Maerl Survey is a collation of marine environmental surveys. Surveys include Marine Conservation Zone verification, condition assessments, and benthic grab surveys, conducted to specified standards. The dataset is provided by Natural England and contains Ordnance Survey data.
Ransomware Threat Outlook 2025-2027 is an updated assessment from the Canadian Centre for Cyber Security. It details the early history, emerging trends, and projected impact of ransomware on Canadian organizations. The document is formatted as HTML and was last updated in March 2026.
Edinburgh Napier University's School of Computing created a modern cybersecurity dataset containing over 500,000 distinct files across 44 popular file types. The dataset was designed to address research reproducibility issues and is intended as a complement to the Govdocs1 dataset. It comprises more than 90 separate data subsets, including examples of file types with high entropy, which is a characteristic of ransomware activity.
18,783 distilled agent traces for CVE reproduction tasks, generated using the Claude Opus 4.5 model with a Mini SWE-Agent harness through the CVE-Factory pipeline. This is an expanded version of the cve_train dataset, adding approximately 3,000 tasks from the cve_tasks_3k_compressed set. The dataset was authored by Luoberta and last updated on March 27, 2026.
The Wyoming Headwaters Project refines mid-1970s USGS Hydrologic Unit Boundaries for the state. It defines 5th and 6th level watersheds at 1:24,000 scale, with 5th level units between 40,000 and 250,000 acres and 6th level units between 10,000 and 40,000 acres. The project was conducted using ArcView and Arc/Info GIS, with final data to be hosted by the Wyoming Natural Resources Data Clearinghouse.
A 2008 report from the Intergovernmental Committee on Surveying and Mapping (ICSM) outlining all elevation data available across Australian jurisdictions. The audit was conducted by ICSM's Permanent Committee on Topographic Information (PCTI) and is hosted by Geoscience Australia. The dataset was last updated on the platform in March 2026.
SimulaMet's Moltbook Observatory Archive contains between 1 million and 10 million records exported from a live SQLite database as of March 2026. The data is structured into date-partitioned Parquet files, with each original database table represented as a distinct subset for efficient querying.
500 human-validated GitHub Issue-Pull Request pairs from popular Python repositories, curated by the SWE-bench team. This subset of the original benchmark focuses on high-quality samples verified for evaluation accuracy through manual review. It serves as a rigorous test for autonomous systems attempting to solve real-world software engineering tasks.
A corpus of 39.37 million tokens of curated Luau source code across 29,215 unique files and 457 GitHub repositories. Developed by khtsly and updated in March 2026, it focuses on type-safe, functional architecture for code-specific model training.
Normalized snapshots of issues, pull requests, comments, reviews, and linkage data from the huggingface/transformers GitHub repository. The data spans multiple linked tables such as issues.parquet, pull_requests.parquet, and comments.parquet. The dataset is intended for duplicate PR and issue analysis.
Artiverse is a dataset of articulated objects, focusing on diversity and physical grounding. The dataset is under active development and verification by the organization 3dlg-hcvc. It was last updated on March 26, 2026.
World Bank data on net official development assistance (ODA) and official aid received by countries, measured in constant 2023 US dollars. The dataset tracks disbursements of concessional loans and grants from DAC member agencies, multilateral institutions, and non-DAC countries. It originates from the World Development Indicators collection.
World Development Indicators provides data on net bilateral aid flows from Japan, a Development Assistance Committee (DAC) donor. The dataset measures net disbursements of official development assistance (ODA) or official aid, calculated as gross disbursements minus principal repayments on earlier loans. Data collection for official aid to certain advanced countries concluded in 2004.
Net bilateral aid flows from Germany track disbursements minus loan repayments to countries on the DAC recipient list. Data is collected from the Development Assistance Committee and reported in current U.S. dollars. The World Bank's World Development Indicators compiles this information.
Net bilateral aid flows from Canada, a Development Assistance Committee (DAC) donor, measured in current US dollars. The data covers disbursements of official development assistance (ODA) and official aid, representing net disbursements (grants and concessional loans minus principal repayments). It is published by the World Bank as part of the World Development Indicators.
World Development Indicators provides data on net bilateral aid flows from Australia, a Development Assistance Committee (DAC) donor. The dataset tracks net disbursements of official development assistance (ODA) and official aid, calculated as grants and concessional loans minus principal repayments, measured in current U.S. dollars. Data collection for official aid flows to certain advanced countries concluded in 2004.
Net bilateral aid flows from France, measured in current US dollars, represent disbursements of official development assistance minus loan repayments. The data is compiled by the World Bank's World Development Indicators from reports by Development Assistance Committee (DAC) members. It covers financial transfers from France to countries on the DAC list of ODA recipients.
A briefing package prepared for the Associate Vice President of the Canadian Food Inspection Agency ahead of a parliamentary committee appearance. The document addresses the Minister of Health's mandate letter and was published by the CFIA in February 2026. It is a single text document in PDF format.
Meeting minutes from a 1973 JOIDES planning committee session detail strategic discussions on the Deep Sea Drilling Project's post-1975 future. The Australian Ocean Data Network hosts this legacy document, which lacks a formal abstract. This record was last updated in April 2026.