Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples
1,692 datasets
A classification of water flows as streams or non-streams for the Jura department in France. The dataset was created by the Seref service of the DDT du Jura, the SMISA design office, and the French Agency for Biodiversity, following a Water Court decision from April 3, 2018. Data was updated on April 24, 2018, and the record was last updated on March 27, 2019.
Operation Cyber Centurion disrupts hundreds of cyber intrusion and ransomware attacks targeting critical infrastructure. The dataset contains information on these disruptions, compiled by the Department of Homeland Security. It was last updated in September 2025.
Networking and Information Technology Research and Development published a workshop report summarizing discussions from a June 4-6, 2019 event. The document assesses research challenges and opportunities at the intersection of artificial intelligence and cybersecurity. It was authored by the NITRD Program's AI R&D and Cyber Security and Information Assurance Interagency Working Groups.
ChaoticNeutrals processed and filtered ShareGPT conversational data for cybersecurity topics. The dataset was last updated on November 13, 2024. It likely contains cleaned and deduplicated text dialogues.
16 terabytes of uncompressed source code data across 22 programming languages and 23 file extensions. The collection originates from the public GitHub dataset on Google BigQuery and targets large-scale code modeling tasks.
Assessments of the effectiveness of budgetary programs managed by the Executive Committee. The data originates from the States site of Ukraine and was last updated in December 2022. The specific metrics and scale of the evaluations are not detailed in the available metadata.
The National Institute of Standards and Technology provides a structured framework for cybersecurity workforce development. It comprises Work Role Categories, Work Roles, Competency Areas, and Task, Knowledge, and Skill (TKS) statements, along with their interrelationships. The data is available in a machine-readable JSON format for system-to-system transmission.
Decision documents from the Executive Committee of Berezan City Council. The dataset was published on the eu_open_data platform and last updated on 2025-03-05. The organization listed is the States site of Ukraine.
2.6 million individual lines of Python 3 source code extracted from the CodeSearchNet repository. Each entry is a standalone, syntactically valid code snippet under 125 characters stored in a single 'text' column.
Phishing Email Dataset is a text classification resource for identifying malicious emails. The dataset is a copy of the 'Phishing Email Detection' collection originally created by Kaggle user 'Cyber Cop'. It was uploaded to Hugging Face by zefang-liu in January 2024.
CommitPackFT is a 2 GB collection of high-quality code commit messages filtered to resemble natural language instructions, containing between 100,000 and 1,000,000 records. Developed by BigCode and released in August 2023, it serves as a fine-tuning variant of the larger CommitPack dataset for instruction-following tasks. The data is linked to the research findings in arXiv paper 2308.07124.
Varash City Council executive committee budget program passports from Ukraine. The data likely contains structured descriptions of budget programs for the local budget. It was published on the States site of Ukraine and last updated on January 5, 2022.
A list of open datasets published by the Zaporizhzhya Regional Territorial Branch of the Antimonopoly Committee of Ukraine. The register was last updated on February 14, 2019. The data originates from a Ukrainian government agency.
Passports and reports of local budget programs from the Executive Committee of Nikopol. The data was last updated on 2022-01-04 and is provided by the States site of Ukraine via the eu_open_data platform. The file format is Excel XLSX.
A directory listing the structure and contact details for the executive committee of the Berdyansk City Council. The dataset likely contains information such as official names, roles, telephone numbers, and email addresses. It was published on the States site of Ukraine and last updated on August 5, 2020.
Members of the Executive Committee of Dubno City Council for its 7th democratic convocation. The data was published on the States site of Ukraine and last updated on November 4, 2021. It likely contains a list of committee members and associated information.
Berdyansk, Ukraine's open data register for capital construction, reconstruction, and technical supervision projects. The dataset is owned by the management of the executive committee of the Berdyansk City Council and was last updated on October 13, 2020. It is available in CSV format via the eu_open_data platform.
List of concluded contracts for the Management of communal property of the executive committee of Kovel City Council. The dataset was published on the eu_open_data platform and last updated on January 16, 2021.
Kryvyi Rih City Council and its Executive Committee provide a list of their current regulatory acts. The dataset is hosted on the States site of Ukraine and was last updated on May 3, 2023. The data is available in an Excel XLSX file format.
Decisions taken at the Executive Committee of Rivne City Council. The dataset is hosted on the EU Open Data portal and was last updated on June 20, 2023. The data likely contains records of official resolutions and directives from the local government body.