Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples
1,561 datasets
A dataset for malware detection tasks, sourced from the Kaggle platform. The dataset's specific content, size, and creation details are not provided in the available metadata.
1.3M+ source code files from approximately 4,700 top-ranked GitHub developers, curated by ronantakizawa and updated in February 2026. The collection spans 80+ programming languages including Python, Rust, and Go, covering the period from 2015 to 2025.
Manuscript data related to audit committees and financial distress, likely containing tabular records for analysis. The dataset was authored by Oladejo, Titilayo and is hosted on the Dataverse platform. It was last updated on 2026-04-21.
Julie Slayton authored this Bulletin as part of the OJJDP's Juvenile Accountability Incentive Block Grants (JAIBG) Best Practices Series. The JAIBG program was initially funded in fiscal year 1998 and is based on the premise that holding juvenile offenders accountable for their law-violating behavior is key to improving community quality of life. The document discusses principles for establishing and maintaining interagency information sharing within this context.
Resources from the Communications Security Establishment Canada's Cyber Centre provide guidance on ransomware threats. The materials are aimed at Canadian organizations and citizens for protective action. The dataset was last updated in March 2026.
New York City property owners annually self-report bedbug infestation history for multiple dwellings under Local Law 69 of 2017. Each record represents a filing submission, with subsequent filings for the same period superseding earlier ones. The dataset is provided in multiple formats including CSV, JSON, RDF, and XML.
A final report evaluating the effectiveness and efficiency of the Royal Canadian Mounted Police Reserve Program. The evaluation was conducted by RCMP National Program Evaluation Services to fulfill a commitment to the Treasury Board by the 2025/26 fiscal year. The document was published by the Royal Canadian Mounted Police and last updated in March 2026.
A 2006-2007 marine survey collected multibeam data, drop-down video, and still-camera images to study Annex I reef habitats in the Mid Irish Sea. The survey involved specialists from the National Oceanography Centre, ERT (Scotland) Ltd, and the University Marine Biological Station, and was funded by Defra. Data components are archived separately at the British Geological Survey, DASSH, and UKHO.
London Borough of Barnet publishes annual Infrastructure Funding Statements detailing developer contributions and highway works. The dataset covers financial and non-financial contributions from Section 106 and CIL agreements, as well as Section 278 highway works, from April 1, 2019, onward. It provides an overview of contributions received, future commitments, and projects delivered within the borough.
Incident-level cybersecurity reports covering breaches, ransom events, and downtime. The dataset likely contains metadata related to incident response. It is hosted on Kaggle, but the author, organization, and last update date are unknown.
A fork of the AIDev dataset containing all commits and repositories. The dataset is associated with an arXiv paper and with GitHub repositories. It was last updated on March 18, 2026.
Malware APKs dataset from Kaggle. The dataset likely contains Android Package Kit (APK) files or metadata related to malicious software. Specific details on size, columns, and provenance are unknown.
A collection of datasets related to cybersecurity topics, including CAPTCHA and Data Protection Officer (DPO) functions, hosted on Hugging Face. The dataset was published by the user 'areyouevenreal' and was last updated on April 21, 2026. Its specific contents, size, and structure are not detailed in the available metadata.
Hybrid Phishing URL Detection likely contains data for distinguishing phishing URLs from legitimate ones. The dataset is hosted on Kaggle. Specific details on its size, creation date, and author are not provided in the available metadata.
10,000 courses from Udemy's development category, including web development, data science, and programming languages. The dataset includes 17 columns such as course title, subscription numbers, average ratings, and pricing details. It was published on OpenML under a CC0-1.0 license.
10,000 development-related course listings extracted from Udemy's website, covering domains like web development, data science, and mobile apps. The dataset includes 17 columns such as subscription counts, ratings, pricing, and publication dates. It is shared under a CC0 1.0 license on the OpenML platform.
SecureCode Web is a production-grade dataset for web and application security vulnerabilities. It contains 2,185 examples with complete incident grounding and a 4-turn conversational structure. The dataset was created by scthornton and was last updated on Hugging Face in February 2026.
Geoscience Australia Data provides a historical report from an open meeting of the JOIDES planning committee. The document, titled 'Future of the Deep Sea Drilling Project after 1975', likely contains discussions and planning notes from a meeting held in Zurich, September 26-28, 1973. The dataset was last updated on the data.gov.au platform in March 2026.
A dataset named 'New_Dike_Malware_Ds' published on Kaggle. The title suggests it contains information related to the Dike malware family. The dataset's author, organization, and specific collection details are not provided.
Cybersecurity data aggregated from the Kaggle platform. The dataset's specific content, scale, and origin are not detailed in the provided metadata. Users must download the data to inspect its actual records, features, and potential applications for security analysis.