Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples
1,561 datasets
A single-cell RNA sequencing dataset authored by Deng, Q., Ramskold, D., Reinius, B. & Sandberg. The data has been transposed and includes cell type annotations from the original source code. Column names were replaced with sequential numbers to avoid issues with name length.
Single-cell RNA sequencing data, transposed and annotated with cell type labels from the original source code. The dataset was prepared by Deng, Ramskold, Reinius, and Sandberg and is shared under a CC-BY-4.0 license on OpenML. Column names were replaced with sequential numbers to avoid issues with name length.
1986 data on convicted persons' entry and exit from correctional custody. The dataset was gathered from official state prison records in 36 states, the Federal Prison System, the California Youth Authority, and the District of Columbia. It was created by the United States Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.
The dataset contains results from incentivized laboratory experiments testing a resource division model with repeated interaction. The experiments were conducted by Dustin Tingley of Harvard University Press to study the 'dark side of the future' in bargaining. The data likely captures outcomes related to social efficiency and changes in bargaining strength.
A clinical case study and theoretical paper on Acceptance and Commitment Therapy (ACT) for individuals with psychosis. The work was authored by Julieann Pankey of the University of Nevada, Reno. It proposes acceptance, cognitive defusion, and values-oriented action as an alternative coping method to traditional symptom reduction approaches.
A 2025 transcript and prepared materials from the Canadian Minister for Women and Gender Equality's appearance before the House of Commons Standing Committee on the Status of Women. The dataset contains the official text of the ministerial testimony and related documents. It was published by Women and Gender Equality Canada and last updated in March 2026.
A book by George Bunn analyzing the history and process of U.S.-Soviet arms control negotiations, covering treaties from the Limited Test Ban Treaty to the CFE Treaty. The work examines the decision-making and committee structures involved in these diplomatic efforts. It is sourced from paperswithcode and is available under a closed license.
A dataset of Portable Executable (PE) malware files. The dataset is hosted on Kaggle. Its specific size, collection date, and authorship details are unknown.
A collection of spear phishing emails likely intended for security research and machine learning model training. The dataset is hosted on Kaggle, but its specific contents, size, and creation details are unknown. Columns and sample data are unavailable, requiring verification after download.
89 expert-verified, high-fidelity reasoning records focused on complex cybersecurity attack vectors. The dataset is designed for fine-tuning Large Language Models on advanced security analysis and threat logic. It was created by expertdata-factory and last updated on March 1, 2026.
A 2008 report cataloging all elevation data across Australian jurisdictions, compiled by the Intergovernmental Committee on Surveying and Mapping's Permanent Committee on Topographic Information. The document is provided by Geoscience Australia and covers terrestrial and marine topography.
498,255 training samples and 168,060 test samples of HTML-URL pairs labeled as benign or phishing. The dataset includes 975 benchmarks with base rates ranging from 5e-4 to 5e-2 and was updated in February 2026 with approximately 200,000 new samples collected between March and December 2025.
ansulev's CVE Chat-Style Multi-Turn Cybersecurity Dataset contains approximately 300,000 Common Vulnerabilities and Exposures records published between 1999 and 2025. The dataset has been parsed, enriched, and converted into a conversational format. It was last updated on March 8, 2026.
NewsDataHub provides 3,000 English-language cybersecurity news metadata rows collected via its API. The dataset is designed for coverage trend analysis and comparative topic visibility over a six-month period. Rows were filtered for completeness and deduplicated by normalized title before export.
Financial management data from the Malyn City Executive Committee in Ukraine. The dataset likely contains records of approved refunds of funds that were erroneously or excessively credited to the local budget. It was last updated on March 2, 2026, and is available in Excel formats.
Q4 2017-18 data providing the underlying volumes behind reported performance for the CSG Customer Service, presented quarterly to the Performance and Contract Management Committee. The dataset includes metrics on calls, complaints, and telephony, as recorded by the London Borough of Barnet. Recorded email volumes do not reflect the total number of emails received by the council.
Q3 2016-17 data providing the underlying volumes behind reported performance for the CSG Customer Service, presented quarterly to the Performance and Contract Management Committee. The dataset includes metrics on calls, complaints, and telephony, sourced from the London Borough of Barnet. Email volumes do not reflect the total number of emails received by the council, as they include some webforms.
Underlying data and volumes behind reported performance for the CSG Customer Service, presented quarterly to a performance committee. The dataset includes metrics on calls, complaints, and telephony, though recorded email volumes do not reflect the total number of emails received by the council. The organization is the London Borough of Barnet.
London Borough of Barnet provides underlying data for quarterly performance reports on its Customer Service Group. The dataset tracks volumes and performance metrics for customer contacts, including calls, emails, and webforms. It covers the second quarter of the 2017-18 fiscal year.
London Borough of Barnet data provides underlying metrics for reported customer service performance, presented quarterly to the Performance and Contract Management Committee. The dataset includes volumes for calls, complaints, and telephony, though email and webform counts are acknowledged as incomplete.