DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


Software Engineering & Security Datasets | DataSalon

Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples

1,591 datasets

Cybersecurity Weakness Remediation Plans from Department of Labor

Plans of Action and Milestones (POA&Ms) are corrective action plans required by the Department of Homeland Security for tracking and resolving information security weaknesses. The dataset contains these plans as assigned to agencies for remediation, with the last update recorded in January 2026. The number of plans, rows, and columns is not given in the available metadata.

Tags: Ocio, Oasam, +1 more

Malware Data for Security Analysis

A dataset titled 'Malware' is hosted on Kaggle. The dataset's specific content, size, and origin are not detailed in the provided metadata. Its columns, sample data, and other descriptive attributes are currently unknown.

Tags: Tabular, Machine Learning, Malware, Cybersecurity, +1 more

Malware Detection Dataset Sourced from Malware Bazaar

Malware samples acquired from the Malware Bazaar platform to create a dataset for detection tasks. The dataset's author, organization, and specific temporal coverage are not provided. The data is hosted on Kaggle and is tagged for cyber security applications.

Tags: Tabular, Text, English, Malware Analysis, Cyber Security, Malware Detection, +1 more

KDD Cup 1999: Computer Network Intrusion Detection

The dataset contains 4,898,431 connection records labeled with 23 classes: 22 attack types and a 'normal' class. Each connection has 41 features, such as protocol_type, service, and src_bytes, derived from raw TCP dump data.

Tags: Games, Business, +1 more

GitHub Pull Request Data from 24 Repositories

GitHub pull request data collected from 24 software repositories. The dataset likely contains information related to code review and collaboration processes on the platform. It is hosted on Kaggle, but specific details about its creation, size, and structure are not provided in the metadata.

Tags: Tabular, Github, Version Control, Pull Requests, Software Development, +1 more

Commit Sklearn Lib: Scikit-learn Library Version Control Data

A dataset from Kaggle titled 'Commit Sklearn Lib'. The title suggests it contains version control data, likely commit histories, related to the scikit-learn machine learning library. The dataset's specific content, size, and origin are not detailed in the provided metadata.

Tags: Tabular, Machine Learning, Scikit Learn, Version Control, Software Commits, +1 more

Malware and Benign Software Represented as Image Data

A collection of image files likely representing malware and benign software samples. The dataset is hosted on Kaggle, but details on the number of images, creation date, and author are unknown. Columns and sample data are unavailable for inspection.

Tags: Image, Cybersecurity, Computer Vision, Malware Analysis, +1 more

Phishing Dataset for Model Fine-Tuning

A dataset hosted on Kaggle, likely intended for fine-tuning machine learning models to detect phishing attempts. The title suggests it contains examples of phishing-related data, but specific content, size, and features are not detailed in the provided metadata. Further verification is required to confirm the dataset's structure and intended application.

Tags: Tabular, Machine Learning, Cybersecurity, Fine Tuning, Phishing, +1 more

Phishing Email Dataset for Security Analysis

A dataset titled 'phishing-email' is hosted on Kaggle. The dataset's content, size, and specific attributes are not described in the provided metadata. Its actual composition and scale require verification after download.

Tags: Text, Cybersecurity, Email Security, Text Classification, Phishing, +1 more

Email Dataset for Phishing Detection

An email dataset focused on phishing content, sourced from Kaggle. The dataset likely contains email text and labels for phishing classification. Metadata is minimal; specifics about size, columns, and provenance are unknown.

Tags: Text, Security, Text Classification, Phishing, +1 more

Phishing Email Dataset for Security Analysis

'Phishing dataset(email)' is a collection of email data hosted on Kaggle, likely intended for cybersecurity research. The dataset's specific content, size, and origin are not detailed in the provided metadata. Users must download the dataset to verify its structure and suitability for their tasks.

Tags: Tabular, Cybersecurity, Email Security, Phishing, +1 more

Edge-IIoT Balanced Subset for Intrusion Detection

A balanced subset of the Edge-IIoT data, likely covering cybersecurity in Industrial Internet of Things environments. The dataset is hosted on Kaggle, but specific details about its size, creator, and update date are unavailable. The columns likely describe network traffic or system event logs.

Tags: Tabular, Cybersecurity, Intrusion Detection, Network security, Iot, +1 more

Synthetic Phishing Dataset for Security Research

A synthetic phishing dataset hosted on Kaggle. The dataset likely contains simulated data for phishing detection tasks. Metadata is minimal; specifics about size, columns, and provenance are unknown.

Tags: Tabular, Machine Learning, Cybersecurity, Phishing, Synthetic, +1 more

Phishing URL Dataset for Security Model Training

A dataset titled 'phishing_url' is hosted on Kaggle. The dataset likely contains URLs labeled as legitimate or phishing for security analysis. Metadata such as column details, size, and license are currently unknown.

Tags: Tabular, Url Classification, Cybersecurity, Phishing, +1 more

Merged Phishing Dataset from Enron, Nazario, and SpamAssassin

A merged collection of email data from three sources: Enron, Nazario, and SpamAssassin. The dataset likely contains emails labeled as phishing or spam, intended for security research. It is hosted on Kaggle, but specific details about its size, structure, and creation date are unknown.

Tags: Text, Cybersecurity, Email Security, Text Classification, Phishing, +1 more

DDoS PCAPs: Network Packet Capture Files

Packet capture (PCAP) files likely containing network traffic data from Distributed Denial of Service (DDoS) attacks. The dataset is hosted on Kaggle, but details on its size, collection method, and time range are not provided in the metadata. The author, organization, and specific license are also unknown.

Tags: Time Series, Cybersecurity, Network security, Packet Capture, +1 more

Cybersecurity Dataset from Seneca

A cybersecurity dataset published on Kaggle. The title suggests it may contain network or system security data, potentially related to intrusion detection or threat analysis. The dataset's specific contents, size, and origin require verification after download.

Tags: Tabular, Cybersecurity, Intrusion Detection, Network security, +1 more

Merged Phishing Email Dataset from Enron, Nazario, and SpamAssassin

A merged collection of emails from three established sources: the Enron corpus, the Nazario phishing corpus, and the SpamAssassin public corpus. The dataset is hosted on Kaggle, but specific details like row count, file formats, and license are not provided in the metadata. Its content likely contains a mix of legitimate, spam, and phishing emails for analysis.

Tags: Text, Spam Detection, Email Security, Text Classification, Phishing, +1 more

Phishing Email Dataset

A collection of emails likely related to phishing attacks, sourced from Kaggle. The dataset's specific size, origin, and temporal coverage are unknown. It is intended for analysis of deceptive email content.

Tags: Text, Cybersecurity, Email Security, Phishing, +1 more

PULLDD: Phishing URL Low Latency Detection Dataset

A dataset for detecting phishing URLs, published on Kaggle. The specific number of records, features, and collection methodology are not detailed in the available metadata. Further details about the dataset's origin, size, and structure require verification after download.

Tags: Tabular, Cybersecurity, Phishing Detection, Network security, Url Analysis, +1 more
