DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Software Engineering & Security Datasets | DataSalon

Software Engineering & Security

Source code corpora, bug reports, vulnerability databases, network intrusion detection, malware samples

2,230 datasets

VUDENC: GitHub Commits with Diff Files

VUDENC is a dataset of software commits collected from GitHub, including the associated diff files that show code changes. The data is likely used for research in software engineering, such as analyzing commit patterns or detecting vulnerabilities. It is authored by Laura Wartschinski from Humboldt-Universität zu Berlin and is available under an Open Access license.

Text, Software Engineering, Version Control, Code Diffs, GitHub Commits, Information Retrieval, +1

Magnetic Field Configurations for Precise Quasisymmetry

Matt Landreman from the University of Maryland, College Park created this data archive to support the paper "Magnetic fields with precise quasisymmetry". The archive contains data and source code used for the research. The specific volume, format, and temporal scope of the data are not detailed in the provided metadata.

Tabular, Magnetic Fields, Fusion Energy, Plasma Physics, Quasisymmetry, +1

Aquatic Substrate Spectral Library for Adelaide Coastal Waters, 2003

Adelaide coastal waters are the focus of this spectral library for aquatic substrates, hosted in the National Spectral Database. The data was collected as part of the Adelaide Coastal Waters Study, with a final technical report published in 2007 by David Blackburn Environmental Pty Ltd and CSIRO Land and Water. The record provides access to source data for remote sensing studies of marine and coastal features.

Geospatial, 🇦🇺 Australia, Coastal Waters, Marine Science, Spectral Library, +1

BMR Marine Program Report: Forward Marine Program Committee Findings

A legacy report from the Bureau of Mineral Resources (BMR) committee outlining a forward marine program. The report is published by the Australian Ocean Data Network on data.gov.au. No abstract or detailed metadata is available for this product.

Text, 🇦🇺 Australia, Marine Science, Government Report, Ocean Policy, +1

Serum sST2 and sIL1RAcP Levels in Lean and Obese Individuals

Serum levels of soluble ST2 (sST2) and soluble IL1RAcP (sIL1RAcP) measured in lean and obese individuals. The dataset includes measurements in ng/ml along with age, sex, and BMI for each sample. Data was collected by Henry McSorley from the Edinburgh Adipose Tissue Biobank and last updated in June 2026.

Tabular, Excel, Clinical Measurements, Biomarkers, Obesity, Health Data, Serum Analysis, +1

GLA Expenditure Over £250: Greater London Authority Spending Records

Monthly reports from June 2012 onward detail all Greater London Authority (GLA) expenditures exceeding £250, including VAT. The data is published by the GLA to provide financial transparency, with reports available in CSV and PDF formats. The reporting threshold was previously £1,000 until summer 2010, then £500 until June 2012.

Tabular, Financial Transparency, Government Expenditure, Public Spending, Finance, Greater London Authority, +1

London Tube Network Performance Metrics from 2003/04 to 2010/11

Greater London Authority data on the performance of the London Underground between 2003/04 and 2010/11, used to inform a London Assembly Transport Committee report. The dataset includes metrics on delays, customer satisfaction, service operated, and journey times. The data was published in September 2011.

Tabular, Time Series, London, Public Transit, Transportation, Performance Metrics, +1

NOAA GW Test Cases: Observations for Global Weather Model Development

A collection of observations necessary to run the National Weather Service's Global Workflow (GW) test cases. The data includes non-restricted observations in BUFR and IODA formats, experimental atmospheric and marine observations, aerosol emission data, and GFS analysis restarts. The workflow drives major NOAA global models like the Global Forecast System (GFS) and Global Ensemble Forecast System (GEFS).

Time Series, Geospatial, Atmospheric Data, Numerical Weather Prediction, Global Workflow, Meteorological Observations, Weather Forecasting, +1

JIUTIAN-TReB: A Multi-Dimensional Table Reasoning Benchmark with 7,790 Test Cases

JT-LM created the TReB dataset to evaluate large language models on table reasoning, comprehension, and processing. It contains 7,790 high-quality test cases spanning a spectrum from fundamental language understanding to advanced data analysis. The dataset was last updated on June 17, 2026.

Tabular, Multi Task Benchmark, Benchmark, Language Model Evaluation, Table Reasoning, Hierarchical Evaluation, +1

Exploit Database: 1,400 Curated Cybersecurity Vulnerabilities from 2021-2025

1,400 curated entries of cybersecurity vulnerabilities designed for training a Red Team GPT model. The dataset contains detailed records from 2021 to 2025, sourced from Exploit-DB, CVE details, and recent web sources like the CISA KEV catalog and The Hacker News. It was created by author hardik994 and last updated on the platform in June 2026.

Tabular, Cybersecurity, Exploit Database, Vulnerability Data, Penetration Testing, +1

Aquatic Substrate Spectral Library for Australian Coastal Waters, 2001

An Australian National Spectral Database record for aquatic substrate data collected in 2001. The dataset is part of a remote sensing study cited in a 2007 technical report for the Adelaide Coastal Waters Study. It is hosted by the Australian Ocean Data Network and was last updated in June 2026.

Geospatial, 🇦🇺 Australia, Spectral Library, Aquatic Substrate, Marine Coastal, +1

Cybersecurity Governance Literature in the Global South, 947 Articles 2000–2024

947 peer-reviewed articles on cybersecurity governance, published between 2000 and 2024 and retrieved from Scopus. The dataset supports bibliometric analysis of publication trends, thematic clusters, and collaboration networks. Data were processed using VOSviewer for co-citation and keyword co-occurrence analysis.

Tabular, Excel, Cybersecurity Policy, Bibliometrics, Global South, Research Literature, +1

UK Office of Rail and Road High-Value Supplier Payments Over £25,000

Monthly records of payments exceeding £25,000 made by the UK's Office of Rail and Road to its suppliers. This data is published as part of the UK Government's commitment to transparency in public expenditure. The dataset is licensed under the Open Government Licence (OGL-UK-3.0) and was last updated in May 2026.

Tabular, CSV, Government Spending, Rail Transport, Public Procurement, Transparency, +1

SSGIE-KFCM: Source Code for Hyperspectral Band Selection Algorithm

Source code for an algorithm that selects optimal bands from hyperspectral remote sensing images. The code implements a two-stage optimization framework using kernel fuzzy C-means clustering and an improved firefly algorithm. Author Dandan He published the code on figshare in April 2026.

Tabular, Geospatial, Excel, Algorithm Source Code, Computer Vision, Algorithm, Hyperspectral Imagery, Band Selection, +1

NSW Mining Coalfields Boundaries and Areas

Five defined coalfields in New South Wales, Australia, showing the names and areas containing large coal deposits. This spatial dataset was defined by the Standing Committee on Coalfield Geology in 1976 and is published by the Department of Regional New South Wales. It was last updated on 2026-05-13.

Geospatial, Coalfields, Natural Resources, NSW Australia, Mining, +1

Sources for Gamification of Software Testing Systematic Review

The sources and analysis results used in the systematic review paper 'Gamification of Software Testing - an MLR'. Mika Mäntylä authored the paper, which was presented at the 17th International Conference on Product-Focused Software Process Improvement (PROFES) in November 2016. The dataset likely contains the raw data from the literature review process.

Tabular, Systematic Review, Software Engineering, Gamification, Software Testing, +1

CDSB - 2026 Small Business Grants Recipients and Funding

2026 grant recipients and funding commitments from the Customer Services, Open Data and Small and Family Business (CDSB) Grants Programs in Queensland, Australia. The data was published by the Department of Customer Services, Open Data and Small and Family Business and last updated in May 2026. Grant funding amounts reflect the commitments at the time of the award announcement.

Tabular, 🇦🇺 Australia, CSV, Small Business, Public Funding, Government Grants, +1

RENA: Quebec Register of Businesses Ineligible for Public Contracts

The Register of Businesses Ineligible for Public Contracts (RENA) records Quebec businesses that have committed offences under Schedule 1 of the Act respecting contracts by public bodies. It also registers companies that have been refused authorization, or had it revoked, by the Autorité des marchés publics. The Government and Municipalities of Québec maintain this register, last updated on 2026-04-17.

Tabular, XML, JSON, Excel, Business Registry, Public Contracts, +1

Malware Source Code and Decompiled C Pseudocode from Vxunderground

A curated dataset of malware source code and C-like pseudocode obtained through automated decompilation using Ghidra. The dataset is intended for malware analysis, program analysis research, and machine learning-based detection. It was created by author 'qwewerewq' and last updated on 2026-06-14.

Text, Cybersecurity, Source Code, Malware Analysis, Reverse Engineering, +1

RTMLib Models: Computer Vision Model Weights from OpenMMLab

A collection of RTMLib model weights, likely for computer vision tasks such as pose estimation, stored locally by the author. The dataset was uploaded on 2026-06-08 following the shutdown of OpenMMLab servers. The author, DavidPagnon, notes that some models are missing and requests community help to locate them.

Multimodal, RTMLib, Computer Vision, OpenMMLab, Pose Estimation, Model Weights, +1
