This dataset contains source code functions, each assigned a binary security label, for identifying software vulnerabilities such as resource leaks and use-after-free errors. Functions are labeled secure (0) or insecure (1) to support the training of automated defect detection systems.
Use Cases
- Train a binary classification model to predict the target label (0 or 1) from the provided source code functions
- Benchmark the performance of code-centric language models on identifying security-critical defects
- Develop automated security auditing tools that flag potential resource leaks or use-after-free vulnerabilities in C/C++ code
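For the benchmarking use case, predictions are usually scored on the insecure class (label 1), since a missed vulnerability and a false alarm carry different costs. The following is a minimal sketch of such scoring using only the Python standard library. The function name `binary_metrics` and the choice of precision, recall, and F1 are assumptions made for illustration, not an evaluation protocol defined by the dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, and F1 for the insecure class (label 1).

    Hypothetical helper: y_true holds the dataset labels, y_pred the
    model's predictions, both as sequences of 0 (secure) / 1 (insecure).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # insecure, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # secure, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # insecure, missed

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}
```

Calling `binary_metrics(labels, predictions)` on a held-out split gives a per-model score for comparison. Accuracy alone can be misleading if the two classes are not balanced, which is why this sketch reports class-specific metrics instead.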
Strengths
- Binary classification labels where 1 represents insecure code and 0 represents secure code
- Covers specific vulnerability types including resource leaks, use-after-free, and denial-of-service (DoS) vulnerabilities
- Focuses on function-level source code snippets rather than entire files or projects
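To make the function-level, binary-labeled format concrete, the sketch below shows what a pair of records might look like and how to check the label balance. The two C functions are invented for illustration and are not drawn from the dataset. The field names `func` and `target` and the helper `label_counts` are likewise assumptions about the schema.

```python
from collections import Counter

# Invented illustration records, NOT taken from the dataset.
# Field names "func" and "target" are assumed, not confirmed by the source.
records = [
    {
        "func": (
            "int read_first_byte(const char *path) {\n"
            "    FILE *f = fopen(path, \"r\");\n"
            "    if (!f) return -1;\n"
            "    int c = fgetc(f);\n"
            "    return c;  /* f is never closed: resource leak */\n"
            "}\n"
        ),
        "target": 1,  # insecure
    },
    {
        "func": (
            "int read_first_byte(const char *path) {\n"
            "    FILE *f = fopen(path, \"r\");\n"
            "    if (!f) return -1;\n"
            "    int c = fgetc(f);\n"
            "    fclose(f);\n"
            "    return c;\n"
            "}\n"
        ),
        "target": 0,  # secure
    },
]


def label_counts(rows):
    """Count how many functions carry each label (0 = secure, 1 = insecure)."""
    return Counter(row["target"] for row in rows)


print(label_counts(records))
```

Checking the label counts before training is a cheap way to decide whether class weighting or resampling is worth considering.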