Firewall log data from 2008, curated by the TabArena team for evaluating predictive models on independent and identically distributed (IID) tabular data. The dataset is intended for classification tasks and originates from a study published at the 6th International Symposium on Digital Forensic and Security (ISDFS). It is licensed under CC BY 4.0.
Use Cases
- Train classification models to detect network intrusions based on firewall log features.
- Benchmark machine learning algorithms for independent and identically distributed (IID) tabular data tasks.
- Study patterns of network traffic and security threats using historical log data.
- Evaluate the performance of multiclass support vector machines on security log classification.
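For the benchmarking use case, one common way to evaluate a classifier on IID data is a shuffled, stratified holdout split, so that each class appears in the train and test sets in roughly the same proportion. The following is a minimal sketch using only the Python standard library; the function name `stratified_split` and the label values are hypothetical and are not taken from the dataset.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        # Classes with a single row stay entirely in the training set.
        n_test = max(1, round(len(idx) * test_fraction)) if len(idx) > 1 else 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels, for illustration only.
labels = ["allow"] * 10 + ["deny"] * 5 + ["drop"] * 5
train_idx, test_idx = stratified_split(labels)
```

A fixed `seed` keeps the split reproducible across benchmark runs. A random split like this is only appropriate under the IID assumption stated for this dataset.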
Strengths
- Dataset has a clear, permissive license (CC BY 4.0).
- Feature names have been cleaned by the curators to remove whitespace and special characters.
- Dataset is explicitly intended for evaluating ML models on IID tabular classification tasks.
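The exact cleaning rules applied by the curators are not documented here. As an illustration of what this kind of feature-name cleaning might look like, the sketch below replaces each run of whitespace or special characters with a single underscore. The function `clean_feature_name` and the example column names are hypothetical.

```python
import re


def clean_feature_name(name):
    """Replace runs of non-alphanumeric characters with one underscore."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    # Drop underscores left over at either end.
    return cleaned.strip("_")


# Hypothetical raw column names, for illustration only.
raw = ["Source Port", "Bytes Sent", "pkts_received", " Elapsed Time (sec) "]
cleaned = [clean_feature_name(n) for n in raw]
```

Names normalized this way are safe to use as identifiers in most dataframe and modeling libraries. Two distinct raw names could collapse to the same cleaned name, so a uniqueness check is worth adding in practice.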
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count, sample data, and file formats are not documented, which makes it hard to assess suitability before download.
- Last update date is unknown; freshness unverified.
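Because row count and column semantics have to be established after download, a quick profiling pass can help. The sketch below assumes the downloaded file is a CSV with a header row, which is an assumption since the file format is not documented. The function `profile_csv` and the sample column names are hypothetical.

```python
import csv
import io


def _is_number(value):
    """Return True if the value parses as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def profile_csv(handle):
    """Count rows and flag which columns hold only numeric values."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    numeric = {c: True for c in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for c in columns:
            if numeric[c] and not _is_number(row[c]):
                numeric[c] = False
    # The numeric flags are only meaningful when row_count > 0.
    return {"row_count": row_count, "columns": columns, "numeric": numeric}


# Hypothetical in-memory sample standing in for the downloaded file.
sample = io.StringIO("src_port,action\n443,allow\n53,deny\n")
summary = profile_csv(sample)
```

For a real file, pass an open file handle, for example `open(path, newline="")`, in place of the in-memory sample. Columns flagged as non-numeric are candidates for categorical features or for the target label.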
Provenance
- Source: https://doi.org/10.24432/C5131M, referenced in F. Ertam and M. Kaya, 2018 ISDFS paper.
- Collection Method: Curated from original firewall log files.
- Time Range: 2008
- Freshness: Dataset year is 2008; last update date is unknown.
- Geography: Not specified.