The UCI Machine Learning Repository hosts a dataset of shell commands executed by participants during hands-on cybersecurity training exercises. The data is tabular and captures user behavior in command-line environments. Neither the original creator nor the collection timeframe is documented.
Use Cases
- Build anomaly detection models that distinguish normal from malicious command sequences, using the recorded shell command strings.
- Analyze command frequency and patterns from the user behavior data to profile typical trainee actions.
- Study the progression of command complexity across training sessions if temporal metadata is present.
- Train classifiers to predict user expertise level based on command syntax and tool usage features.
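As a rough illustration of the frequency-profiling and anomaly-detection use cases above, the sketch below counts base commands and flags lines whose base command never appears in a baseline. It assumes the commands are already available as a list of strings; the function names and the sample command lines are hypothetical and are not taken from the dataset, whose schema is undocumented.

```python
from collections import Counter
import shlex


def base_command(line: str) -> str:
    """Return the first token of a shell command line, or '' if the line is empty."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes and similar parse errors: fall back to whitespace splitting.
        tokens = line.split()
    return tokens[0] if tokens else ""


def command_frequencies(lines):
    """Count how often each base command (e.g. 'ls', 'nmap') occurs."""
    return Counter(cmd for cmd in map(base_command, lines) if cmd)


def unseen_commands(lines, baseline):
    """Return the lines whose base command does not occur in the baseline counts."""
    return [line for line in lines if base_command(line) and base_command(line) not in baseline]


# Hypothetical command lines, used only to exercise the functions.
sample = ["ls -la", "nmap -sV 10.0.0.5", "ls", "cat 'notes.txt'", 'echo "unterminated']
freqs = command_frequencies(sample)
flagged = unseen_commands(["ls /tmp", "nc -lvp 4444"], freqs)
```

Flagging unseen base commands is only a crude novelty check; a real anomaly detector would likely need sequence context and arguments as well.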
Strengths
- Data originates from a controlled, hands-on training environment, providing realistic command-line interaction context.
- Hosted on the UCI Machine Learning Repository, a recognized source for benchmark datasets.
Limitations
- The sample size and number of participants are not documented, which makes statistical reliability hard to assess.
- Lack of documented column names and data schema complicates immediate analysis.
- Likely bias toward training-specific commands, so the data may not represent general or adversarial user behavior.
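Because the column names and schema are undocumented, a first step is usually to inspect the header and see how well each column is populated. The sketch below does this with the standard library, assuming the file is delimited text with a header row; the delimiter, the function name, and the sample column names are assumptions for illustration, not documented properties of the dataset.

```python
import csv
import io


def summarize_columns(text: str, delimiter: str = ","):
    """Map each header name to the number of rows with a non-empty value in that column."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return {}  # Empty input: nothing to summarize.
    counts = {name: 0 for name in header}
    for row in reader:
        # zip stops at the shorter sequence, so short rows are tolerated.
        for name, value in zip(header, row):
            if value.strip():
                counts[name] += 1
    return counts


# Hypothetical two-column sample; the real column names are unknown.
sample_text = "session_id,cmd\ns1,ls -la\ns2,\n"
summary = summarize_columns(sample_text)
```

Columns with low non-empty counts are candidates for optional metadata, which is worth checking before relying on them for analysis.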
Provenance
- Source: UCI Machine Learning Repository
- Collection Method: Collected from shell sessions of participants during structured cybersecurity training exercises.
- Time Range: not documented
- Freshness: not documented
- Geography: not documented