A 2021 dataset curated for the TabArena Tabular ML IID Study, intended for evaluating classification models on independent and identically distributed data. It originates from academic research on Android malware detection using native and custom permission features. The TabArena team removed a high proportion of duplicate rows present in the original data.
The curators deduplicated the dataset heavily (75% duplicates removed), so its size and distribution differ significantly from those of the original source.
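To illustrate what removing duplicate rows does to a table's size, here is a minimal sketch in plain Python. It is not the TabArena team's actual procedure, which this page does not describe. The helper name `drop_exact_duplicates`, the toy rows, and the column meanings are all hypothetical, and the sketch assumes "duplicate" means an exact match across every column, label included.

```python
from typing import Hashable, Sequence


def drop_exact_duplicates(
    rows: Sequence[tuple[Hashable, ...]],
) -> tuple[list[tuple[Hashable, ...]], float]:
    """Keep the first occurrence of each row; also return the fraction of rows removed.

    Hypothetical helper: assumes a duplicate is an exact match on all columns.
    """
    # dict.fromkeys preserves insertion order, so the first occurrence wins.
    deduped = list(dict.fromkeys(rows))
    removed_fraction = 1.0 - len(deduped) / len(rows) if rows else 0.0
    return deduped, removed_fraction


# Made-up toy data: (uses_internet_permission, uses_sms_permission, label).
toy_rows = [
    (1, 0, 0),
    (1, 0, 0),
    (1, 0, 0),
    (0, 1, 1),
    (1, 0, 0),
    (1, 0, 0),
    (1, 0, 0),
    (1, 0, 0),
]

deduped, removed_fraction = drop_exact_duplicates(toy_rows)
print(len(toy_rows), len(deduped), removed_fraction)
```

In this toy table the class balance moves from 7:1 to 1:1 after deduplication, which is the kind of distribution shift the note above warns about. With pandas, `DataFrame.drop_duplicates()` performs the same exact-row deduplication on a data frame.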