AndroMD Android Malware Dataset
Description
The AndroMD dataset is a large-scale Android malware dataset comprising 600,298 apps: 397,814 legitimate and 202,484 malicious samples, collected between 2010 and 2023 from sources such as AndroZoo and VirusShare. It consists of three subsets: KeyCount (the frequency of suspicious code tokens), ZeroOne (the binary presence or absence of malicious indicators), and MNF (permissions declared in AndroidManifest.xml). Features were extracted through automated decompilation and static analysis, focusing on behavior patterns such as HTTP connections, file access, encryption, and the use of sensitive permissions. The dataset is designed for training and evaluating machine learning models for malware detection, and it is a core component of the AndroMD framework's classification and live detection capabilities.
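
To make the three subsets concrete, the following is a minimal sketch of how KeyCount-style, ZeroOne-style, and MNF-style features could be derived from one decompiled app. It is an illustration only, not the AndroMD extraction pipeline: the token list `SUSPICIOUS_TOKENS` and the function names `key_count`, `zero_one`, and `manifest_permissions` are hypothetical, and the manifest is assumed to be already decoded to plain-text XML.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical token vocabulary; the dataset's actual token list is not given here.
SUSPICIOUS_TOKENS = ["HttpURLConnection", "FileOutputStream", "Cipher", "SmsManager"]

# Standard XML namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def key_count(source, tokens=SUSPICIOUS_TOKENS):
    """KeyCount-style vector: how often each token occurs as an identifier."""
    counts = Counter(re.findall(r"[A-Za-z_]\w*", source))
    return [counts[t] for t in tokens]


def zero_one(source, tokens=SUSPICIOUS_TOKENS):
    """ZeroOne-style vector: 1 if the token occurs at all, otherwise 0."""
    return [1 if c > 0 else 0 for c in key_count(source, tokens)]


def manifest_permissions(manifest_xml):
    """MNF-style feature: sorted permission names from a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    names = set()
    for element in root.iter("uses-permission"):
        name = element.get("{%s}name" % ANDROID_NS)
        if name:
            names.add(name)
    return sorted(names)


# Tiny made-up inputs standing in for decompiled code and a decoded manifest.
sample_source = (
    "HttpURLConnection conn = (HttpURLConnection) url.openConnection();\n"
    'Cipher c = Cipher.getInstance("AES");\n'
)
sample_manifest = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android" '
    'package="com.example.app">'
    '<uses-permission android:name="android.permission.READ_SMS"/>'
    '<uses-permission android:name="android.permission.INTERNET"/>'
    "</manifest>"
)

print(key_count(sample_source))               # [2, 0, 2, 0]
print(zero_one(sample_source))                # [1, 0, 1, 0]
print(manifest_permissions(sample_manifest))  # INTERNET, then READ_SMS
```

Under these assumptions, the ZeroOne vector is simply the KeyCount vector thresholded at zero, so the two subsets can share one vocabulary while offering different levels of detail to a classifier.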