MARVIN: Efficient and Comprehensive Mobile App Classification through Static and Dynamic Analysis
Abstract
Android dominates the smartphone operating system market and consequently has attracted the attention of malware authors and researchers alike. Despite the considerable number of proposed malware analysis systems, comprehensive and practical malware analysis solutions are scarce and often short-lived. Systems relying on static analysis alone struggle with increasingly popular obfuscation and dynamic code loading techniques, while purely dynamic analysis systems are prone to analysis evasion. We present MARVIN, a system that combines static with dynamic analysis and which leverages machine learning techniques to assess the risk associated with unknown Android apps in the form of a malice score. MARVIN performs static and dynamic analysis, both off-device, to represent properties and behavioral aspects of an app through a rich and comprehensive feature set. In our evaluation on the largest Android malware classification data set to date, comprised of over 135,000 Android apps and 15,000 malware samples, MARVIN correctly classifies 98.24% of malicious apps with less than 0.04% false positives. We further estimate the necessary retraining interval to maintain the detection performance and demonstrate the long-term practicality of our approach.