Graph Representation Learning based Vulnerable Target Identification in Ransomware Attacks
Abstract
The increased digitization of commercial and consumer workflows, accelerated cloud adoption, and the growing sophistication of cyber criminals have made ransomware a major cyber threat to cloud and data services. While malware detection research can be partially adapted to ransomware, infection patterns specific to ransomware can be leveraged to improve detection efficiency. In this paper, we focus on identifying vulnerable targets in ransomware attacks, aiming to accelerate the ransomware detection process and to enable better design of data backup policies. Specifically, we make three contributions. First, we characterize the lexical features and hierarchical file structure features of ransomware-infected files and folders. Second, we model the data backup as an attributed tree graph, learn a new feature representation of the nodes with graph neural networks, and train a classifier on the new features. Third, using real-world snapshot backup instances, we demonstrate the superior performance of the graph representation learning based approach over several baselines. Our findings suggest that, compared to traditional full-scan approaches, identifying vulnerable ransomware attack targets can make ransomware detection more efficient by focusing inspection on the most vulnerable data in the backups. Our method can also be easily integrated into existing ransomware detection systems for accelerated cyber resiliency.
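To make the second contribution more concrete, the following is a minimal sketch of how a backup's file listing could be turned into an attributed tree and passed through one round of neighbor aggregation. It is an illustration under stated assumptions, not the method evaluated in this paper. The function names (`char_entropy`, `build_tree`, `aggregate`) and the four node attributes (name length, name entropy, depth, file flag) are hypothetical choices. The unweighted mean aggregator stands in for a trained graph neural network layer, which would additionally apply learned weights and a nonlinearity.

```python
import math
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def char_entropy(name):
    """Shannon entropy (bits per character) of a file or folder name."""
    if not name:
        return 0.0
    total = len(name)
    counts = Counter(name)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def build_tree(file_paths):
    """Build an attributed tree from POSIX-style file paths.

    Returns (features, neighbors): features maps each node to a list of
    hypothetical lexical/structural attributes, and neighbors maps each
    node to its parent and children.
    """
    features = {}
    neighbors = defaultdict(set)
    for raw in file_paths:
        path = PurePosixPath(raw)
        # Visit the file itself and every ancestor folder.
        for node in [path, *path.parents]:
            key = str(node)
            if key in (".", "/"):
                continue  # skip the implicit root
            is_file = 1.0 if node == path else 0.0
            # Assumed attributes: name length, name entropy, depth, file flag.
            features.setdefault(key, [
                float(len(node.name)),
                char_entropy(node.name),
                float(len(node.parts)),
                is_file,
            ])
            parent = str(node.parent)
            if parent not in (".", "/"):
                neighbors[key].add(parent)
                neighbors[parent].add(key)
    return features, neighbors


def aggregate(features, neighbors):
    """One untrained aggregation round: own features + mean neighbor features."""
    embeddings = {}
    for key, own in features.items():
        nbrs = [features[n] for n in neighbors.get(key, ())]
        if nbrs:
            mean = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        else:
            mean = [0.0] * len(own)
        embeddings[key] = own + mean  # list concatenation
    return embeddings


if __name__ == "__main__":
    paths = ["docs/reports/q1.xlsx", "docs/reports/q2.xlsx", "docs/notes.txt"]
    feats, nbrs = build_tree(paths)
    for node, vector in sorted(aggregate(feats, nbrs).items()):
        print(node, [round(v, 2) for v in vector])
```

In this sketch, the resulting per-node vectors would be the input to a downstream classifier that labels files and folders as vulnerable or not. The classifier and the labeled training data are outside the scope of the example.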