Fast (Trapless) Kernel Probes Everywhere
Abstract
The ability to efficiently probe and instrument a running operating system (OS) kernel is critical for debugging, system security, and performance monitoring. While efforts to optimize the widely used Kprobes in Linux over the past two decades have greatly improved its performance, fundamental gaps remain that prevent it from being fully efficient. Specifically, we find that Kprobes is optimized for only ~80% of kernel instructions, leaving the remaining probeable kernel code to suffer the severe penalties of the double traps needed by the Kprobes implementation. In this paper, we focus on the design and implementation of an efficient and general trapless kernel probing mechanism (no hardware exceptions) that can be applied to almost all code in Linux. We discover that the main limitation of current probe optimization efforts comes from not being able to assume or change certain properties or layouts of the target kernel code. Our main insight is that by introducing strategically placed nops, thus slightly changing the code layout, we can overcome this limitation. We implement our mechanism in Linux Kprobes in a way that is transparent to users. Our evaluation shows a 10x improvement in probe performance over standard Kprobes, while providing this level of performance for 96% of kernel code.