On limitations of network acceleration
Abstract
The performance of large-scale data-intensive applications running on thousands of machines depends considerably on the performance of the network. To deliver better application performance on rapidly evolving high-bandwidth, low-latency interconnects, researchers have proposed the use of network accelerator devices. However, despite the initial enthusiasm, translating network accelerator's capabilities into high application performance remains a challenging issue. In this paper, we describe our experience and discuss issues that we uncover with network acceleration using Remote Direct Memory Access (RDMA) capable network controllers (RNICs). RNICs offload the complete packet processing into network controllers, and provide direct userspace access to the networking hardware. Our analysis shows that multiple (un)related factors significantly influence the performance gains for the end-application. We identify factors that span the whole stack, ranging from low-level architectural issues (cache and DMA interaction, hardware pre-fetching) to the high-level application parameters (buffer size, access pattern). We discuss implications of our findings upon application performance and the future of integration of network acceleration technology within the systems. Copyright 2013 ACM.