Reducing communication time through message prefetching
Abstract
The latency of large messages often leads to poor performance of parallel applications. In this paper, we investigate a novel latency reduction technique where message receivers prefetch messages from senders before the matching sends are called. When the send is finally called, only the parts of the message that have changed since the prefetch need to be transmitted, resulting in a smaller message. Our message prefetching technique initiates communication while the sender is still in the computation phase and thus overlaps computation with communication to hide part of the message latency. We implement and evaluate our technique in the context of an MPI runtime library. The results show that the execution speed of five MPI applications improves by up to 24% when message prefetching is enabled.