An analytical model for loop tiling and its solution
Abstract
The authors address the problem of estimating the performance of loop tiling, an important program transformation for improved memory hierarchy utilization. We introduce an analytical model for estimating the memory cost of a loop nest as a rational polynomial in tile size variables. We also present a constant-time algorithm for finding an optimal solution to the model (i.e., for selecting optimal tile sizes) for the case of doubly nested loops. This solution can be applied to tiling of three loops by performing an iterative search on the value of the first tile size variable, and using the constant-time algorithm at each point in the search to obtain optimal tile size values for the remaining two loops. Our solution is efficient enough to be used in production-quality optimizing compilers, and has been implemented in the IBM XL Fortran product compilers. This solution can also be used by processor designers to efficiently predict the performance of a set of tiled loops for a range of memory hierarchy parameters.