Parallelism profiling and wall-time prediction for multi-threaded applications
Abstract
A detailed and accurate characterization of the parallelism of applications is essential for predicting their wall-time on different platforms, both for an application running in isolation and for a set of consolidated applications executing on the same platform. However, prevailing profilers are often based on sampling and do not provide exact information on the parallelism of the profiled application. In this paper we present a novel profiler that logs all thread scheduling activities within the operating system kernel. These logs enable us to accurately characterize applications' parallelism on a given platform by computing the number of threads that are active at each moment. We also present a simple mathematical prediction model to estimate wall-time for program execution on a k2-core machine using profiles collected using a k1-core machine (of the same architecture and running at the same clock speed). We use our profiler to assess the parallelism of several CPU-bound DaCapo benchmarks and evaluate the accuracy of our prediction model. © 2013 ACM.