Optimum power/performance pipeline depth
Abstract
The impact of pipeline length on both the power and performance of a microprocessor is explored both theoretically and by simulation. A theory is presented for a wide range of power/performance metrics, BIPSm/W. The theory shows that the more important power is to the metric, the shorter the optimum pipeline length that results. For typical parameters neither BIPS/W nor BIPS2/W yield an optimum, i.e., a non-pipelined design is optimal. For BIPS3/W the optimum, averaged over all 55 workloads studied, occurs at a 22.5 FO4 design point, a 7 stage pipeline, but this value is highly dependent on the assumed growth in latch count with pipeline depth. As dynamic power grows, the optimal design point shifts to shorter pipelines. Clock gating pushes the optimum to deeper pipelines. Surprisingly, as leakage power grows, the optimum is also found to shift to deeper pipelines. The optimum pipeline depth varies for different classes of workloads: SPEC95 and SPEC2000 integer applications, traditional (legacy) database and on-line transaction processing applications, modern (e.g. Web) applications, and floating point applications.