Mapping algorithms onto a multiple-chip data-driven array
Abstract
Data-driven arrays provide high levels of parallelism and pipelining for algorithms with no internal regularity. Most previously developed methods for mapping algorithms onto processor arrays assume an unbounded array, i.e., one that always has enough processing elements (PEs) for the mapping. Implementing such an array is not practical. A more practical approach is to assign the PEs to chips and map the given algorithm onto the resulting array of chips. The authors describe a way to map algorithms directly onto a multiple-chip data-driven array in which each chip contains a limited number of PEs. The mapping involves two optimization steps. The first produces an efficient mapping by minimizing the area (i.e., the number of PEs used) and optimizing the performance (pipeline period and latency) for the given algorithm, or by finding a trade-off between area and performance. The second divides the unbounded array among several chips, each containing a bounded number of PEs.
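As a rough illustration of the second step only, the sketch below divides a rectangular PE array among chips by simple rectangular tiling and counts the data-flow edges that cross chip boundaries. The tiling rule, the function names `assign_chips` and `count_cut_edges`, and the nearest-neighbor edge set are assumptions made for this example; the abstract does not specify the partitioning method actually used.

```python
from math import ceil


def assign_chips(rows, cols, chip_rows, chip_cols):
    """Map each PE coordinate (r, c) to a chip index by rectangular tiling.

    Hypothetical helper: each chip holds at most chip_rows * chip_cols PEs.
    """
    chips_per_row = ceil(cols / chip_cols)
    return {
        (r, c): (r // chip_rows) * chips_per_row + (c // chip_cols)
        for r in range(rows)
        for c in range(cols)
    }


def count_cut_edges(edges, chip_of):
    """Count data-flow edges whose two endpoints sit on different chips."""
    return sum(1 for src, dst in edges if chip_of[src] != chip_of[dst])


# Assumed example: a 4x4 PE array split into chips of 2x2 PEs each.
chip_of = assign_chips(4, 4, 2, 2)

# Assumed communication pattern: each PE sends to its right and lower neighbor.
edges = [((r, c), (r, c + 1)) for r in range(4) for c in range(3)]
edges += [((r, c), (r + 1, c)) for r in range(3) for c in range(4)]

num_chips = len(set(chip_of.values()))
cut = count_cut_edges(edges, chip_of)
print(num_chips, cut, len(edges))
```

In this toy setting, the number of cut edges is a stand-in for inter-chip communication, which a partitioning step would typically try to keep small subject to the per-chip PE bound.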