Exploitation of APL data parallelism on a shared-memory MIMD machine
Abstract
Programs written in APL implicitly contain data parallelism because the high-level APL primitives denoting array operations may be executed in parallel. Our experimental APL/C compiler translates ordinary APL programs into the C language with additional parallel constructs for synchronization support. We target the RP3, a shared-memory MIMD machine built at the IBM T.J. Watson Research Center, running the Mach operating system. The compiler uses Mach kernel primitives to build a parallel run-time environment that reduces run-time overhead. We have developed a novel method for dynamically determining the required number of processors for a primitive function. The compiler can further apply a form of the loop fusion technique to groups of scalar primitive functions to gain additional improvement.
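
As a rough illustration of loop fusion applied to scalar primitives, consider an APL expression such as R←(A+B)×C on vectors of equal length. The sketch below is a minimal example in C and is not the compiler's actual output: the function names are hypothetical, the element type is assumed to be double, and the parallel constructs and synchronization are omitted. It shows only how two element-wise primitives can be merged into one loop.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused form: each scalar primitive runs as its own loop, and the
 * intermediate result A+B is written to a temporary array. */
void add_then_mul_unfused(const double *a, const double *b, const double *c,
                          double *tmp, double *r, size_t n)
{
    assert(n == 0 || (a && b && c && tmp && r));
    for (size_t i = 0; i < n; i++)
        tmp[i] = a[i] + b[i];      /* primitive 1: A+B */
    for (size_t i = 0; i < n; i++)
        r[i] = tmp[i] * c[i];      /* primitive 2: (A+B) x C */
}

/* Fused form: both primitives are applied per element in a single loop,
 * so no temporary array is needed and the data is traversed once. */
void add_then_mul_fused(const double *a, const double *b, const double *c,
                        double *r, size_t n)
{
    assert(n == 0 || (a && b && c && r));
    for (size_t i = 0; i < n; i++)
        r[i] = (a[i] + b[i]) * c[i];
}
```

Both functions compute the same result. On a parallel machine the fused form would presumably also need only one synchronization point for the group of primitives instead of one per primitive, which is the kind of saving the abstract alludes to.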