Analog-AI: Hardware Acceleration for Deep Neural Network Inference
Abstract
Analog non-volatile memory (NVM)-based accelerators for Deep Neural Networks (DNNs) can achieve high-throughput, energy-efficient multiply-accumulate (MAC) operations by exploiting massively parallel analog compute, implemented with Ohm’s law and Kirchhoff’s current law on arrays of resistive memory devices. Competitive end-to-end DNN accuracies can be obtained if weights are accurately programmed onto NVM devices and MAC operations are sufficiently linear. This presentation discusses the origin of this opportunity as well as the challenges inherent in delivering on it. We briefly discuss materials and devices for analog volatile and non-volatile memory, as well as circuit and architecture choices and challenges, before describing a 14-nm inference chip consisting of multiple 512 × 512 tiles (crossbar arrays) of Phase Change Memory (PCM) devices. Multiple DNN benchmarks are implemented using a circuit-switched 2D mesh for massively parallel data transport between the tiles.
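The in-memory MAC principle described above can be sketched numerically: weights are stored as device conductances, inputs are applied as voltages, and the physics of the array performs the multiply (Ohm's law, per device) and the accumulate (Kirchhoff's current law, per column) in a single step. The following is a minimal conceptual model, not the chip's actual implementation; the function names, conductance values, and the Gaussian programming-error model are illustrative assumptions.

```python
import numpy as np

def crossbar_mac(G, V):
    """Ideal analog MAC on a crossbar.

    G: conductance matrix holding the layer weights (one device per weight).
    V: input activations applied as read voltages.
    Each device contributes current I_ij = G_ij * V_j (Ohm's law); summing
    currents along a column (Kirchhoff's current law) yields the MAC result
    I_i = sum_j G_ij * V_j for all outputs in parallel.
    """
    return G @ V

def noisy_crossbar_mac(G, V, sigma=0.02, rng=None):
    """Same MAC with additive Gaussian conductance error, a toy model of
    imperfect weight programming onto the NVM devices."""
    rng = np.random.default_rng(0) if rng is None else rng
    G_programmed = G + rng.normal(0.0, sigma, size=G.shape)
    return G_programmed @ V

# Tiny 2x2 example with hypothetical conductances and voltages.
G = np.array([[0.2, 0.5],
              [0.1, 0.3]])
V = np.array([1.0, 0.5])
print(crossbar_mac(G, V))        # ideal result: [0.45, 0.25]
print(noisy_crossbar_mac(G, V))  # perturbed result, illustrating accuracy loss
```

In a real tile, every row-column crossing computes its product simultaneously, so a 512 × 512 array performs the full matrix-vector multiply in one analog step; the noisy variant hints at why accurate weight programming matters for end-to-end accuracy.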