The Devil is in the GAN: Backdoor Attacks and Defenses in Deep Generative Models
Abstract
Deep Generative Models (DGMs) are a popular class of models which find widespread use because of their ability to synthesise data from complex, high-dimensional manifolds. However, even with their increasing industrial adoption, they have not been subjected to rigorous security analysis. In this work we examine backdoor attacks on DGMs, which can significantly limit their applicability within a model supply chain and cause massive reputational damage for companies outsourcing DGMs from third parties. DGMs are vastly different from their discriminative counterparts, and the manifestation of attacks in DGMs is largely understudied. To this end we propose three novel training-time backdoor attacks which require modest computational effort but are highly effective. Furthermore, we demonstrate their effectiveness on large-scale, industry-grade models across two different domains: images (StyleGAN) and audio (WaveGAN). Finally, we present an insightful discussion and prescribe a practical and comprehensive defense strategy for safe usage of DGMs.