Defending Against Adversarial Attacks Using Denoising Autoencoders

2020-04-24T12:55:38Z (GMT) by Rehana Mahfuz
Gradient-based adversarial attacks on neural networks threaten critical applications such as medical diagnosis and biometric authentication. These attacks use the gradient of the neural network to craft imperceptible perturbations that are added to the test data in an attempt to decrease the accuracy of the network. We propose a defense against such attacks, which can be modified to reduce the training time of the network by as much as 71%, and further modified to reduce the training time of the defense by as much as 19%. We also address the threat of uncertain behavior on the part of the attacker, a threat previously overlooked in a literature that mostly considers white-box scenarios. To combat uncertainty on the attacker's part, we train our defense with an ensemble of attacks, each generated with a different attack algorithm and using gradients of distinct architecture types. Finally, we discuss how we can prevent an attacker from breaking the defense by estimating the gradient of the defense transformation.
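
As a rough illustration of what a gradient-based attack looks like, the sketch below applies a single signed-gradient step (in the style of the fast gradient sign method) to a toy logistic-regression classifier. The abstract does not name a specific attack or model, so the choice of attack, the weights `w` and `b`, the input `x`, and the budget `epsilon` are all assumptions made for illustration; this is not the thesis's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def signed_gradient_attack(w, b, x, y, epsilon):
    """Move each input feature by epsilon in the direction that raises the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input, chosen only for illustration.
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.1, 0.2]
y = 1          # true label
epsilon = 0.25  # maximum change allowed per feature

x_adv = signed_gradient_attack(w, b, x, y, epsilon)
p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)

print("clean probability of class 1:      ", p_clean)
print("adversarial probability of class 1:", p_adv)
```

In this toy setting, a perturbation bounded by `epsilon` per feature is enough to push the predicted probability of the true class below 0.5. Real attacks on deep networks follow the same idea but obtain the input gradient through automatic differentiation rather than a closed form.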