Implementing Adversarial Example Attacks with FGSM
The Linear Explanation of Adversarial Examples
Digital images typically encode each pixel with 8 bits, so any information smaller than 1/255 is discarded. Let the original image be $\bm{x}$ and the perturbation noise be $\bm{\eta}$; the perturbed image is then:
$\tilde{\bm{x}} = \bm{x} + \bm{\eta}$
If $\bm{\eta}$ is smaller than the precision of the features, it would be unreasonable for the classifier to respond differently. Formally, for well-separated classes, we expect the classifier to assign the same class to $\tilde{\bm{x}}$ and $\bm{x}$ as long as the max norm satisfies $\|\bm{\eta}\|_{\infty} < \epsilon$, where $\|\bm{\eta}\|_{\infty} = \max(|\eta_{1}|, |\eta_{2}|, \ldots, |\eta_{n}|)$ and $\epsilon$ is small enough to be imperceptible. Now consider the dot product between a weight vector $\bm{w}$ and the adversarial example $\tilde{\bm{x}}$:
$\bm{w}^{T}\tilde{\bm{x}} = \bm{w}^{T}\bm{x} + \bm{w}^{T}\bm{\eta}$

The perturbation $\bm{\eta}$ grows the activation by $\bm{w}^{T}\bm{\eta}$, and this growth is maximized under the max-norm constraint by choosing $\bm{\eta} = \epsilon\,\mathrm{sign}(\bm{w})$: since per-element changes within $\pm\epsilon$ are imperceptible, the sign function yields the largest allowable perturbation. If $\bm{w}$ has $n$ dimensions and the average magnitude of a weight element is $m$, the dot product increases the activation by $\epsilon m n$. Although $\epsilon$ stays constant, $n$ grows linearly with the dimensionality of the input space, so in high dimensions an infinitesimally small change to each input element can produce a large change in the output.
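This linear growth is easy to verify numerically. Below is a minimal numpy sketch (the dimensions, the random weights, and the value of $\epsilon$ are arbitrary illustrative choices): the max norm of the perturbation stays fixed at $\epsilon$, yet the induced change in the activation grows linearly with $n$.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.007  # max-norm budget, constant across dimensions
for n in (10, 1_000, 100_000):       # input dimensionality
    w = rng.normal(size=n)           # random weight vector
    eta = eps * np.sign(w)           # worst-case perturbation, ||eta||_inf = eps
    m = np.abs(w).mean()             # average weight magnitude
    # the activation change w^T eta equals eps * sum(|w_i|) = eps * m * n
    print(n, w @ eta, eps * m * n)
```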
Linear Perturbation of Non-linear Models
The figure above shows a fast adversarial example applied to GoogLeNet. By adding a small vector imperceptible to humans, whose elements are $\epsilon$ times the sign of the gradient of the loss with respect to the input pixels, the prediction changes. Here $\epsilon = 0.007$, which corresponds to the magnitude of the smallest bit of an 8-bit image encoding after GoogLeNet's conversion to real numbers. After the noise is added, the panda is classified as a gibbon with 99.3% confidence; hence the name fast gradient sign method, abbreviated FGSM.
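Concretely, FGSM chooses the perturbation as the sign of the gradient of the loss $J(\bm{\theta},\bm{x},y)$ with respect to the input, scaled by $\epsilon$:

$\bm{\eta} = \epsilon\,\mathrm{sign}(\nabla_{\bm{x}} J(\bm{\theta},\bm{x},y)), \qquad \tilde{\bm{x}} = \bm{x} + \bm{\eta}$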
Implementing the FGSM algorithm:
```python
import torch
import torch.nn.functional as F


class Attack(object):
    def __init__(self, net):
        self.net = net
        self.criterion = F.cross_entropy

    def fgsm(self, x, y, eps=0.03, x_val_min=-1, x_val_max=1):
        # work on a detached copy so the caller's tensor is left untouched
        x_adv = x.clone().detach()
        x_adv.requires_grad = True
        logits = self.net(x_adv)
        # negate the loss: descending on -J(theta, x, y) ascends the true loss
        cost = -self.criterion(logits, y)
        self.net.zero_grad()
        if x_adv.grad is not None:
            x_adv.grad.data.fill_(0)
        cost.backward()
        # x_adv = x + eps * sign(grad_x J(theta, x, y))
        x_adv = x_adv - eps * x_adv.grad.sign()
        x_adv = torch.clamp(x_adv, x_val_min, x_val_max)
        return x_adv.detach()
```
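A minimal usage sketch of this class (net, x, and y are hypothetical placeholders for a trained classifier and a batch of inputs and labels):

```python
# hypothetical usage: net is a trained nn.Module, (x, y) a batch from a loader
attacker = Attack(net)
x_adv = attacker.fgsm(x, y, eps=0.03)  # one FGSM step per sample
fooled = (net(x_adv).argmax(dim=1) != y).float().mean()
print('fraction of flipped predictions:', fooled.item())
```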
Attacking a pre-trained model with foolbox

Installation: pip install foolbox==2.0.0
You need to provide an image from the ImageNet validation set, named val.jpeg, together with its ground-truth label. Here a PyTorch pre-trained ResNet-18 model is attacked.
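The single-image attack script itself is not shown in the original post; the following is a minimal sketch against the foolbox 2.0 API (the image-loading code, the ImageNet preprocessing constants, and the label value 0 are assumptions):

```python
import foolbox
import numpy as np
import torchvision.models as models
from PIL import Image

# pre-trained ResNet-18 in eval mode (add .cuda() on a GPU machine,
# as in the full dataset example further below)
model = models.resnet18(pretrained=True).eval()
mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
fmodel = foolbox.models.PyTorchModel(model, bounds=(0, 1), num_classes=1000,
                                     preprocessing=(mean, std))

# load val.jpeg as a (3, 224, 224) float32 array scaled to [0, 1]
image = np.asarray(Image.open('val.jpeg').resize((224, 224)),
                   dtype=np.float32).transpose((2, 0, 1)) / 255
label = 0  # assumed ground-truth label of val.jpeg

print('True label', label,
      'predicted class', np.argmax(fmodel.forward_one(image)))
attack = foolbox.attacks.FGSM(fmodel)
adversarial = attack(image, label)
print('adversarial class', np.argmax(fmodel.forward_one(adversarial)))
np.save('adversarial.npy', adversarial)  # reused by the visualization below
```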
Output:
True label 0 predicted class 0 adversarial class 997

Here is an explanation of the attack object's parameters:
__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000, max_epsilon=1)
 Parameters:
- input_or_adv (numpy.ndarray or Adversarial): The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
- label (int): The reference label of the original input. Must be passed if input_or_adv is a numpy.ndarray; must not be passed if it is an Adversarial instance.
- unpack (bool): If true, returns the adversarial input; otherwise returns the Adversarial object.
- epsilons (int or Iterable[float]): Either an iterable of step sizes in the direction of the sign of the gradient, or the number of step sizes between 0 and max_epsilon that should be tried.
- max_epsilon (float): Largest step size if epsilons is not an iterable.
If epsilons is an int, the step sizes are constructed internally as

```python
epsilons = np.linspace(0, max_epsilon, num=epsilons + 1)[1:]
```

Alternatively, epsilons can be an iterable of floats, for example:
```python
epsilons = [0.1, 0.2, 0.3]
```

FGSM tries the epsilons from small to large until it finds one whose attack succeeds. If the attack fails for every epsilon, an array filled with nan is returned. To attack with a single epsilon value, pass a one-element list such as epsilons=[0.1] together with max_epsilon=0, i.e.:
```python
adversarial = attack(image, label, epsilons=[0.001], max_epsilon=0)
```

Running the attack over the entire dataset
```python
import os

import foolbox
import numpy as np
import torch
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

os.environ["CUDA_VISIBLE_DEVICES"] = '1,2'

# instantiate the model
model = models.resnet101(pretrained=True).cuda().eval()
model = torch.nn.DataParallel(model)
mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
fmodel = foolbox.models.PyTorchModel(model, bounds=(0, 1), num_classes=1000,
                                     preprocessing=(mean, std))

# source images and labels from the ImageNet validation set
val_dir = '/home/ws/winycg/imagenet/ILSVRC2012_img_val/'
val_dataset = datasets.ImageFolder(val_dir, transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
]))
val_loader = torch.utils.data.DataLoader(
    val_dataset, batch_size=100, shuffle=False,
    num_workers=16, pin_memory=torch.cuda.is_available())

# apply the attack to the source images
attack = foolbox.attacks.FGSM(fmodel)


def adversarial_num(output):
    """Count samples the attack failed on: foolbox returns an all-nan array
    for a failed attack, so these samples remain correctly classified."""
    correct_num = 0
    for i in range(output.size(0)):
        if torch.isnan(output[i])[0, 0, 0]:
            correct_num += 1
    return correct_num


def acc_num(output, target, topk=(1,)):
    """Count the number of correct predictions for the specified values of k."""
    maxk = max(topk)
    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    number = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0).item()
        number.append(correct_k)
    return number


total = 0
ori_top1_correct = 0
adv_top1_correct = 0

for batch_idx, (inputs, targets) in enumerate(val_loader):
    # clean top-1 accuracy of the undefended model
    ori_output = fmodel.forward(inputs.numpy())
    ori_top1_correct += acc_num(torch.from_numpy(ori_output), targets, (1,))[0]
    # adversarial accuracy under a single epsilon of 0.3
    adversarial_input = attack(inputs.numpy(), targets.numpy(),
                               epsilons=[0.3], max_epsilon=0)
    adv_top1_correct += adversarial_num(torch.from_numpy(adversarial_input))
    total += targets.size(0)

ori_top1_acc = ori_top1_correct / total
adv_top1_acc = adv_top1_correct / total
print('original top1 accuracy:', ori_top1_acc)
print('adversarial top1 accuracy:', adv_top1_acc)
```

Visualizing the perturbed sample and the sign map

The snippet below loads the adversarial example saved earlier as adversarial.npy and compares it with the original image.
```python
import cv2
import matplotlib.pyplot as plt
import numpy as np

plt.subplot(1, 3, 1)
plt.title('Original')
image_path = 'val.JPEG'
image = cv2.imread(image_path)
# BGR -> RGB, and division by 255 to convert [0, 255] to [0, 1]
image = cv2.resize(image, (224,) * 2)[..., ::-1] / 255
plt.imshow(image)
plt.axis('off')

plt.subplot(1, 3, 2)
plt.title('Adversarial')
adversarial = np.load('adversarial.npy').transpose((1, 2, 0))  # CHW -> HWC
plt.imshow(adversarial)
plt.axis('off')

plt.subplot(1, 3, 3)
plt.title('Difference')
sign = np.sign(adversarial - image)
# rescale the sign map from [-1, 1] to [0, 1] for display
plt.imshow((sign - sign.min()) / (sign.max() - sign.min()))
plt.axis('off')

plt.show()
```

Adversarial Training
Standard supervised training does not by itself encourage the learned function to be robust to adversarial examples. The authors argue that the FGSM-based adversarial objective function acts as an effective regularizer:
$\tilde{J}(\bm{\theta},\bm{x},y)=\alpha J(\bm{\theta},\bm{x},y)+(1-\alpha)\,J\!\left(\bm{\theta},\,\bm{x}+\epsilon\,\mathrm{sign}\!\left(\nabla_{\bm{x}}J(\bm{\theta},\bm{x},y)\right),\,y\right)$
The authors use $\alpha = 0.5$ as the default. Because the adversarial examples are regenerated from the current parameters at every step, this method keeps supplying fresh adversarial examples that oppose the current version of the model.
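A minimal PyTorch sketch of one training step under this objective (the model, optimizer, batch (x, y), and the [0, 1] clamp range are assumptions; $\alpha = 0.5$ follows the paper's default):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.007, alpha=0.5):
    # regenerate FGSM examples against the current parameters
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).clamp(0, 1).detach()

    # J~ = alpha * J(theta, x, y) + (1 - alpha) * J(theta, x_adv, y)
    optimizer.zero_grad()
    objective = (alpha * F.cross_entropy(model(x), y)
                 + (1 - alpha) * F.cross_entropy(model(x_adv), y))
    objective.backward()
    optimizer.step()
    return objective.item()
```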