Abstract
For object detection systems in on-board environment perception scenarios, an adversarial example generation method targeting object detectors is proposed. The method enables white-box adversarial attacks on object detectors, including object-invisible attacks and object-targeted attacks. Tests on the Rail dataset and the Cityscapes dataset verify the effectiveness of the proposed method in adversarially attacking the YOLO object detector.
With the continuous development of deep learning and convolutional neural networks (CNNs), using CNNs to solve tasks such as image classification…
Research has shown that adversarial examples pose a considerable threat to deep learning: by adding subtle perturbations imperceptible to the human eye to an input image, a deep neural network can be induced to output an arbitrary desired classification with high confidence; such inputs are called adversarial examples. Szegedy…
As the application scenarios of deep learning networks keep expanding, adversarial example generation methods targeting object detectors…
Adversarial example generation methods designed for classifier networks cannot effectively attack object detectors, and existing methods for attacking object detectors only address untargeted attacks, so their attack modes and effects are limited. Therefore, for the problem of generating adversarial examples against the YOLO object detector, two generation methods are proposed: the object-invisible attack and the object-targeted attack.
First, the network output information, including the bounding boxes of the objects and the corresponding class confidences, is obtained from the trained parameters of the object detection network. Then, a loss function is designed to generate the gradient information needed for the adversarial example. Finally, the adversarial example against the object detection network is obtained by linearizing the gradient information. The adversarial example generation process is shown in Fig. 1.

Fig. 1 Main framework of generating adversarial examples
The adversarial example generation algorithm is implemented as follows:
(1) Adversarial example initialization: the original image is taken as the initial adversarial example.
(2) The initialized adversarial example is fed into the YOLO object detector to obtain the object locations and confidences.
(3) The loss function is designed.
(4) The gradient of the adversarial example is computed from the constructed loss function.
(5) The adversarial example is updated by back-propagation.
(6) If the preset number of iterations has been reached, the adversarial example is output; otherwise, the procedure returns to step (2) for the next iteration. A minimal code sketch of this loop is given below.
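The sketch below illustrates this iteration loop, assuming a PyTorch image tensor with values in [0, 1] and two user-supplied hooks, detector (returning per-object class confidences) and loss_fn (the attack-specific loss defined in the following sections); the interface names, step size, and perturbation budget are illustrative assumptions rather than the authors' implementation.

import torch

def generate_adversarial(x_orig, detector, loss_fn, n_iter=10, eps=8 / 255):
    """Iteratively update an adversarial example by descending loss_fn.

    detector(x)   -> per-object class confidences (assumed differentiable hook)
    loss_fn(conf) -> scalar loss to be minimized (assumed hook)
    """
    x_adv = x_orig.clone().detach()                 # step (1): start from the original image
    for _ in range(n_iter):                         # step (6): fixed iteration budget
        x_adv.requires_grad_(True)
        loss = loss_fn(detector(x_adv))             # steps (2)-(3): detect, then score
        grad = torch.autograd.grad(loss, x_adv)[0]  # step (4): gradient via back-propagation
        # step (5): signed-gradient update, projected back into an L_inf ball of radius eps
        x_adv = x_adv.detach() - (eps / n_iter) * grad.sign()
        x_adv = torch.clamp(x_orig + torch.clamp(x_adv - x_orig, -eps, eps), 0.0, 1.0).detach()
    return x_adv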
The object-invisible attack manifests itself as follows: the YOLO object detector cannot detect objects that actually exist in the image, i.e., the objects are invisible to the YOLO detector. In the adversarial example generation method for the object-invisible attack, a perturbation that minimizes the confidence of the object classes must be found, as follows:

$\min_{\rho} J(\hat{y}), \quad \hat{y} = f(x + \rho;\, \theta)$ (1)

where $J$ is the loss function; $f$ is the object detector; $x$ is the original image; $\rho$ is the adversarial perturbation to be computed; $\theta$ denotes the model parameters; and $\hat{y}$ is the model prediction for the image, which in this study refers to the confidence of the object class.
The loss function is designed to lower the confidence of the true object classes and thereby generate the gradient information needed by the adversarial example for the invisible attack. The designed loss function is

$J_1 = \sum_{i=1}^{N} C_i$ (2)

where $N$ is the number of objects computed by the object detector and $C_i$ is the confidence of the class of the $i$-th object.
The adversarial perturbation under the $L_\infty$ norm constraint is

$\rho = -\epsilon \cdot \mathrm{sign}\big(\nabla_x J_1\big)$ (3)

where $\epsilon$ is the perturbation magnitude.
The adversarial perturbation under the $L_2$ norm constraint is

$\rho = -\epsilon \cdot \dfrac{\nabla_x J_1}{\|\nabla_x J_1\|_2}$ (4)
The adversarial example is then computed as

$x^{\mathrm{adv}} = x + \rho$ (5)
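A minimal sketch of Eqs. (2)-(5), assuming the detector's class confidences are available as a single tensor; the function names and tensor layout are illustrative assumptions.

import torch

def invisibility_loss(confidences):
    # Eq. (2): J1 = sum of the class confidences C_i of the N detected objects;
    # minimizing it pushes every detection below the detector's confidence threshold.
    return confidences.sum()

def perturbation(grad, eps, norm="linf"):
    # Eq. (3): L_inf-constrained step; Eq. (4): L_2-normalized step.
    if norm == "linf":
        return -eps * grad.sign()
    return -eps * grad / (grad.norm(p=2) + 1e-12)

def adversarial_example(x, rho):
    # Eq. (5): add the perturbation and keep pixel values in the valid [0, 1] range.
    return torch.clamp(x + rho, 0.0, 1.0)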
The object-targeted attack manifests itself as follows: an object whose original class is "car" is misidentified by the YOLO object detector as the class "bus". In the adversarial example generation method for the object-targeted attack, a perturbation that minimizes the confidence of the original object class must be found, as follows:

$\min_{\rho} J\big(\hat{y},\, \hat{y}'\big)$ (6)

where $\hat{y}'$ is the confidence of the directed target class.
In the adversarial example generation method for the object-targeted attack, the loss function is designed to lower the confidence of the attacked object class and raise the confidence of the directed target class, thereby generating the adversarial example gradient needed for the targeted attack. The designed loss function is

$J_2 = \sum_{i=1}^{N} \big(C_i - C'_i\big)$ (7)

where $C'_i$ is the confidence of the directed target class for the $i$-th object.
The adversarial perturbation under the $L_\infty$ norm constraint is

$\rho = -\epsilon \cdot \mathrm{sign}\big(\nabla_x J_2\big)$ (8)
The adversarial perturbation under the $L_2$ norm constraint is

$\rho = -\epsilon \cdot \dfrac{\nabla_x J_2}{\|\nabla_x J_2\|_2}$ (9)
The adversarial example is computed as

$x^{\mathrm{adv}} = x + \rho$ (10)
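A minimal sketch of the targeted-attack loss in Eq. (7); it reuses the perturbation and adversarial_example helpers sketched above, and the way the "car" and "bus" confidences are extracted from the detector output is an assumption.

def targeted_loss(conf_true, conf_target):
    # Eq. (7): when minimized, J2 lowers the confidence of the original class ("car")
    # and raises the confidence of the directed class ("bus").
    return conf_true.sum() - conf_target.sum()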
The pseudocode of the adversarial example generation algorithm is shown in Fig. 2.

Fig. 2 Pseudo code for adversarial example generation algorithm
To verify effectiveness, the Rail dataset and the Cityscapes dataset were collected…
The original images and the adversarial examples under the object-invisible attack are shown in Fig. 3.

Fig. 3 Original image and adversarial example under object invisible attacks
The mean average precision (mAP) is commonly used as the evaluation metric for object detection datasets. To further assess the adversarial effect, the mAP metric is used to verify the results on the two datasets. In the experiments, the intersection-over-union (IoU) threshold is set to 0.5 and the object confidence threshold is set to 0.5. In addition, the adversarial example algorithm CI-FGS, originally designed to attack classifiers, was adapted for comparison.

Fig. 4 Mean average precision of adversarial example under object invisible attacks
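As a small illustration of this evaluation setting, the sketch below computes the IoU of two boxes and applies the thresholds stated above (IoU ≥ 0.5, confidence ≥ 0.5) to decide whether a detection counts as a true positive; the (x1, y1, x2, y2) box format is an assumption.

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def is_true_positive(pred_box, pred_conf, gt_box, iou_thr=0.5, conf_thr=0.5):
    # Thresholds match the experimental setting: IoU >= 0.5 and confidence >= 0.5.
    return pred_conf >= conf_thr and iou(pred_box, gt_box) >= iou_thr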
To comprehensively evaluate the attack effect of the proposed method, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) metrics are used to compare the similarity between the original images and the adversarial examples, as shown in Figs. 5 and 6.

Fig. 5 Peak signal-to-noise ratio of original image and adversarial example

Fig. 6 Structural similarity of original image and adversarial example
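Both indicators are standard image-similarity measures; a minimal sketch, assuming a recent scikit-image and two uint8 RGB images of equal shape, could look as follows.

from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def image_similarity(original, adversarial):
    """Return (PSNR in dB, SSIM) between an original image and its adversarial example."""
    psnr = peak_signal_noise_ratio(original, adversarial, data_range=255)
    ssim = structural_similarity(original, adversarial, channel_axis=-1, data_range=255)
    return psnr, ssim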
As can be seen from Figs. 5 and 6, …
The original images and the adversarial example images under the object-targeted attack are shown in Fig. 7.

Fig. 7 Original image and adversarial example under object targeted mis-detectable attacks
In the object-targeted attack of the proposed method, objects originally of class "car" are recognized as "bus". Therefore, the recall rate (RR) of class "car" and the precision rate (PR) of class "bus" are used to verify the attack effect of the adversarial examples, as shown in Figs. 8 and 9.

Fig. 8 Recall rate of adversarial example under object targeted mis-detectable attacks

Fig. 9 Precision rate of adversarial example under object targeted mis-detectable attacks
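For reference, a minimal sketch of how these two indicators could be tallied from matched detections; the counting variables are hypothetical and not taken from the paper.

def car_recall_bus_precision(n_gt_car, n_car_detected_as_car, n_pred_bus, n_pred_bus_correct):
    # Recall of "car": share of ground-truth cars still detected as "car";
    # a successful targeted attack drives this toward zero.
    recall_car = n_car_detected_as_car / max(n_gt_car, 1)
    # Precision of "bus": share of "bus" predictions that match a real bus;
    # it drops when perturbed cars are reported as "bus".
    precision_bus = n_pred_bus_correct / max(n_pred_bus, 1)
    return recall_car, precision_bus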
From Figs. 8 and 9, …

Fig. 10 Peak signal-to-noise ratio of original image and adversarial example

Fig. 11 Structural similarity of original image and adversarial example
For the YOLO object detector, an adversarial example generation method with more comprehensive attack effects is proposed. The network structure of the object detector is obtained, the loss functions for the adversarial examples are designed, and the adversarial examples are then generated by the proposed method. Verification on the Rail dataset and the Cityscapes dataset shows that the method achieves a high attack success rate against the YOLOv3 object detector and can realize both object-invisible attacks and object-targeted attacks.
Author contribution statement
HUANG Shize: proposed the adversarial example generation research scheme and revised the final version.
ZHANG Zhaoxin: implemented the program design.
DONG Decun: proposed the research idea based on the reliability of on-board environment perception.
QIN Jinzhe: performed algorithm verification and comparison.
References
HUANG Shize, ZHAI Yachan, ZHANG Miaomiao, et al. Arc detection and recognition in pantograph-catenary system based on convolutional neural network[J]. Information Sciences, 2019, 501: 363.
HUANG Shize, YANG Lingyu, TAO Ting, et al. A method of tram obstacle intrusion detection and track recognition based on instance segmentation[J]. Shanghai Highways, 2021(2): 89.
HUANG Shize, CHEN Wei, ZHANG Fan, et al. Method of turnout fault diagnosis based on Fréchet distance[J]. Journal of Tongji University (Natural Science), 2018, 46(12): 1690.
TAO Ting, DONG Decun, HUANG Shize, et al. Gap detection of switch machines in complex environment based on object detection and image processing[J]. Journal of Transportation Engineering, Part A: Systems, 2020, 146(8): 04020083.
HUANG Shize, YANG Lingyu, ZHANG Fan, et al. Turnout fault diagnosis based on CNNs with self-generated samples[J]. Journal of Transportation Engineering, Part A: Systems, 2020, 146(9): 1.
SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks[J/OL]. [2021-12-21]. https://arxiv.org/abs/1312.6199.
MOOSAVI-DEZFOOLI S M, FAWZI A, FROSSARD P. DeepFool: a simple and accurate method to fool deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2574-2582.
GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[J/OL]. [2021-12-20]. https://arxiv.org/abs/1412.6572.
PAPERNOT N, MCDANIEL P, JHA S, et al. The limitations of deep learning in adversarial settings[C]//Proceedings of 2016 IEEE European Symposium on Security and Privacy (EuroS&P). Los Alamitos: IEEE Computer Society, 2016: 372-387.
CARLINI N, WAGNER D. Towards evaluating the robustness of neural networks[C]//Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP). Los Alamitos: IEEE Computer Society, 2017: 39-57.
KURAKIN A, GOODFELLOW I J, BENGIO S. Adversarial examples in the physical world[J/OL]. [2021-07-08]. https://arxiv.org/abs/1607.02533.
ZHANG Hua, GAO Haoran, YANG Xingguo, et al. TargetedFool: an algorithm for achieving targeted attacks[J]. Journal of Xidian University, 2021, 48(1): 149.
CHEN Jinyin, CHEN Zhiqing, ZHENG Haibin, et al. Black-box physical attack against road sign recognition model via PSO[J]. Journal of Software, 2020, 31(9): 2785.
CHEN Jinyin, SHEN Shijing, SU Mengmeng, et al. Black-box adversarial attack on license plate recognition system[J]. Acta Automatica Sinica, 2021, 47(1): 121.
MA Yukun, WU Lifang, JIAN Meng, et al. Approach to generate adversarial examples for face-spoofing detection[J]. Journal of Software, 2019, 30(2): 469.
ZHANG Hantao. Adversarial attack on image object detection[D]. Hefei: University of Science and Technology of China, 2020.
LIU Jiayang. Research on defense against adversarial examples for image classification[D]. Hefei: University of Science and Technology of China, 2020.
XIE Cihang, WANG Jianyu, ZHANG Zhishuai, et al. Adversarial examples for semantic segmentation and object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Los Alamitos: IEEE Computer Society, 2017: 1378-1387.
WEI Xingxing, LIANG Siyuan, CHEN Ning, et al. Transferable adversarial attacks for image and video object detection[J/OL]. [2021-11-30]. https://arxiv.org/abs/1811.12641.
WANG Yutong, WANG Kufeng, ZHU Zhanxing, et al. Adversarial attacks on faster R-CNN object detector[J]. Neurocomputing, 2020, 382: 87.
HUANG Shize, LIU Xiaowen, YANG Xiaolu, et al. Two improved methods of generating adversarial examples against faster R-CNNs for tram environment perception systems[J]. Complexity, 2020, 2020: 6814263.
XIAO Yatie, PUN Chi-Man, LIU Bo. Fooling deep neural detection networks with adaptive object-oriented adversarial perturbation[J]. Pattern Recognition, 2021, 115: 107903.
WANG Yajie, TAN Yu-an, ZHANG Wenjiao, et al. An adversarial attack on DNN-based black-box object detectors[J]. Journal of Network and Computer Applications, 2020, 161: 102634.
HUANG Shize, LIU Xiaowen, YANG Xiaolu, et al. An improved ShapeShifter method of generating adversarial examples for physical attacks on stop signs against faster R-CNNs[J]. Computers & Security, 2021, 104: 102120.
CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes dataset for semantic urban scene understanding[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Society, 2016: 3213-3223.
XIAO Yatie, PUN Chi-Man. Improving adversarial attacks on deep neural networks via constricted gradient-based perturbations[J]. Information Sciences, 2021, 571: 104.
XIAO Yatie, PUN Chi-Man, LIU Bo. Adversarial example generation with adaptive gradient search for single and ensemble deep neural network[J]. Information Sciences, 2020, 528: 147.