SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing
Haonan Qiu*1, Chaowei Xiao*2, Lei Yang*3, Xinchen Yan2,5†, Honglak Lee2, and Bo Li4

1 The Chinese University of Hong Kong, Shenzhen
2 University of Michigan, Ann Arbor
3 The Chinese University of Hong Kong
4 University of Illinois Urbana-Champaign
5 Uber ATG, San Francisco

Abstract. Deep neural networks (DNNs) have achieved great success in various vision applications due to their strong expressive power. However, recent studies have shown that DNNs are vulnerable to adversarial examples, i.e., manipulated instances crafted to mislead DNNs into making incorrect predictions. Currently, most such adversarial examples try to guarantee "subtle perturbation" by limiting the L_p norm of the perturbation. In this paper, we propose SemanticAdv, which generates a new type of semantically realistic adversarial examples via attribute-conditioned image editing. Compared to existing methods, SemanticAdv enables fine-grained analysis and evaluation of DNNs with input variations in the attribute space. We conduct comprehensive experiments to show that our adversarial examples not only exhibit semantically meaningful appearances but also achieve high targeted attack success rates under both whitebox and blackbox settings. Moreover, we show that existing pixel-based and attribute-based defense methods fail to defend against SemanticAdv. We demonstrate the applicability of SemanticAdv on both face recognition and general street-view images to show its generalization. Such non-L_p-bounded adversarial examples with controlled attribute manipulation can shed light on further understanding of the vulnerabilities of DNNs as well as on novel defense approaches.
1 Introduction

Deep neural networks (DNNs) have demonstrated great success in advancing the state-of-the-art performance in various vision tasks [36,61,64,23,59,41,77,11] and have been widely used in many safety-critical applications such as face verification and autonomous driving [79]. At the same time, several studies [65,21,45,51,10,71,72,70] have revealed the vulnerability of DNNs to input variations. For example, carefully crafted L_p-bounded perturbations added to pristine input images can introduce arbitrary prediction errors at testing time. While visually imperceptible, L_p-bounded adversarial attacks have certain limitations, as they only capture variations in the raw pixel space and cannot guarantee semantic realism for the generated instances.

* Alphabetical ordering; the first three authors contributed equally.
† Work partially done as a PhD student at University of Michigan.

Fig. 1: Pipeline of SemanticAdv. Left: each row shows a pair of images that differ in only one semantic aspect (e.g., black hair vs. blonde hair, mouth closed vs. mouth slightly open, clear road vs. moving car). One of the two is sampled from the ground-truth dataset, while the other is created by our conditional image generator and is adversarial to the recognition model (e.g., a face identification network or a semantic segmentation network). Right: overview of the proposed attribute-conditioned SemanticAdv against the face identity verification model.
Recent works [72,30,69] have shown the limitations of only measuring and evaluating L_p-bounded perturbations (e.g., they cannot handle variations in lighting conditions). Therefore, the failure modes of deep neural networks beyond raw pixel variations, including semantic perturbations, require further understanding and exploration. In this work, we focus on studying how DNNs respond to semantically meaningful perturbations in the visual attribute space. In the visual recognition literature, visual attributes [19,37,52] are human-designated properties observable in images (e.g., black hair and blonde hair). As illustrated in Figure 1 (left), given an input image with known attributes, we would like to craft semantically meaningful (attribute-conditioned) adversarial examples via image editing along a single attribute or a subset of attributes while keeping the rest unchanged. Compared to traditional L_p-bounded adversarial perturbations or semantic perturbations on global color and texture [5], such attribute-based image editing enables users to conduct a fine-grained analysis and evaluation of DNN models by removing one or several visual aspects or adding an object into the scene. We believe our attribute-conditioned image editing is a natural way of introducing semantic perturbations, and it preserves clear interpretability: e.g., wearing a new pair of glasses or having the hair dyed a different color. To facilitate the generation of semantic adversarial perturbations along a single attribute dimension, we take advantage of the disentangled representation in deep image generative models [55,31,6,75,12,3,76,28]. Such a disentangled representation allows us to explore the variations of a specific semantic factor while keeping the other factors unchanged.
As illustrated in Figure 1 (right), we first leverage an attribute-conditioned image editing model [12] to construct a new instance that is very similar to the source image (given as input) except in one semantic aspect. Given such a pair of images, we synthesize the adversarial example by interpolating between them in the feature-map space. As the interpolation is constrained by the image pair, the appearance of the resulting semantic adversarial example resembles both of them. To validate the effectiveness of the proposed SemanticAdv via attribute-conditioned image editing, we consider two real-world tasks: face verification and landmark detection. We conduct both qualitative and quantitative evaluations on the CelebA dataset [40]. The results show that SemanticAdv not only achieves a high targeted attack success rate but also preserves the semantic meaning of the corresponding input images. To further demonstrate the applicability of SemanticAdv beyond the face domain, we extend the framework to generate adversarial street-view images. We treat semantic layouts as input attributes and use a layout-conditioned image editing model [24] pre-trained on the Cityscapes dataset [14]. Our results show that a well-trained semantic segmentation model can be successfully attacked into neglecting a pedestrian if we insert another object nearby using our image editing model. In addition, we show that an existing adversarial-training-based defense method is less effective against our attack, which motivates further defense strategies against such semantic adversarial examples. Our contributions are summarized as follows: (1) We propose a novel method, SemanticAdv, to generate semantically meaningful adversarial examples via attribute-conditioned image editing based on feature-space interpolation.
Compared to existing adversarial attacks, our method enables fine-grained attribute analysis as well as further evaluation of vulnerabilities of DNN models. Such semantic adversarial examples also provide explainable analysis of different attributes in terms of their robustness and editing flexibility. (2) We conduct extensive experiments and show that the proposed feature-space interpolation strategy can generate high-quality attribute-conditioned adversarial examples more effectively than simple attribute-space interpolation. Additionally, SemanticAdv exhibits high attack transferability as well as a 67.7% query-free black-box attack success rate on a real-world face verification platform. (3) We empirically show that, in contrast to L_p attacks, existing per-pixel-based as well as attribute-based defense methods fail to defend against SemanticAdv, which indicates that such semantic adversarial examples identify a certain unexplored vulnerable landscape of DNNs. (4) To demonstrate the applicability and generalization of SemanticAdv beyond the face recognition domain, we extend the framework to generate adversarial street-view images that fool semantic segmentation models effectively.

2 Related Work

Semantic image editing. Semantic image synthesis and manipulation is a popular research topic in machine learning, graphics, and vision. Thanks to recent advances in deep generative models [34,20,50] and the empirical analysis of deep classification networks [36,61,64], the past few years have witnessed tremendous breakthroughs towards high-fidelity pure image generation [55,31,6], attribute-to-image generation [75,12], text-to-image generation [44,56,49,48,78,28], and image-to-image translation [26,81,39,68,24].

Adversarial examples.
Generating L_p-bounded adversarial perturbations has been extensively studied [65,21,45,51,10,71]. To further explore diverse adversarial attacks and potentially help inspire defense mechanisms, it is important to generate so-called "unrestricted" adversarial examples, which contain perturbations of unrestricted magnitude while still preserving perceptual realism [7]. Recently, [72,18] proposed to spatially transform image patches instead of adding pixel-wise perturbations, though such spatial transformations do not consider semantic information. Our proposed SemanticAdv focuses on generating unrestricted perturbations with semantically meaningful patterns guided by visual attributes. Relevant to our work, [62] proposed to synthesize adversarial examples with an unconditional generative model, and [5] studied semantic transformations in only the color or texture space. Compared to these works, SemanticAdv is able to generate adversarial examples in a controllable fashion using specific visual attributes by performing manipulation in the feature space. We further analyze the robustness of the recognition system by generating adversarial examples guided by different visual attributes. Concurrent to our work, [29] proposed to generate semantic-based attacks against a restricted binary classifier, whereas our attack is able to mislead the model towards arbitrary adversarial targets. They conduct the manipulation within the attribute space, which is less flexible and effective than our proposed feature-space interpolation.

3 SemanticAdv

3.1 Problem Definition

Let M be a machine learning model trained on a dataset D = {(x, y)} consisting of image-label pairs, where x ∈ R^{H×W×D_I} and y ∈ R^{D_L} denote the image and the ground-truth label, respectively.
Here, H, W, D_I, and D_L denote the image height, image width, number of image channels, and label dimension, respectively. For each image x, the model M makes a prediction ŷ = M(x) ∈ R^{D_L}. Given a target image-label pair (x_tgt, y_tgt) with y ≠ y_tgt, a traditional attacker aims to synthesize an adversarial example x_adv by adding pixel-wise perturbations to, or spatially transforming, the original image x such that M(x_adv) = y_tgt. In this work, we consider a semantic attacker that generates semantically meaningful perturbations via attribute-conditioned image editing with a conditional generative model G. Compared to the traditional attacker, the proposed method generates adversarial examples in a more controllable fashion by editing a single semantic aspect through attribute-conditioned image editing.

3.2 Attribute-conditioned Image Editing

In order to produce semantically meaningful perturbations, we first introduce how to synthesize attribute-conditioned images through interpolation.

Semantic image editing. For simplicity, we start with the formulation in which the input attribute is represented as a compact vector. This formulation can be directly extended to other input attribute formats, including semantic layouts. Let c ∈ R^{D_C} be an attribute representation reflecting the semantic factors (e.g., expression or hair color of a portrait image) of image x, where D_C indicates the attribute dimension and c_i ∈ {0, 1} indicates the existence of the i-th attribute. We are interested in performing semantic image editing using the attribute-conditioned image generator G. For example, given a portrait image of a girl with black hair and the new attribute blonde hair, our generator is supposed to synthesize a new image that turns the girl's hair color from black to blonde while keeping the rest of her appearance unchanged.
The synthesized image is denoted as x_new = G(x, c_new), where c_new ∈ R^{D_C} is the new attribute. In the special case where there is no attribute change (c = c_new), the generator simply reconstructs the input: x′ = G(x, c) (ideally, we hope x′ equals x). As our attribute representation is disentangled and the change of attribute value is sufficiently small (e.g., we only edit a single semantic attribute), the synthesized image x_new is expected to be close to the data manifold [4,57,55]. In addition, we can generate many similar images by linearly interpolating between the image pair x and x_new in the attribute space or the feature space of the image-conditioned generator G, which is supported by previous work [75,55,3].

Attribute-space interpolation. Given a pair of attributes c and c_new, we introduce an interpolation parameter α ∈ [0, 1] to generate an augmented attribute vector c* ∈ R^{D_C} (see Eq. 1). Given the augmented attribute c* and the original image x, we produce the image x* via the generator G through attribute-space interpolation:

x* = G(x, c*),   c* = α · c + (1 − α) · c_new,   α ∈ [0, 1].   (1)

Feature-map interpolation. Alternatively, we propose to interpolate using the feature map produced by the generator G = G_dec ∘ G_enc. Here, G_enc is the encoder module that takes the image as input and outputs the feature map, and G_dec is the decoder module that takes the feature map as input and outputs the synthesized image. Let f = G_enc(x, c) ∈ R^{H_F × W_F × C_F} be the feature map of an intermediate layer in the generator, where H_F, W_F, and C_F indicate the height, width, and number of channels of the feature map.
x* = G_dec(f*),   f* = β ⊙ G_enc(x, c) + (1 − β) ⊙ G_enc(x, c_new).   (2)

Compared to attribute-space interpolation, which is parameterized by a scalar α, we parameterize feature-map interpolation by a tensor β ∈ R^{H_F × W_F × C_F} (with β_{h,w,k} ∈ [0, 1] for 1 ≤ h ≤ H_F, 1 ≤ w ≤ W_F, and 1 ≤ k ≤ C_F) with the same shape as the feature map. Compared to linear interpolation over the attribute space, this design introduces more flexibility for adversarial attacks. Empirical results in Section 4.2 show that this design is critical for maintaining both attack success and good perceptual quality at the same time.

3.3 Generating Semantically Meaningful Adversarial Examples

Existing work obtains the adversarial image x_adv by adding perturbations to or transforming the input image x directly. In contrast, our semantic attack method requires an additional attribute-conditioned image generator G during adversarial image generation through interpolation. As shown in Eq. 3, the first term of our objective function is the adversarial metric, the second term is a smoothness constraint that guarantees perceptual quality, and λ controls the balance between the two terms. The adversarial metric is minimized once the model M has been successfully attacked towards the target image-label pair (x_tgt, y_tgt). For identity verification, y_tgt is the identity representation of the target image; for the structured prediction tasks in our paper, y_tgt represents either coordinates (landmark detection) or semantic label maps (semantic segmentation).

x_adv = argmin_{x*} L(x*),   L(x*) = L_adv(x*; M, y_tgt) + λ · L_smooth(β),   (3)

where β is the interpolation tensor that parameterizes x*.

Identity verification. In the identity verification task, two images are considered to be the same identity if the corresponding identity embeddings from the verification model M are reasonably close.
L_adv(x*; M, y_tgt) = max{κ, Φ^id_M(x*, x_tgt)}.   (4)

As shown in Eq. 4, Φ^id_M(·,·) measures the distance between two identity embeddings from the model M; the normalized L2 distance is used in our setting. In addition, we introduce the parameter κ, a constant related to the false positive rate (FPR) threshold computed on the development set.

Structured prediction. For structured prediction tasks such as landmark detection and semantic segmentation, we use the Houdini objective proposed in [13] as our adversarial metric and select the target landmarks (or target semantic segmentation) as y_tgt. As shown in Eq. 5, Φ_M(·,·) is a scoring function for each image-label pair and γ is the threshold. In addition, l(y*, y_tgt) is a task loss decided by the specific adversarial target, where y* = M(x*).

L_adv(x*; M, y_tgt) = P_{γ ∼ N(0,1)}[Φ_M(x*, y*) − Φ_M(x*, y_tgt) < γ] · l(y*, y_tgt).   (5)

Interpolation smoothness L_smooth. As the tensor to be interpolated in the feature-map space has far more parameters than the attribute vector itself, we propose to enforce a smoothness constraint on the tensor β used in feature-map interpolation. As shown in Eq. 6, the smoothness loss encourages the interpolation tensor to consist of spatially piecewise-constant patches, a constraint that has been widely used as a pixel-wise de-noising objective in natural image processing [43,27].

L_smooth(β) = Σ_{h=1}^{H_F−1} Σ_{w=1}^{W_F} ||β_{h+1,w} − β_{h,w}||_2^2 + Σ_{h=1}^{H_F} Σ_{w=1}^{W_F−1} ||β_{h,w+1} − β_{h,w}||_2^2.   (6)

4 Experiments

In the experimental section, we mainly focus on analyzing the proposed SemanticAdv in attacking state-of-the-art face recognition systems [63,59,80,67] due to their wide applicability (e.g., identification for mobile payment) in the real world.
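Concretely, Eqs. 1, 2, and 6 above reduce to a few lines of array code. The following is a minimal NumPy sketch (function names and toy shapes are our own; in the paper, the feature maps come from the conditional generator's encoder):

```python
import numpy as np

def attr_interp(c, c_new, alpha):
    """Attribute-space interpolation (Eq. 1): a single scalar alpha
    in [0, 1] blends the original and edited attribute vectors."""
    assert 0.0 <= alpha <= 1.0
    return alpha * np.asarray(c, float) + (1.0 - alpha) * np.asarray(c_new, float)

def feat_interp(f, f_new, beta):
    """Feature-map interpolation (Eq. 2): an element-wise tensor beta
    (same shape as the feature map, entries in [0, 1]) blends the two
    encoder outputs G_enc(x, c) and G_enc(x, c_new)."""
    assert f.shape == f_new.shape == beta.shape
    assert np.all((beta >= 0.0) & (beta <= 1.0))
    return beta * f + (1.0 - beta) * f_new

def smoothness_loss(beta):
    """Interpolation smoothness (Eq. 6): sum of squared differences
    between spatially adjacent entries of beta, encouraging
    piecewise-constant interpolation patches."""
    dh = beta[1:, :, :] - beta[:-1, :, :]   # neighbours along height
    dw = beta[:, 1:, :] - beta[:, :-1, :]   # neighbours along width
    return float((dh ** 2).sum() + (dw ** 2).sum())

# Toy example: 2x2 feature maps with one channel standing in for
# encoder outputs.
f, f_new = np.zeros((2, 2, 1)), np.ones((2, 2, 1))
beta = np.full((2, 2, 1), 0.25)
f_star = feat_interp(f, f_new, beta)   # every entry equals 0.75
flat_penalty = smoothness_loss(beta)   # constant beta -> 0.0
```

The flexibility gap between the two parameterizations is visible here: alpha is one degree of freedom, while beta has H_F × W_F × C_F of them, which is what the adversarial optimization in Eq. 3 exploits.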
We attack both face verification and face landmark detection by generating attribute-conditioned adversarial examples using annotations from the CelebA dataset [40]. In addition, we extend our attack to urban street scenes with semantic label maps as the condition. We attack the semantic segmentation model DRN-D-22 [77], previously trained on Cityscapes [14], by generating adversarial examples with dynamic objects manipulated (e.g., a car inserted into the scene). The experimental section is organized as follows. First, we analyze the quality of the generated adversarial examples and qualitatively compare our method with L_p-bounded pixel-wise optimization-based methods [10,16,73]. Second, we provide both qualitative and quantitative results obtained by controlling a single semantic attribute. In terms of attack transferability, we evaluate the proposed SemanticAdv in various settings and further demonstrate the effectiveness of our method via query-free black-box attacks against online face verification platforms. Third, we compare our method with the baseline methods against different defense methods on the face verification task. Fourth, we demonstrate that SemanticAdv is a general framework by showing results on other tasks, including face landmark detection and street-view semantic segmentation.

4.1 Experimental Setup

Face identity verification. We select ResNet-50 and ResNet-101 [23] trained on MS-Celeb-1M [22,15] as our face verification models. The models are trained using two different objectives, namely, softmax loss [63,80] and cosine loss [67]. For simplicity, we use the notation "R-N-S" to indicate the model with an N-layer residual-block backbone trained using softmax loss, while "R-N-C" indicates the same backbone trained using cosine loss. The distance between face features is measured by normalized L2 distance.
For the R-101-S model, we set the parameter κ based on the false positive rate (FPR) for the identity verification task. Four different FPRs have been used: 10^-3 (with κ = 1.24), 3×10^-4 (with κ = 1.05), 10^-4 (with κ = 0.60), and <10^-4 (with κ = 0.30). These distance metrics and thresholds are commonly used when evaluating the performance of face recognition models [35,32]. The supplementary material provides more details on the performance of the face recognition models and their corresponding κ. To distinguish between the FPR used in generating adversarial examples and the FPR used in evaluation, we introduce the two notations "Generation FPR (G-FPR)" and "Test FPR (T-FPR)". For the experiment with query-free black-box API attacks, we use two online face verification services provided by Face++ [2] and AliYun [1].

Semantic attacks on face images. In our experiments, we randomly sample 1,280 distinct identities from CelebA [40] and use StarGAN [12] for attribute-conditioned image editing. In particular, we re-train the model on CelebA by aligning the face landmarks and then resizing the images to resolution 112×112. We select 17 identity-preserving attributes for our analysis, as such attributes mainly reflect variations in facial expression and hair color. In feature-map interpolation, to reduce the reconstruction error introduced by the generator (e.g., x ≠ G(x, c)) in practice, we take one more step to obtain the updated feature map f′ = G_enc(x′, c), where x′ = argmin_{x′} ||G(x′, c) − x||. For each distinct identity pair (x, x_tgt), we perform SemanticAdv guided by each of the 17 attributes (i.e., we intentionally add or remove one specific attribute while keeping the rest unchanged). In total, for each image x, we generate 17 adversarial images with different augmented attributes.
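To make the role of κ concrete, the following is a simplified sketch (ours, not the authors' code) of the verification decision: embeddings are unit-normalized, compared by L2 distance, and a pair is accepted as the same identity when the distance falls below the FPR-calibrated threshold:

```python
import numpy as np

def normalized_l2(e1, e2):
    """Normalized L2 distance between two identity embeddings;
    after unit-normalization the distance lies in [0, 2]."""
    e1, e2 = np.asarray(e1, float), np.asarray(e2, float)
    e1, e2 = e1 / np.linalg.norm(e1), e2 / np.linalg.norm(e2)
    return float(np.linalg.norm(e1 - e2))

def same_identity(e1, e2, kappa=1.24):
    """Verification decision with the FPR threshold kappa; the default
    kappa = 1.24 is the value quoted above for R-101-S at FPR = 10^-3."""
    return normalized_l2(e1, e2) < kappa

# Collinear embeddings have distance 0; orthogonal ones have sqrt(2),
# which is rejected at kappa = 1.24 and, a fortiori, at kappa = 0.30.
```

Smaller κ values (stricter FPRs) shrink the acceptance region, which is why attacks generated under G-FPR < 10^-4 are the hardest setting in the experiments below.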
In the experiments, we select a commonly used pixel-wise adversarial attack method [10] (referred to as CW) as our baseline. Compared to our proposed method, CW does not require visual attributes as part of the system, as it only generates one adversarial example for each instance. We refer to the corresponding attack success rate as the instance-wise success rate, in which the attack success rate is calculated per instance. For each instance with 17 adversarial images using different augmented attributes, if at least one of the 17 produced images attacks successfully, we count the attack on this instance as a success, and vice versa.

Face landmark detection. We select the Face Alignment Network (FAN) [9], trained on 300W-LP [82] and fine-tuned on 300-W [58], for 2D landmark detection. The network is constructed by stacking Hour-Glass networks [47] with hierarchical blocks [8]. Given a face image as input, FAN outputs 2D heatmaps that can subsequently be leveraged to yield 68 2D landmarks.

Semantic attacks on street-view images. We select DRN-D-22 [77] as our semantic segmentation model and fine-tune the model on image regions with resolution 256×256. To synthesize semantic adversarial perturbations, we consider semantic label maps as the input attribute and leverage a generative image manipulation model [24] pre-trained on the CityScapes [14] dataset. Given an input semantic label map at resolution 256×256, we select a target object instance (e.g., a pedestrian) to attack. Then, we create a manipulated semantic label map by inserting another object instance (e.g., a car) in the vicinity of the target object. As in the experiments in the face domain, for both semantic label maps, we use the image manipulation encoder to extract features (with 1,024 channels at spatial resolution 16×16) and conduct feature-space interpolation.
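The instance-wise success rate defined earlier (an instance counts as attacked if any of its 17 attribute-guided adversarial images succeeds) can be sketched as follows; the input format is a hypothetical one of ours:

```python
def instance_wise_success_rate(per_instance_flags):
    """per_instance_flags: one boolean list per instance, with one flag
    per augmented attribute (17 in the paper). An instance counts as
    successfully attacked if any of its flags is True."""
    assert per_instance_flags
    hits = sum(1 for flags in per_instance_flags if any(flags))
    return hits / len(per_instance_flags)

# Two instances: the first fails under all 17 attributes, the second
# succeeds under exactly one, giving an instance-wise rate of 0.5.
flags = [[False] * 17, [True] + [False] * 16]
rate = instance_wise_success_rate(flags)  # 0.5
```

This is the counting convention used when comparing SemanticAdv's 17-attribute attacks against the single adversarial example produced by CW.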
We synthesize the final image by feeding the interpolated features to the image manipulation decoder. By searching for the interpolation coefficient that maximizes the attack rate, we are able to fool the segmentation model with the synthesized final image.

4.2 SemanticAdv on Face Identity Verification

Attribute-space vs. feature-space interpolation. First, we qualitatively compare the two interpolation methods and find that both attribute-space and feature-space interpolation can generate reasonable-looking samples through interpolation (see Figure 2; these are not adversarial examples). However, the two interpolation methods perform differently when we optimize using the adversarial objective (Eq. 3). We measure the attack success rate of attribute-space interpolation (with G-FPR = T-FPR = 10^-3): 0.08% on R-101-S, 0.31% on R-101-C, and 0.16% on both R-50-S and R-50-C, which consistently fails to attack the face verification models. Compared to attribute-space interpolation, generating adversarial examples with feature-space interpolation produces much better quantitative results (see Table 1). We conjecture that this is because the high-dimensional feature space provides more manipulation freedom. This also suggests one potential reason for the poor samples (e.g., blurry, with many noticeable artifacts) generated by the method proposed in [29]. We select f_0, the last convolutional layer before the up-sampling layers in the generator, for feature-space interpolation due to its good performance.

Fig. 2: Qualitative comparisons between attribute-space and feature-space interpolation. In our visualization, we set the interpolation parameter to 0.0, 0.2, 0.4, 0.6, 0.8, and 1.0.

Table 1: Attack success rate (%) on R-101-S when interpolating over the attribute or over different layers' feature maps, using G-FPR = T-FPR = 10^-3. Here, f_i indicates the feature map after the i-th up-sampling operation; f_-2 and f_-1 are the first and second feature maps after the last down-sampling operation, respectively.

Interpolation          f_-2    f_-1    f_0     f_1     f_2     Attribute
x_adv, G-FPR = 10^-3   99.38  100.00  100.00  100.00   99.69   0.08
x_adv, G-FPR = 10^-4   59.53   98.44   99.45   97.58   73.52   0.00

Qualitative analysis. Figure 3 (top) shows the adversarial images and corresponding perturbations generated against R-101-S by SemanticAdv and CW, respectively. The text below each image is the name of an augmented attribute; the sign before the name indicates "adding" (in red) or "removing" (in blue) the corresponding attribute from the original image. Figure 3 (bottom) shows the adversarial examples with the 17 augmented semantic attributes, with the attribute names shown at the bottom. The first row contains images generated by G(x, c_new) with an augmented attribute c_new, and the second row contains the corresponding adversarial images under feature-space interpolation.
Fig. 3: Top: Qualitative comparisons between our proposed SemanticAdv and pixel-wise adversarial examples generated by CW [10]. Along with the adversarial examples, we also show the corresponding perturbations (residuals) on the right. Perturbations generated by SemanticAdv (G-FPR = 10^-3) are unrestricted, with semantically meaningful patterns. Bottom: Qualitative analysis of single-attribute adversarial attacks (G-FPR = 10^-3).
More results are shown in the supplementary material. The figure shows that SemanticAdv can generate examples with a reasonable-looking appearance guided by the corresponding attribute. In particular, SemanticAdv is able to generate perturbations in the regions correlated with the augmented attribute, while the perturbations of CW have no specific pattern and are evenly distributed across the image. To further measure the perceptual quality of the adversarial images generated by SemanticAdv in the strictest setting (G-FPR < 10^-4), we conduct a user study using Amazon Mechanical Turk (AMT). In total, we collected 2,620 annotations from 77 participants. In 39.14 ± 1.96% of trials (close to the random-guess rate of 50%), the adversarial images generated by SemanticAdv were selected as reasonable-looking, while 30.27 ± 1.96% of the CW trials were selected as reasonable-looking. This indicates that SemanticAdv generates more perceptually plausible adversarial examples than CW under the strictest setting (G-FPR < 10^-4). The corresponding images are shown in the supplementary material.

Fig. 4: Quantitative analysis of the attack success rate for different single-attribute attacks on Res-101 (softmax loss) and Res-101 (cos loss).
In each figure, we show the results corresponding to the larger FPR (G-FPR = T-FPR = 10^-3) in sky blue and the results corresponding to the smaller FPR (G-FPR = T-FPR = 10^-4) in blue, respectively.

Single-attribute analysis. One key advantage of our SemanticAdv is that we can generate adversarial perturbations in a more controllable fashion, guided by the selected semantic attribute. This allows analyzing the robustness of a recognition system against different types of semantic attacks. We group the adversarial examples by augmented attribute in various settings. In Figure 4, we present the attack success rate against two face verification models, R-101-S and R-101-C, using different attributes. We highlight the bars in light blue for G-FPR = 10^-3 and in blue for G-FPR = 10^-4, respectively. As shown in Figure 4, with the larger T-FPR = 10^-3, our SemanticAdv achieves an almost 100% attack success rate across different attributes. With the smaller T-FPR = 10^-4, we observe that SemanticAdv guided by some attributes, such as Mouth Slightly Open and Arched Eyebrows, achieves less than a 50% attack success rate, while other attributes, such as Pale Skin and Eyeglasses, are relatively less affected. In summary, these experiments indicate that SemanticAdv guided by attributes describing local shape (e.g., mouth, earrings) achieves a relatively lower attack success rate than SemanticAdv guided by attributes related to color (e.g., hair color) or the entire face region (e.g., skin). This suggests that the face verification models used in our experiments are more robustly trained with respect to local shapes than to colors. In practice, we have the flexibility to select the attacking attribute for an image based on perceptual quality and attack success rate.

Transferability analysis.
To generate adversarial examples under the black-box setting, we analyze the transferability of SemanticAdv in various settings. For each model with different FPRs, we select the successfully attacked adversarial examples from Section 4.1 to construct our evaluation dataset and evaluate these adversarial samples across different models. Table 2(a) illustrates the transferability of SemanticAdv among different models using the same FPRs (G-FPR = T-FPR = 10^-3). Table 2(b) illustrates the results with different FPRs for generation and evaluation (G-FPR = 10^-4 and T-FPR = 10^-3). As shown in Table 2(a), adversarial examples generated against models trained with the softmax loss exhibit stronger transferability than those generated against models trained with the cosine loss. We conducted the same experiment with CW and found that it has weaker transferability than our SemanticAdv (results in brackets in Table 2).

Table 2: Transferability of SemanticAdv: cell (i, j) shows the attack success rate of adversarial examples generated against the j-th model and evaluated on the i-th model. Results of CW are listed in brackets. (a) Results generated with G-FPR = 10^-3 and T-FPR = 10^-3; (b) results generated with G-FPR = 10^-4 and T-FPR = 10^-3.

(a)
M_test \ M_opt   R-50-S          R-101-S         R-50-C          R-101-C
R-50-S           1.000 (1.000)   0.108 (0.032)   0.023 (0.007)   0.018 (0.005)
R-101-S          0.169 (0.029)   1.000 (1.000)   0.030 (0.009)   0.032 (0.011)
R-50-C           0.166 (0.054)   0.202 (0.079)   1.000 (1.000)   0.048 (0.020)
R-101-C          0.120 (0.034)   0.236 (0.080)   0.040 (0.017)   1.000 (1.000)

(b)
M_test \ M_opt   R-50-S          R-101-S
R-50-S           1.000 (1.000)   0.862 (0.530)
R-101-S          0.874 (0.422)   1.000 (1.000)
R-50-C           0.693 (0.347)   0.837 (0.579)
R-101-C          0.617 (0.218)   0.888 (0.617)
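The evaluation protocol behind Table 2 can be sketched as follows. The model names match the paper, but the per-example outcomes below are made up for illustration:

```python
import numpy as np

# succeeded[(m_test, m_opt)][k] = 1 if the k-th adversarial example crafted
# against m_opt also fools m_test (illustrative values, not the paper's data).
model_names = ["R-50-S", "R-101-S"]
succeeded = {
    ("R-50-S",  "R-50-S"):  np.array([1, 1, 1, 1]),  # white-box: always succeeds
    ("R-101-S", "R-50-S"):  np.array([1, 0, 0, 0]),
    ("R-50-S",  "R-101-S"): np.array([0, 1, 0, 0]),
    ("R-101-S", "R-101-S"): np.array([1, 1, 1, 1]),
}

# transfer[i, j]: success rate of examples generated against model j,
# evaluated on model i -- the cell layout of Table 2.
transfer = np.array([[succeeded[(mi, mj)].mean() for mj in model_names]
                     for mi in model_names])
```

The diagonal is 1.0 by construction, since only examples that succeeded in the white-box setting enter the evaluation set.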
As Table 2(b) illustrates, adversarial examples generated against a model with the smaller G-FPR = 10^-4 exhibit a strong attack success rate when evaluated with the larger T-FPR = 10^-3. In particular, we found that adversarial examples generated against R-101-S have the best attack performance on the other models. These findings motivate the query-free black-box API attack detailed in the following paragraph.

Query-free black-box API attack. In this experiment, we generate adversarial examples against R-101-S with G-FPR = 10^-3 (kappa = 1.24), G-FPR = 10^-4 (kappa = 0.60), and G-FPR < 10^-4 (kappa = 0.30), respectively. We evaluate our algorithm on two industry-level face verification APIs, Face++ and AliYun. Since attack transferability has never been explored in concurrent work on semantic adversarial examples, we use L_p-bounded pixel-wise methods (CW [10], MI-FGSM [16], M-DI^2-FGSM [73]) as our baselines. We also introduce a much stronger baseline by first performing attribute-conditioned image editing and then running the CW attack on the edited images, which we refer to as StarGAN+CW. Compared to CW, the latter two baselines (MI-FGSM and M-DI^2-FGSM) employ dedicated techniques to improve transferability; we adopt the ensemble version of MI-FGSM [16] following the original paper. As shown in Table 3, our proposed SemanticAdv achieves a much higher attack success rate than the baselines on both APIs under all FPR thresholds (e.g., our adversarial examples generated with G-FPR < 10^-4 achieve a 67.69% attack success rate on the Face++ platform with T-FPR = 10^-3). In addition, we found that a lower G-FPR achieves a higher attack success rate on both APIs at the same T-FPR (see our supplementary material for details).

SemanticAdv against defense methods.
We evaluate the strength of the proposed attack by testing it against five existing defense methods: feature squeezing [74], blurring [38], JPEG [17], AMI [66], and adversarial training [42]. Figure 5 shows that SemanticAdv is more robust against the pixel-wise defense methods than CW; the same G-FPR and T-FPR are used for evaluation. Both SemanticAdv and CW achieve a high attack success rate when T-FPR = 10^-3, while SemanticAdv marginally outperforms CW when T-FPR goes down to 10^-4. While these defense methods have proven effective against CW attacks on classifiers trained on ImageNet [36], our results indicate that they are still vulnerable in the face verification setting with a small G-FPR.

Table 3: Quantitative analysis of the query-free black-box attack. We use ResNet-101 optimized with the softmax loss for evaluation and report the attack success rate (%) on two online face verification platforms. Note that for the PGD-based attacks, we adopt MI-FGSM (epsilon = 8) [16] and M-DI^2-FGSM (epsilon = 8) [73], respectively. For CW, StarGAN+CW, and SemanticAdv, we generate adversarial samples with G-FPR < 10^-4.

API name            Face++                          AliYun
Attacker / Metric   T-FPR = 10^-3   T-FPR = 10^-4   T-FPR = 10^-3   T-FPR = 10^-4
CW [10]             37.24           20.41           18.00           9.50
StarGAN+CW          47.45           26.02           20.00           8.50
MI-FGSM [16]        53.89           30.57           29.50           17.50
M-DI^2-FGSM [73]    56.12           33.67           30.00           18.00
SemanticAdv         67.69           48.21           36.50           19.50

Fig. 5: Quantitative analysis of attacks against several defense methods, including JPEG [17], blurring [38], and feature squeezing [74].

We further evaluate SemanticAdv against the attribute-based defense method AMI [66] by constructing adversarial examples for the pretrained VGG-Face [53] in a black-box manner. From the adversarial examples generated against R-101-S, we use fc7 as the embedding and select the images whose normalized L2 distance to the corresponding benign images is beyond the threshold defined previously. With the benign and adversarial examples, we first extract attribute witnesses from our aligned face images and then leverage them to build an attribute-steered model. When misclassifying 10% of benign inputs as adversarial, it correctly identifies only 8% of the adversarial images from SemanticAdv and 12% from CW. Moreover, we evaluate SemanticAdv against an existing adversarial-training-based defense (the detailed setting is presented in the supplementary material). We find that the accuracy of the adversarially trained model is 10% against adversarial examples generated by SemanticAdv, versus 46.7% against those generated by PGD [42]. This indicates that existing adversarial-training-based defenses are less effective against SemanticAdv, which further demonstrates that our SemanticAdv identifies an unexplored research area beyond previous L_p-based attacks.
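The pixel-wise defenses above operate purely on the input image. Minimal single-channel sketches with the parameters reported in Appendix A.1 (4-bit squeezing; 3 x 3 Gaussian blur with sigma = 1) follow; JPEG re-encoding is omitted since it needs an image codec.

```python
import numpy as np

def squeeze_bits(img, bits=4):
    """Feature squeezing [74]: quantize 8-bit channel values down to `bits` bits."""
    levels = 2 ** bits - 1
    return np.round(img.astype(float) / 255.0 * levels) / levels * 255.0

def gaussian_blur3(img, sigma=1.0):
    """Blurring defense [38]: 3x3 Gaussian smoothing of a 2D grayscale image."""
    ax = np.array([-1.0, 0.0, 1.0])
    k1 = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k1 /= k1.sum()
    kernel = np.outer(k1, k1)                  # normalized separable 3x3 kernel
    padded = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(3):                        # direct 3x3 convolution
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Both transforms discard the high-frequency detail that pixel-wise perturbations rely on, which is consistent with Figure 5, where they hurt CW more than SemanticAdv.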
Fig. 6: Qualitative results on attacking the face landmark detection model.

Fig. 7: Qualitative results on attacking the street-view semantic segmentation model.

4.3 SemanticAdv on Face Landmark Detection

We evaluate the effectiveness of SemanticAdv on face landmark detection under two attack tasks, "Rotating Eyes" and "Out of Region". For the "Rotating Eyes" task, we rotate the coordinates of the eyes in the image counter-clockwise by 90 degrees. For the "Out of Region" task, we set a target bounding box and attempt to push all landmark points outside the box. Figure 6 indicates that our method is applicable to attacking landmark detection models.
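The "Rotating Eyes" target can be constructed by rotating the eye-landmark coordinates 90 degrees counter-clockwise. A minimal sketch follows; rotating about the points' centroid is our assumption, since the pivot is not specified here:

```python
import numpy as np

def rotate_ccw_90(points, center=None):
    """Rotate 2D landmark coordinates 90 degrees counter-clockwise.
    `center` defaults to the centroid of the points (an assumption;
    any fixed pivot could be substituted)."""
    pts = np.asarray(points, dtype=float)
    if center is None:
        center = pts.mean(axis=0)
    rel = pts - center
    # 90-degree CCW rotation in standard math axes: (x, y) -> (-y, x).
    # With image coordinates (y pointing down), the same formula appears
    # clockwise on screen.
    rotated = np.stack([-rel[:, 1], rel[:, 0]], axis=1)
    return rotated + center
```

The rotated coordinates then serve as the target landmarks that the attack objective L(.) drives the detector's predictions toward.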
4.4 SemanticAdv on Street-view Semantic Segmentation

We further demonstrate the applicability of our SemanticAdv beyond the face domain by generating adversarial perturbations on street-view images. Figure 7 illustrates the adversarial examples for semantic segmentation. In the first example, we select the leftmost pedestrian as the target object instance and insert another car into the scene to attack it. The segmentation model is successfully fooled into neglecting the pedestrian (see the last column), although the pedestrian still exists in the scene (see the second-to-last column). In the second example, we insert an adversarial car into the scene with SemanticAdv, and the cyclist is recognized as a pedestrian by the segmentation model.

5 Conclusions

Overall, we presented SemanticAdv, a novel attack method capable of generating semantically meaningful adversarial perturbations guided by a single semantic attribute. Compared to existing methods, SemanticAdv works in a more controllable fashion. Experimental evaluations on face verification and landmark detection demonstrate several unique properties of the attack, including transferability. We believe this work will open up new research opportunities and challenges in the field of adversarial learning; for instance, how to leverage semantic information to defend against such attacks will lead to potential new discussions.

Acknowledgement

This work was supported in part by the National Science Foundation under Grants CNS-1422211, CNS-1616575, and IIS-1617767, DARPA under Grant 00009970, and a Google PhD Fellowship to X. Yan.

References

1. Alibaba Cloud Computing Co. Ltd. https://help.aliyun.com/knowledge_detail/53535.html
2. Megvii Technology Co. Ltd. https://console.faceplusplus.com/documents/5679308
3.
Bau, D., Zhu, J.Y., Strobelt, H., Zhou, B., Tenenbaum, J.B., Freeman, W.T., Torralba, A.: GAN dissection: Visualizing and understanding generative adversarial networks. arXiv preprint arXiv:1811.10597 (2018)
4. Bengio, Y., Mesnil, G., Dauphin, Y., Rifai, S.: Better mixing via deep representations. In: ICML (2013)
5. Bhattad, A., Chong, M.J., Liang, K., Li, B., Forsyth, D.: Unrestricted adversarial examples via semantic manipulation. In: ICLR (2020)
6. Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. In: ICLR (2019)
7. Brown, T.B., Carlini, N., Zhang, C., Olsson, C., Christiano, P., Goodfellow, I.: Unrestricted adversarial examples. arXiv preprint arXiv:1809.08352 (2018)
8. Bulat, A., Tzimiropoulos, G.: Binarized convolutional landmark localizers for human pose estimation and face alignment with limited resources. In: ICCV. pp. 3706–3714 (2017)
9. Bulat, A., Tzimiropoulos, G.: How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks). In: ICCV (2017)
10. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: IEEE Symposium on Security and Privacy (S&P). IEEE (2017)
11. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(4), 834–848 (2017)
12. Choi, Y., Choi, M., Kim, M., Ha, J.W., Kim, S., Choo, J.: StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In: CVPR (2018)
13. Cisse, M., Adi, Y., Neverova, N., Keshet, J.: Houdini: Fooling deep structured prediction models. In: NIPS (2017)
14.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The Cityscapes dataset for semantic urban scene understanding. In: CVPR. pp. 3213–3223 (2016)
15. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: Additive angular margin loss for deep face recognition. In: CVPR. pp. 4690–4699 (2019)
16. Dong, Y., Liao, F., Pang, T., Su, H., Zhu, J., Hu, X., Li, J.: Boosting adversarial attacks with momentum. In: CVPR. pp. 9185–9193 (2018)
17. Dziugaite, G.K., Ghahramani, Z., Roy, D.M.: A study of the effect of JPG compression on adversarial images. arXiv preprint arXiv:1608.00853 (2016)
18. Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Madry, A.: A rotation and a translation suffice: Fooling CNNs with simple transformations. arXiv preprint arXiv:1712.02779 (2017)
19. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR. IEEE (2009)
20. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS (2014)
21. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2014)
22. Guo, Y., Zhang, L., Hu, Y., He, X., Gao, J.: MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In: ECCV. Springer (2016)
23. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
24. Hong, S., Yan, X., Huang, T.S., Lee, H.: Learning hierarchical semantic image manipulation through structured representations. In: NeurIPS (2018)
25.
Huang, G.B., Mattar, M., Berg, T., Learned-Miller, E.: Labeled Faces in the Wild: A database for studying face recognition in unconstrained environments. In: Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition (2008)
26. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR. pp. 1125–1134 (2017)
27. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: ECCV. Springer (2016)
28. Johnson, J., Gupta, A., Fei-Fei, L.: Image generation from scene graphs. In: CVPR. pp. 1219–1228 (2018)
29. Joshi, A., Mukherjee, A., Sarkar, S., Hegde, C.: Semantic adversarial attacks: Parametric transformations that fool deep classifiers. arXiv preprint (2019)
30. Kang, D., Sun, Y., Hendrycks, D., Brown, T., Steinhardt, J.: Testing robustness against unforeseen adversaries. arXiv preprint arXiv:1908.08016 (2019)
31. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: ICLR (2018)
32. Kemelmacher-Shlizerman, I., Seitz, S.M., Miller, D., Brossard, E.: The MegaFace benchmark: 1 million faces for recognition at scale. In: CVPR. pp. 4873–4882 (2016)
33. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. In: ICLR (2015)
34. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)
35. Klare, B.F., Klein, B., Taborsky, E., Blanton, A., Cheney, J., Allen, K., Grother, P., Mah, A., Jain, A.K.: Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A. In: CVPR (2015)
36. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
37. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: ICCV. IEEE (2009)
38.
Li, X., Li, F.: Adversarial examples detection in deep networks with convolutional filter statistics. In: ICCV. pp. 5764–5772 (2017)
39. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: NIPS (2017)
40. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV (2015)
41. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR. pp. 3431–3440 (2015)
42. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: ICLR (2018)
43. Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: CVPR (2015)
44. Mansimov, E., Parisotto, E., Ba, J.L., Salakhutdinov, R.: Generating images from captions with attention. In: ICLR (2015)
45. Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: DeepFool: A simple and accurate method to fool deep neural networks. In: CVPR. pp. 2574–2582 (2016)
46. Moschoglou, S., Papaioannou, A., Sagonas, C., Deng, J., Kotsia, I., Zafeiriou, S.: AgeDB: The first manually collected, in-the-wild age database. In: CVPR Workshops. pp. 51–59 (2017)
47. Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. In: ECCV. pp. 483–499. Springer (2016)
48. Odena, A., Olah, C., Shlens, J.: Conditional image synthesis with auxiliary classifier GANs. In: ICML. JMLR (2017)
49. Van den Oord, A., Kalchbrenner, N., Espeholt, L., Vinyals, O., Graves, A., et al.: Conditional image generation with PixelCNN decoders. In: NIPS (2016)
50. Oord, A.v.d., Kalchbrenner, N., Kavukcuoglu, K.: Pixel recurrent neural networks.
In: ICML (2016)
51. Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: IEEE European Symposium on Security and Privacy (EuroS&P) (2016)
52. Parikh, D., Grauman, K.: Relative attributes. In: ICCV. IEEE (2011)
53. Parkhi, O.M., Vedaldi, A., Zisserman, A., et al.: Deep face recognition. In: BMVC. vol. 1, p. 6 (2015)
54. Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic differentiation in PyTorch. In: NIPS Autodiff Workshop (2017)
55. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: ICLR (2015)
56. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis. In: ICML (2016)
57. Reed, S., Sohn, K., Zhang, Y., Lee, H.: Learning to disentangle factors of variation with manifold interaction. In: ICML (2014)
58. Sagonas, C., Tzimiropoulos, G., Zafeiriou, S., Pantic, M.: 300 Faces In-the-Wild Challenge: The first facial landmark localization challenge. In: ICCV Workshops (2013)
59. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: A unified embedding for face recognition and clustering. In: CVPR. pp. 815–823 (2015)
60. Sengupta, S., Chen, J.C., Castillo, C., Patel, V.M., Chellappa, R., Jacobs, D.W.: Frontal to profile face verification in the wild. In: WACV. pp. 1–9. IEEE (2016)
61. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
62. Song, Y., Shu, R., Kushman, N., Ermon, S.: Constructing unrestricted adversarial examples with generative models. In: NeurIPS. pp. 8312–8323 (2018)
63.
Sun, Y., Wang, X., Tang, X.: Deep learning face representation from predicting 10,000 classes. In: CVPR (2014)
64. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., et al.: Going deeper with convolutions. In: CVPR (2015)
65. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. arXiv preprint (2013)
66. Tao, G., Ma, S., Liu, Y., Zhang, X.: Attacks meet interpretability: Attribute-steered detection of adversarial samples. In: NeurIPS (2018)
67. Wang, H., Wang, Y., Zhou, Z., Ji, X., Gong, D., Zhou, J., Li, Z., Liu, W.: CosFace: Large margin cosine loss for deep face recognition. In: CVPR (2018)
68. Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: CVPR (2018)
69. Wong, E., Schmidt, F.R., Kolter, J.Z.: Wasserstein adversarial examples via projected Sinkhorn iterations. In: ICML (2019)
70. Xiao, C., Deng, R., Li, B., Yu, F., Liu, M., Song, D.: Characterizing adversarial examples based on spatial consistency information for semantic segmentation. In: ECCV (2018)
71. Xiao, C., Li, B., Zhu, J.Y., He, W., Liu, M., Song, D.: Generating adversarial examples with adversarial networks. In: IJCAI (2018)
72. Xiao, C., Zhu, J.Y., Li, B., He, W., Liu, M., Song, D.: Spatially transformed adversarial examples. In: ICLR (2018)
73. Xie, C., Zhang, Z., Zhou, Y., Bai, S., Wang, J., Ren, Z., Yuille, A.L.: Improving transferability of adversarial examples with input diversity. In: CVPR. pp. 2730–2739 (2019)
74. Xu, W., Evans, D., Qi, Y.: Feature squeezing: Detecting adversarial examples in deep neural networks. arXiv preprint arXiv:1704.01155 (2017)
75.
Yan, X., Yang, J., Sohn, K., Lee, H.: Attribute2Image: Conditional image generation from visual attributes. In: ECCV. Springer (2016)
76. Yao, S., Hsu, T.M., Zhu, J.Y., Wu, J., Torralba, A., Freeman, B., Tenenbaum, J.: 3D-aware scene manipulation via inverse graphics. In: NeurIPS. pp. 1887–1898 (2018)
77. Yu, F., Koltun, V., Funkhouser, T.: Dilated residual networks. In: CVPR (2017)
78. Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., Metaxas, D.N.: StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In: ICCV (2017)
79. Zhang, M., Zhang, Y., Zhang, L., Liu, C., Khurshid, S.: DeepRoad: GAN-based metamorphic testing and input validation framework for autonomous driving systems. In: ASE. pp. 132–142 (2018)
80. Zhang, X., Yang, L., Yan, J., Lin, D.: Accelerated training for massive classification via dynamic class selection. In: AAAI (2018)
81. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)
82. Zhu, X., Lei, Z., Liu, X., Shi, H., Li, S.Z.: Face alignment across large poses: A 3D solution. In: CVPR (2016)

A Implementation details

In this section, we provide the implementation details used in our experiments. We implement our SemanticAdv in PyTorch [54]. Our implementation will be made available after the final decision.

A.1 Face identity verification

We use the Adam optimizer [33] to generate adversarial examples for both our SemanticAdv and the pixel-wise attack method CW [10]. More specifically, we run the optimization for up to 200 steps with a fixed updating rate of 0.05 under G-FPR < 10^-4.
In cases with a slightly higher G-FPR, we run the optimization for up to 500 steps with a fixed updating rate of 0.01. For the pixel-wise attack method CW, we use an additional pixel reconstruction objective with its weight set to 5, and run the optimization for up to 1,000 steps with a fixed updating rate of 10^-3.

Evaluation metrics. To evaluate the performance of SemanticAdv under different attributes, we consider the following three metrics:
– Best: the attack is successful as long as at least one of the 17 attributes can be successfully attacked;
– Average: the average attack success rate over the 17 attributes for the same face identity;
– Worst: the attack is successful only if all 17 attributes can be successfully attacked.

Note that we use the Best metric for a fair comparison with the attack success rates reported by existing pixel-wise attack methods, since the ability to attack through different attributes is one of SemanticAdv's advantages. In practice, both our SemanticAdv (Best) and CW achieve a 100% attack success rate. In addition, we report the performance under the Average and Worst metrics, which enables us to analyze adversarial robustness with respect to particular semantic attributes.

Pixel-wise defense methods. Feature squeezing [74] is a simple but effective method that reduces color bit depth to remove adversarial effects; we compress images represented with 8 bits per channel down to 4 bits per channel. For blurring [38], we use a 3 x 3 Gaussian kernel with standard deviation 1 to smooth the adversarial perturbations. JPEG [17] leverages compression and decompression to remove adversarial perturbations; we set the compression ratio to 0.75 in our experiments.

A.2 Face landmark detection

We use the Adam optimizer [33] to generate SemanticAdv against the face landmark detection model.
Specifically, we run the optimization for up to 2,000 steps with a fixed updating rate of 0.05 and the balancing factor lambda set to 0.01 (see Eq. 3 in the main paper).

Evaluation metrics. We apply a different metric to each of the two adversarial attack tasks. For the "Rotating Eyes" task, we use the widely adopted Normalized Mean Error (NME) [9]:

    r_NME = (1/N) * sum_{k=1}^{N} ||p_k - \hat{p}_k||_2 / sqrt(W_B * H_B),    (7)

where p_k denotes the k-th ground-truth landmark, \hat{p}_k denotes the k-th predicted landmark, and sqrt(W_B * H_B) is the square-root area of the ground-truth bounding box, with W_B and H_B its width and height. For the "Out of Region" task, we consider the attack successful if the landmark predictions fall outside a pre-defined centering region on the portrait image. We introduce a metric that reflects the portion of landmarks outside this centering region: r_OR = N_out / N_total, where N_out denotes the number of predicted landmarks outside the pre-defined bounding box and N_total denotes the total number of landmarks.

A.3 Ablation study: feature-space interpolation

We include an ablation study on feature-space interpolation by analyzing attack success rates with different feature maps in the main paper. Figure 8 illustrates the choices of StarGAN feature maps, and Table 1 in the main paper reports the attack success rate on R-101-S. As shown in Figure 8, we use f_i to denote the feature map after the i-th up-sampling operation, with f_0 the feature map before any up-sampling operation. The results demonstrate that samples generated by interpolating on f_0 achieve the highest success rate. Since f_0 is the feature map just before the decoder, it still embeds rich semantic information in the feature space. We therefore adopt f_0 for interpolation in our experiments.
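The two landmark-detection metrics defined in Sec. A.2 (r_NME from Eq. 7 and the out-of-region ratio r_OR) translate directly into code; a small numpy sketch:

```python
import numpy as np

def nme(pred, gt, box_w, box_h):
    """Normalized Mean Error (Eq. 7): mean per-landmark L2 error,
    normalized by the square-root area of the ground-truth box."""
    err = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=1)
    return err.mean() / np.sqrt(box_w * box_h)

def out_of_region_ratio(pred, box):
    """r_OR = N_out / N_total for a centering box given as (x0, y0, x1, y1)."""
    pred = np.asarray(pred)
    x0, y0, x1, y1 = box
    inside = ((pred[:, 0] >= x0) & (pred[:, 0] <= x1) &
              (pred[:, 1] >= y0) & (pred[:, 1] <= y1))
    return 1.0 - inside.mean()
```

An "Out of Region" attack succeeds fully when r_OR reaches 1, i.e., every predicted landmark lies outside the centering box.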
Fig. 8: Illustration of the features used in the StarGAN encoder-decoder architecture.

B Additional quantitative results

B.1 Face identity verification

Benchmark performance. We provide additional information about the ResNet models used in the experiments. Table 4 reports their performance on multiple face identity verification benchmarks, including the Labeled Faces in the Wild (LFW) dataset [25], the AgeDB-30 dataset [46], and the Celebrities in Frontal-Profile (CFP) dataset [60]. LFW [25] is the de facto standard test set for face verification under unconstrained conditions, containing 13,233 face images of 5,749 identities. AgeDB [46] contains 12,240 images of 440 identities; AgeDB-30 is its most challenging subset for evaluating face verification models, as the large variations in age make models perform worse on it than on LFW. CFP [60] consists of 500 identities, each with 10 frontal and 4 profile images. Although good performance has been achieved on the Frontal-to-Frontal (CFP-FF) test protocol, the Frontal-to-Profile (CFP-FP) protocol remains challenging, as most face training sets contain very few profile images. Table 4 indicates that the face verification models we use achieve state-of-the-art performance on all benchmarks.

Table 4: The verification accuracy (%) of the ResNet models on multiple face recognition datasets, including LFW, AgeDB-30, and CFP.

M / benchmark   LFW     AgeDB-30   CFP-FF   CFP-FP
R-50-S          99.27   94.15      99.26    91.49
R-101-S         99.42   95.93      99.57    95.07
R-50-C          99.38   95.08      99.24    90.24
R-101-C         99.67   95.58      99.57    92.71

Thresholds for identity verification.
To decide whether two portrait images belong to the same identity, we use the normalized L2 distance between face features and set the FPR thresholds accordingly, a commonly used procedure when evaluating face verification models [35, 32]. Table 5 lists the threshold values used in our experiments when determining whether two portrait images belong to the same identity.

Table 5: The threshold values for face identity verification.

    FPR / M     R-50-S   R-101-S   R-50-C   R-101-C
    10^-3       1.181    1.244     1.447    1.469
    3×10^-4     1.058    1.048     1.293    1.242
    10^-4       0.657    0.597     0.864    0.809

Quantitative analysis. Combining the results from Table 6 and Figure 4 in the main paper, we see that the face verification models used in our experiments have different levels of robustness across attributes. For example, face verification models are more robust against local shape variations than against color variations; e.g., pale skin has a higher attack success rate than mouth open. We believe these discoveries will help the community further understand the properties of face verification models.

Table 6 shows the overall performance (accuracy) of the face verification models and the attack success rates of SemanticAdv and CW. As shown in Table 6, although the face model trained with the cos objective achieves higher face recognition performance, it is more vulnerable to adversarial attacks than the model trained with the softmax objective. Table 7 shows that the intermediate results of SemanticAdv before adversarial perturbation cannot attack successfully, which indicates that the success of SemanticAdv comes from adding adversarial perturbations through interpolation.

Table 6: Quantitative results of identity verification (%). It shows the accuracy of the face verification models and the attack success rates of SemanticAdv and CW.
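The decision rule can be sketched as follows; a minimal illustration in which the face feature extractor is assumed external, and the threshold values are taken from Table 5:

```python
import numpy as np

# Thresholds from Table 5 (normalized L2 distance), indexed by
# model name and FPR level.
THRESHOLDS = {
    "R-50-S":  {1e-3: 1.181, 3e-4: 1.058, 1e-4: 0.657},
    "R-101-S": {1e-3: 1.244, 3e-4: 1.048, 1e-4: 0.597},
    "R-50-C":  {1e-3: 1.447, 3e-4: 1.293, 1e-4: 0.864},
    "R-101-C": {1e-3: 1.469, 3e-4: 1.242, 1e-4: 0.809},
}

def same_identity(feat_a, feat_b, model="R-101-S", fpr=1e-3):
    """Declare a match when the normalized L2 distance between
    L2-normalized face features falls below the FPR threshold."""
    a = feat_a / np.linalg.norm(feat_a)
    b = feat_b / np.linalg.norm(feat_b)
    return np.linalg.norm(a - b) < THRESHOLDS[model][fpr]
```

A lower FPR gives a stricter threshold, which is why the attack success rates in Table 6 drop as the FPR decreases.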
    G-FPR     Metrics / M              R-50-S   R-101-S   R-50-C   R-101-C
    10^-3     Verification Accuracy    98.36    98.78     98.63    98.84
              SemanticAdv (Best)       100.00   100.00    100.00   100.00
              SemanticAdv (Worst)      91.95    93.98     99.53    99.77
              SemanticAdv (Average)    98.98    99.29     99.97    99.99
              CW                       100.00   100.00    100.00   100.00
    3×10^-4   Verification Accuracy    97.73    97.97     97.91    97.85
              SemanticAdv (Best)       100.00   100.00    100.00   100.00
              SemanticAdv (Worst)      83.75    79.06     98.98    96.64
              SemanticAdv (Average)    97.72    97.35     99.92    99.72
              CW                       100.00   100.00    100.00   100.00
    10^-4     Verification Accuracy    93.25    92.80     93.43    92.98
              SemanticAdv (Best)       100.00   100.00    100.00   100.00
              SemanticAdv (Worst)      33.59    19.84     67.03    48.67
              SemanticAdv (Average)    83.53    76.64     95.57    91.13
              CW                       100.00   100.00    100.00   100.00

B.2 Face landmark detection

We present the quantitative results of SemanticAdv on the face landmark detection model in Table 8, covering two adversarial tasks, namely "Rotating Eyes" and "Out of Region". We observe that our method attacks landmark detection models effectively. For certain attributes such as "Eyeglasses" and "Pale Skin", SemanticAdv achieves reasonably good performance.

Table 7: Attack success rate of the intermediate outputs of SemanticAdv (%). x_0, G(x_0, c), and G(x_0, c_new) are the intermediate results of our method before adversarial perturbation.

    G-FPR     Metrics / M             R-50-S   R-101-S   R-50-C   R-101-C
    10^-3     x_0                     0.00     0.00      0.08     0.00
              G(x_0, c)               0.00     0.00      0.00     0.23
              G(x_0, c_new) (Best)    0.16     0.08      0.16     0.31
    3×10^-4   x_0                     0.00     0.00      0.00     0.00
              G(x_0, c)               0.00     0.00      0.00     0.00
              G(x_0, c_new) (Best)    0.00     0.00      0.00     0.00
    10^-4     x_0                     0.00     0.00      0.00     0.00
              G(x_0, c)               0.00     0.00      0.00     0.00
              G(x_0, c_new) (Best)    0.00     0.00      0.00     0.00

Table 8: Quantitative results on face landmark detection (%). The two rows show the measured ratios (lower is better) for the "Rotating Eyes" and "Out of Region" tasks, respectively.
    Tasks (Metrics)                Augmented attributes
                  Pristine   Blond Hair   Young   Eyeglasses   Rosy Cheeks   Smiling   Arched Eyebrows   Bangs   Pale Skin
    r_NME ↓       28.04      14.03        17.28   8.58         13.24         19.21     23.42             15.99   10.72
    r_OR ↓        45.98      17.42        23.04   7.51         16.65         25.44     33.85             20.03   13.51

B.3 User study

We conduct a user study on the adversarial images of SemanticAdv and CW used in the API-attack experiment, together with the original images. The adversarial images are generated with G-FPR < 10^-4 for both methods. We present a pair consisting of an original image and an adversarial image to participants and ask them to rank the two options. The order of the two images is randomized, and the images are displayed for 2 seconds on the screen during each trial. After the images disappear, the participants have unlimited time to select the more reasonable-looking image according to their perception. To maintain the quality of the collected responses, each participant can conduct at most 50 trials, and each adversarial image is shown to 5 different participants. We present the images used for the user study in Figure 9. In total, we collect 2,620 annotations from 77 participants. In 39.14 ± 1.96% of trials the adversarial images generated by SemanticAdv are selected as the more reasonable-looking, while in 30.27 ± 1.96% of trials the adversarial images generated by CW are selected as the more reasonable-looking. This indicates that our semantic adversarial examples look perceptually more reasonable than those of CW. Additionally, we also conduct the user study with a larger G-FPR = 10^-3. In 45.42 ± 1.96% of trials, the adversarial images generated

Table 9: Transferability of SemanticAdv: cell (i, j) shows the attack success rate of adversarial examples generated against the j-th model and evaluated on the i-th model. Results are generated with G-FPR = 10^-4 and T-FPR = 10^-4.
    M_test / M_opt   R-50-S   R-101-S   R-50-C   R-101-C
    R-50-S           1.000    0.005     0.000    0.000
    R-101-S          0.000    1.000     0.000    0.000
    R-50-C           0.000    0.000     1.000    0.000
    R-101-C          0.000    0.000     0.000    1.000

by SemanticAdv are selected as the more reasonable-looking, which is very close to random guessing (50%).

Fig. 9: Qualitative comparisons among the ground truth, pixel-wise adversarial examples generated by CW, and our proposed SemanticAdv. Here, we present results with G-FPR < 10^-4 so that the perturbations are visible.

B.4 Semantic attack transferability

In Table 9, we present the quantitative results of attack transferability under the setting with G-FPR = 10^-4 and T-FPR = 10^-4. We observe that with a stricter testing criterion (lower T-FPR) for the verification model, transferability becomes lower across models.

To further showcase that our SemanticAdv is non-trivially different from a pixel-wise attack added on top of semantic image editing, we provide one additional baseline, StarGAN+CW, and evaluate its attack transferability. This baseline first performs semantic image editing using the (non-adversarial) StarGAN model and then conducts standard L_p CW attacks on the generated images. As

Table 10: Transferability of StarGAN+CW: cell (i, j) shows the attack success rate of adversarial examples generated against the j-th model and evaluated on the i-th model. Results of SemanticAdv are listed in brackets.
    M_test / M_opt   R-101-S
    R-50-S           0.035 (0.108)
    R-101-S          1.000 (1.000)
    R-50-C           0.145 (0.202)
    R-101-C          0.085 (0.236)
    (a) G-FPR = 10^-3, T-FPR = 10^-3

    M_test / M_opt   R-101-S
    R-50-S           0.615 (0.862)
    R-101-S          1.000 (1.000)
    R-50-C           0.570 (0.837)
    R-101-C          0.695 (0.888)
    (b) G-FPR = 10^-4, T-FPR = 10^-3

shown in Table 10, the StarGAN+CW baseline has a noticeable performance gap to our proposed SemanticAdv. This also justifies that our SemanticAdv produces novel adversarial examples that cannot simply be achieved by combining an attribute-conditioned image editing model with L_p-bounded perturbations.

B.5 Query-free black-box API attack

Fig. 10: Illustration of our SemanticAdv on a real-world face verification platform (editing on pale skin). Note that the confidence denotes the likelihood that the two faces belong to the same person.

In Table 11, we present the results of SemanticAdv performing query-free black-box attacks on three online face verification platforms. SemanticAdv outperforms

Table 11: Quantitative analysis of the query-free black-box attack. We use ResNet-101 optimized with the softmax loss for evaluation and report the attack success rate (%). Note that the Microsoft Azure API does not provide accept thresholds for different T-FPRs; we therefore use the provided likelihood threshold of 0.5 to determine whether two faces belong to the same person.
    API name                       Face++           AliYun           Azure
    Metric                         T-FPR            T-FPR            Likelihood
    Attacker / Metric value        10^-3    10^-4   10^-3    10^-4   0.5
    Original x                     2.04     0.51    0.50     0.00    0.00
    Generated x_new                4.21     0.53    0.50     0.00    0.00
    CW (G-FPR = 10^-3)             9.18     2.04    2.00     0.50    0.00
    StarGAN+CW (G-FPR = 10^-3)     15.9     3.08    3.50     1.00    0.00
    SemanticAdv (G-FPR = 10^-3)    20.00    4.10    4.00     0.50    0.00
    CW (G-FPR = 10^-4)             28.57    10.17   10.50    2.50    1.04
    StarGAN+CW (G-FPR = 10^-4)     35.38    14.36   12.50    3.50    1.05
    SemanticAdv (G-FPR = 10^-4)    58.25    31.44   24.00    10.50   5.73
    CW                             37.24    20.41   18.00    9.50    3.09
    StarGAN+CW                     47.45    26.02   20.00    8.50    5.56
    MI-FGSM [16]                   53.89    30.57   29.50    17.50   10.82
    M-DI^2-FGSM [73]               56.12    33.67   30.00    18.00   12.04
    SemanticAdv (G-FPR < 10^-4)    67.69    48.21   36.5     19.5    15.63

CW and StarGAN+CW on all APIs under all FPR thresholds. In addition, under the same T-FPR, we achieve higher attack success rates on the APIs using samples generated with a lower G-FPR than using samples generated with a higher G-FPR. The original x and the generated x_new serve as reference points for the performance of the online face verification platforms. In Figure 10, we also show several examples of our API attack on the Microsoft Azure face verification system, which further demonstrates the effectiveness of our approach.

B.6 SemanticAdv against adversarial training

We evaluate our SemanticAdv against an existing adversarial-training-based defense method [42]. In detail, we randomly sample 10 persons from CelebA [40] and randomly split the sampled dataset into training, validation, and testing sets in proportions of 80%, 10%, and 10%, respectively. We train a ResNet-50 [23] to identify these face images, following the standard face recognition training pipeline [63]. As CelebA [40] does not contain enough images for each person, we finetune our model from a model pretrained on MS-Celeb-1M [22, 80].
We train the robust model using the adversarial-training-based method of [42], following the same setting as in [42]: we use a 7-step PGD L∞ attack to generate the adversarial examples that solve the inner maximization problem during adversarial training. At test time, we evaluate using adversarial examples generated by 20-step PGD attacks. The perturbation is bounded by 8 pixels (in the range [0, 255]) in terms of L∞ distance.

Table 12: Accuracy of the standard model (without adversarial training) and the robust model (with adversarial training).

    Training Method / Attack   Benign   PGD     SemanticAdv
    Standard                   93.3%    0%      0%
    Robust [42]                86.7%    46.7%   10%

As shown in Table 12, the robust model achieves 10% accuracy against the adversarial examples generated by SemanticAdv, versus 46.7% against the adversarial examples generated by PGD [42]. This indicates that the existing adversarial-training-based defense method is less effective against SemanticAdv, and further demonstrates that our SemanticAdv identifies an unexplored research area beyond previous L_p-based attacks.
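The PGD attack used in the setup above can be sketched as follows. This is a minimal NumPy illustration of the L∞-bounded multi-step attack, in which `grad_fn` is a hypothetical placeholder for the gradient of the classification loss with respect to the input, and the 2-pixel step size is an assumption (the exact step size is not stated here).

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=8.0, steps=7, step_size=2.0,
             lo=0.0, hi=255.0):
    """Multi-step PGD under an L-infinity bound of eps pixels.

    grad_fn(x_adv) returns the gradient of the classification loss
    w.r.t. the input; the attack ascends the loss while projecting
    back into the eps-ball around x and the valid pixel range.
    """
    # Random start inside the eps-ball, as in standard PGD training
    x_adv = x + np.random.uniform(-eps, eps, size=x.shape)
    x_adv = np.clip(x_adv, lo, hi)
    for _ in range(steps):
        x_adv = x_adv + step_size * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project to eps-ball
        x_adv = np.clip(x_adv, lo, hi)            # valid pixel range
    return x_adv
```

During adversarial training the inner maximization uses 7 steps (`steps=7`), while the test-time evaluation uses 20-step attacks (`steps=20`); SemanticAdv, by contrast, perturbs the attribute-conditioned feature space rather than this pixel-space eps-ball, which is why the PGD-trained model offers little protection against it.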
C Additional visualizations

Fig. 11: Qualitative analysis on single-attribute adversarial attacks (G-FPR = 10^-3).

Fig. 12: Qualitative analysis on single-attribute adversarial attacks (G-FPR = 10^-3).

Fig. 13: Qualitative analysis on single-attribute adversarial attacks (G-FPR = 10^-3).

Fig. 14: Qualitative comparisons between our proposed SemanticAdv (G-FPR = 10^-3) and pixel-wise adversarial examples generated by CW. Along with the adversarial examples, we also provide the corresponding perturbations (residuals) on the right.

Fig.
15: Qualitative analysis on single-attribute adversarial attacks (SemanticAdv with G-FPR = 10^-3). Along with the adversarial examples, we also provide the corresponding perturbations (residuals) on the right.