Verification of Neural Networks: Specifying Global Robustness using Generative Models

Nathanaël Fijalkow (CNRS, LaBRI, Université de Bordeaux, and The Alan Turing Institute, London)
Mohit Kumar Gupta (Indian Institute of Technology Bombay)

Technical report.

Abstract

The success of neural networks across most machine learning tasks and the persistence of adversarial examples have made the verification of such models an important quest. Several techniques have been successfully developed to verify robustness, and are now able to evaluate neural networks with thousands of nodes. The main weakness of this approach is in the specification: robustness is asserted on a validation set consisting of a finite set of examples, i.e. locally. We propose a notion of global robustness based on generative models, which asserts the robustness on a very large and representative set of examples. We show how this can be used for verifying neural networks. In this paper we experimentally explore the merits of this approach, and show how it can be used to construct realistic adversarial examples.

1 Introduction

We consider the task of certifying the correctness of an image classifier, i.e. a system taking as input an image and categorising it. As a main example we will consider the MNIST classification task, which consists in categorising hand-written digits. Our experimental results are later reproduced for the drop-in dataset Fashion-MNIST (Xiao et al. (2017)).

The usual evaluation procedure consists in setting aside a validation set from the dataset and reporting the success percentage of the image classifier on that validation set. With this procedure, it is commonly accepted that the MNIST classification task is solved, with some convolutional networks achieving above 99.7% accuracy (see e.g. Ciregan et al. (2012); Wan et al. (2013)).
Further results suggest that even the best convolutional networks cannot be considered to be robust, given the persistence of adversarial examples: a small perturbation (invisible to the human eye) in images from the dataset is enough to induce misclassification (Szegedy et al. (2014)). This is a key motivation for the verification of neural networks: can we assert the robustness of a neural network, i.e. the absence of adversarial examples? This question has generated a growing interest in the past years at the crossing of different research communities (see e.g. Huang et al. (2017); Katz et al. (2017); Weng et al. (2018); Gehr et al. (2018); Mirman et al. (2018); Gopinath et al. (2018); Katz et al. (2019)), with a range of prototype tools achieving impressive results. The robustness question is formulated as follows: given an image $x$ and $\varepsilon > 0$, are all $\varepsilon$-perturbations of $x$ correctly classified?

We point to a weakness of the formalisation: it is local, meaning it is asserted for a given image $x$ (and then typically checked against a finite set of images). In this paper, we investigate a global approach for specifying the robustness of an image classifier. Let us start from the ultimate robustness objective, which reads:

For every category, for every real-life image of this category and for every perturbation of this image, the perturbed image is correctly classified.

Formalising this raises three questions:

1. How do we quantify over all real-life images?
2. What are perturbed images?
3. How do we effectively check robustness?

In this work we propose a formalisation based on generative models. A generative model is a system taking as input a random noise and generating images; in other words, it represents a probabilistic distribution over images.

Our specification depends on two parameters $(\varepsilon, \delta)$.
Informally, it reads:

An image classifier is $(\varepsilon, \delta)$-robust with respect to a generative model if the probability that for a noise $x$, all $\varepsilon$-perturbations of $x$ generate correctly classified images is at least $1 - \delta$.

The remainder of the paper presents experiments supporting the claims that the global robustness specification has the following important properties.

Global. The first question stated above is about quantifying over all images. The global robustness we propose addresses this point by (implicitly) quantifying over a very large and representative set of images.

Robust. The second question is about the notion of perturbed images. The essence of generative models is to produce images reminiscent of real images (from the dataset); hence testing against images given by a generative model includes the very important perturbation aspect present in the intuitive definition of correctness.

Effective. The third question is about effectivity. We will explain that global robustness can be effectively evaluated for image classifiers built using neural networks.

Related work

Xiao et al. (2018) train generative models for finding adversarial examples, and more specifically introduce a different training procedure (based on a new objective function) whose goal is to produce adversarial examples. Our approach is different in that we use generative models with the usual training procedure and objective, which is to produce a wide range of realistic images.

2 Global Correctness

This section serves as a technical warm-up for the next one: we introduce the notion of global correctness, a step towards our main definition of global robustness.

We use $\mathbb{R}^d$ for representing images with $\|\cdot\|$ the infinity norm over $\mathbb{R}^d$, and let $\mathcal{C}$ be the set of categories, so an image classifier represents a function $C : \mathbb{R}^d \to \mathcal{C}$.
A generative model represents a distribution over images, and in effect is a neural network which takes as input a random noise in the form of a $p$-dimensional vector $x$ and produces an image $G(x)$. Hence it represents a function $G : \mathbb{R}^p \to \mathbb{R}^d$. We typically use a Gaussian distribution for the random noise, written $x \sim \mathcal{N}(0, 1)$.

Our first definition is that of global correctness. It relies on a first key but simple idea, which is to compose a generative model $G$ with an image classifier $C$: we construct a new neural network $C \circ G$ by simply rewiring the output of $G$ to the input of $C$, so $C \circ G$ represents a distribution over categories. Indeed, it takes as input a random noise and outputs a category.

Figure 1: Composition of a generative model with an image classifier

Definition 1 (Global Correctness). Given for each $c \in \mathcal{C}$ a generative model $G_c$ for images of category $c$, we say that the image classifier $C$ is $\delta$-correct with respect to the generative models $(G_c)_{c \in \mathcal{C}}$ if for each $c \in \mathcal{C}$,

$$\mathbb{P}_{x \sim \mathcal{N}(0,1)}\left( C \circ G_c(x) = c \right) \geq 1 - \delta.$$

In words, the probability that for a noise $x$ the image generated (using $G_c$) is correctly classified (by $C$) is at least $1 - \delta$.

Assumptions

Our definition of global correctness hinges on two properties of generative models:

1. generative models produce a wide variety of images,
2. generative models produce (almost only) realistic images.

The first assumption is the reason for the success of generative adversarial networks (GAN) (Goodfellow et al. (2014)). We refer for instance to Karras et al. (2018) and to the attached website thispersondoesnotexist.com for a demo. In our experiments the generative models we used are off-the-shelf generative adversarial networks (Goodfellow et al. (2014)), with 4 hidden layers of respectively 256, 512, 1024, and 784 nodes, producing images of single digits.
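To make Definition 1 concrete, the probability it bounds can be approximated by sampling noises, pushing them through the composed map $C \circ G_c$ and counting correct answers. The following is a minimal sketch under stated assumptions, not the code from the paper's repository: `toy_generator` and `toy_classifier` are hypothetical stand-ins for trained networks, and `compose` and `estimate_correctness` are illustrative names.

```python
import math
import random


def compose(classifier, generator):
    """Return the composed map C∘G: noise vector -> predicted category."""
    return lambda noise: classifier(generator(noise))


def estimate_correctness(classifier, generator, category, noise_dim, n_samples, rng):
    """Estimate P_{x ~ N(0,1)}(C∘G_c(x) = c) by plain Monte Carlo sampling."""
    composed = compose(classifier, generator)
    hits = 0
    for _ in range(n_samples):
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        if composed(noise) == category:
            hits += 1
    return hits / n_samples


# Hypothetical stand-ins, NOT the networks used in the paper.
def toy_generator(noise):
    """Toy 'generator for digit 8': maps 2-d noise to a 2-pixel 'image'."""
    return [noise[0] + 2.0, noise[1]]


def toy_classifier(image):
    """Toy classifier: answers 8 when the first pixel is positive, else 3."""
    return 8 if image[0] > 0.0 else 3


n_samples = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable
accuracy = estimate_correctness(toy_classifier, toy_generator, 8, 2, n_samples, rng)
# Binomial standard error of the estimate; it is at most 1 / (2 * sqrt(n)).
std_error = math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
print(f"estimated accuracy: {accuracy:.4f} +/- {std_error:.4f}")
```

In this toy setting the exact probability is $\Phi(2) \approx 0.977$, so the estimate should land close to that value. With real networks one would replace the Python loop by batched tensor evaluation, but the counting logic stays the same.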
To test the second assumption we performed a first experiment called the manual score experiment. We picked 100 digit images using a generative model and asked 5 individuals to tell for each of them whether they are "near-perfect", "perturbed but clearly identifiable", "hard to identify", or "rubbish", and which digit they represent. The results are that 96 images were correctly identified; among them 63 images were declared "near-perfect" by all individuals, with another 26 including "perturbed but clearly identifiable", and 8 were considered "hard to identify" by at least one individual yet correctly identified. The remaining 4 were "rubbish" or incorrectly identified. It follows that against this generative model, we should require an image classifier to be correct with probability at least 0.89 (i.e. 0.11-correct in the sense of Definition 1), and even 0.96 (i.e. 0.04-correct) to match human perception.

Algorithm

To check whether a classifier is $\delta$-correct, the Monte Carlo integration method is a natural approach: we sample $n$ random noises $x_1, \ldots, x_n$, and count for how many $x_i$'s we have that $C \circ G_c(x_i) = c$. The central limit theorem states that the ratio of positives over $n$ converges to

$$\mathbb{P}_{x \sim \mathcal{N}(0,1)}\left( C \circ G_c(x) = c \right)$$

at rate $1/\sqrt{n}$. It follows that $n = 10^4$ samples gives a $10^{-2}$ precision on this number.

In practice, rather than sampling the random noises independently, we form (large) batches and leverage tensor-based computation, enabling efficient GPU computation.

3 Global Robustness

We introduce the notion of global robustness, which gives stronger guarantees than global correctness. Indeed, it includes the notion of perturbations for images.

The usual notion of robustness, which we call here local robustness, can be defined as follows.

Definition 2 (Local Robustness). We say that the image classifier $C$ is $\varepsilon$-robust around the image $y \in \mathbb{R}^d$ of category $c$ if

$$\forall y', \quad \|y - y'\| \leq \varepsilon \implies C(y') = c.$$
In words, all $\varepsilon$-perturbations of $y$ are correctly classified (by $C$).

One important aspect in this definition is the choice of the norm for the perturbations (here we use the infinity norm). We ignore this as it will not play a role in our definition of robustness. A wealth of techniques have been developed for checking local robustness of neural networks, with state-of-the-art tools being able to handle nets with thousands of neurons.

Assumptions

Our definition of global robustness is supported by the two properties of generative models discussed above in the context of global correctness, plus a third one:

3. generative models produce perturbations of realistic images.

To illustrate this we designed a second experiment called the random walk experiment: we perform a random walk on the space of random noises while observing the resulting sequence of images produced by the generative model. More specifically, we pick a random noise $x_0$, and define a sequence $(x_i)_{i \geq 0}$ of random noises with $x_{i+1}$ obtained from $x_i$ by adding a small random noise to $x_i$; this induces the sequence of images $(G(x_i))_{i \geq 0}$. The result is best visualised in an animated GIF (see the Github repository); see also the first 16 images in Figure 2. This supports the claim that images produced with similar random noises are (often) close to each other; in other words the generative model is (almost everywhere) continuous.

Our definition of global robustness is reminiscent of the probably approximately correct learning framework developed by Valiant (1984). It features two parameters. The first parameter, $\delta$, quantifies the probability that a generative model produces a realistic image. The second parameter, $\varepsilon$, measures the perturbations on the noise, which by the continuity property discussed above transfers to perturbations of the produced images.

Definition 3 (Global Robustness).
Given for each $c \in \mathcal{C}$ a generative model $G_c$ for images of category $c$, we say that the image classifier $C$ is $(\varepsilon, \delta)$-robust with respect to the generative models $(G_c)_{c \in \mathcal{C}}$ if for each $c \in \mathcal{C}$,

$$\mathbb{P}_{x \sim \mathcal{N}(0,1)}\left( \forall x', \ \|x - x'\| \leq \varepsilon \implies C \circ G_c(x') = c \right) \geq 1 - \delta.$$

In words, the probability that for a noise $x$, all $\varepsilon$-perturbations of $x$ generate (using $G_c$) images correctly classified (by $C$) is at least $1 - \delta$.

Figure 2: The random walk experiment

Algorithm

To check whether a classifier is $(\varepsilon, \delta)$-robust, we extend the previous ideas using Monte Carlo integration: we sample $n$ random noises $x_1, \ldots, x_n$, and count for how many $x_i$'s the following property holds:

$$\forall x', \quad \|x_i - x'\| \leq \varepsilon \implies C \circ G_c(x') = c.$$

The central limit theorem states that the ratio of positives over $n$ converges to

$$\mathbb{P}_{x \sim \mathcal{N}(0,1)}\left( \forall x', \ \|x - x'\| \leq \varepsilon \implies C \circ G_c(x') = c \right)$$

at rate $1/\sqrt{n}$. As before, it follows that $n = 10^4$ samples gives a $10^{-2}$ precision on this number.

In other words, checking global robustness reduces to combining Monte Carlo integration with checking local robustness.

4 Experiments

The code for all experiments can be found on the Github repository https://github.com/mohitiitb/NeuralNetworkVerification_GlobalRobustness.

All experiments are presented in Jupyter notebook format with pre-trained models to be easily reproduced. Our experiments are all reproduced on the drop-in Fashion-MNIST dataset (Xiao et al. (2017)), obtaining similar results.

We report on experiments designed to assess the benefit of these two notions, whose common denominator is to go from a local property to a global one by composing with a generative model.

We first evaluate the global correctness of several image classifiers, showing that it provides a finer way of evaluating them than the usual test set.
We then turn to global robustness and show how the negation of robustness can be witnessed by realistic adversarial examples.

The second set of experiments addresses the fact that both global correctness and robustness notions depend on the choice of a generative model. We show that this dependence can be made small, but that it can also be used for refining the correctness and robustness notions.

Choice of networks

In all the experiments, our base image classifiers have 3 hidden layers of increasing capacities: the first one, referred to as "small", has layers with (32, 64, 200) nodes, "medium" corresponds to (64, 128, 256), and "large" to (64, 128, 512). The generative models are as described above, with 4 hidden layers of respectively 256, 512, 1024, and 784 nodes.

For each of these three architectures we either use the standard MNIST training set (6,000 images of each digit), or an augmented training set (24,000 images), obtained by rotations, shear, and shifts. The same distinction applies to GANs: the "simple GAN" uses the standard training set, and the "augmented GAN" the augmented training set.

Finally, we work with two networks obtained through robust training procedures. The first one was proposed by Mądry et al. (2018) for the MNIST Adversarial Example Challenge (the goal of the challenge was to find adversarial examples, see below), and the second one was defined by Papernot et al. (2016) through the process of defensive distillation.

Evaluating Global Correctness

We evaluated the global correctness of all the image classifiers mentioned above against simple and augmented GANs, and reported the results in the table below. The last column is the usual validation procedure, meaning the percentage of correct classifications on the MNIST test set of 10,000 images. They all perform very well, and close to perfectly (above 99%), against this metric, hence cannot be distinguished.
Yet the composition with a generative model reveals that their performances outside of the test set are actually different. It is instructive to study the outliers for each image classifier, i.e. the generated images which are incorrectly classified. We refer to the Github repository for more experimental results along these lines.

Classifier                  simple GAN   augmented GAN   test set
(all values in %)
Standard training set
  small                     98.89        92.82           99.79
  medium                    99.15        93.16           99.76
  large                     99.38        93.80           99.80
Augmented training set
  small                     97.84        95.2            99.90
  medium                    99.11        96.53           99.86
  large                     99.25        97.66           99.84
Robust training procedures
  Mądry et al. (2018)       98.87        93.17           99.6
  Papernot et al. (2016)    99.64        94.78           99.17

Finding Realistic Adversarial Examples

Checking the global robustness of an image classifier is out of reach for state-of-the-art verification tools. Indeed, a single robustness check on a medium size net takes somewhere between dozens of seconds and a few minutes, and to get a decent approximation we need to perform tens of thousands of local robustness checks. Hence with considerable computational efforts we could analyse one image classifier, but could not perform a wider comparison of different training procedures and influence on different aspects.

Thus our experiments focus on the negation of robustness, which is finding realistic adversarial examples, that we define now.

Definition 4 (Realistic Adversarial Example). An $\varepsilon$-realistic adversarial example for an image classifier $C$ with respect to a generative model $G$ is an image $G(x)$ such that there exists another image $G(x')$ with

$$\|x - x'\| \leq \varepsilon \quad \text{and} \quad C \circ G(x) \neq C \circ G(x').$$

In words, $x$ and $x'$ are two $\varepsilon$-close random noises which generate images $G(x)$ and $G(x')$ that are classified differently by $C$.

Note that a realistic adversarial example is not necessarily an adversarial example: the images $G(x)$ and $G(x')$ may differ by more than $\varepsilon$.
However, by assumption 3 discussed when defining global robustness, if $x$ and $x'$ are close, then typically $G(x)$ and $G(x')$ are two closely resembling images, so the two notions are indeed close.

We introduce two algorithms for finding realistic adversarial examples, which are directly inspired by algorithms developed for finding adversarial examples. The key difference is that realistic adversarial examples are searched by analysing the composed network $C \circ G$.

Let us consider two digits, for the sake of explanation, 3 and 8. We have a generative model $G_8$ generating images of 8 and an image classifier $C$.

The first algorithm is a black-box attack, meaning that it does not have access to the inner structure of the networks and it can only simulate them. It consists in sampling random noises, and performing a local search for a few steps. From a random noise $x$, we inspect the random noise $x + \delta$ for a few small random noises $\delta$, and choose the random noise $x'$ maximising the score of 3 by the net $C \circ G_8$, written $C \circ G_8(x_i)[3]$ in the pseudocode given in Algorithm 1. The algorithm is repeatedly run until a realistic adversarial example is found.

Algorithm 1: The black-box attack for the digits 3 and 8.
  Data: A generative model $G_8$ and an image classifier $C$. A parameter $\varepsilon > 0$.
  $N_{\text{step}} \leftarrow 16$ (number of steps)
  $N_{\text{dir}} \leftarrow 10$ (number of directions)
  $x_0 \sim \mathcal{N}(0, 1)$
  for $i = 0$ to $N_{\text{step}} - 1$ do
    $s_{\max} \leftarrow C \circ G_8(x_i)[3]$ (score of 3)
    $x_{i+1} \leftarrow x_i$
    for $j = 0$ to $N_{\text{dir}} - 1$ do
      $\delta_j \sim \mathcal{N}(0, \varepsilon / N_{\text{step}})$
      $s \leftarrow C \circ G_8(x_i + \delta_j)[3]$
      if $s > s_{\max}$ then
        $s_{\max} \leftarrow s$
        $x_{i+1} \leftarrow x_i + \delta_j$
    if $C \circ G_8(x_0) \neq C \circ G_8(x_{i+1})$ then
      return $x_0$ ($\varepsilon$-realistic adversarial example)

The second algorithm is a white-box attack, meaning that it uses the inner structure of the networks. It is similar to the previous one, except that the local search is replaced by a gradient ascent to maximise the score of 3 by the net $C \circ G_8$.
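The local search of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative reading under several assumptions, not the repository's implementation: `toy_scores` is a hypothetical stand-in for the composed network $C \circ G_8$, the quantity $\varepsilon / N_{\text{step}}$ is interpreted as a standard deviation, and the explicit infinity-norm check before returning is an addition of this sketch (Gaussian steps are unbounded, so the pseudocode alone does not enforce $\|x_0 - x_{i+1}\| \leq \varepsilon$).

```python
import math
import random


def inf_norm(u, v):
    """Infinity-norm distance between two noise vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def predict(scores, noise):
    """Predicted category: the one with the highest score."""
    s = scores(noise)
    return max(s, key=s.get)


def black_box_attack(scores, noise_dim, target, eps, rng, n_step=16, n_dir=10):
    """One run of the random local search; returns (x0, x_adv) or None."""
    sigma = eps / n_step  # assumption: eps / N_step is used as a std deviation
    x0 = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    x = x0
    for _ in range(n_step):
        best, s_max = x, scores(x)[target]
        for _ in range(n_dir):
            candidate = [xi + rng.gauss(0.0, sigma) for xi in x]
            s = scores(candidate)[target]
            if s > s_max:  # keep the direction that raises the target score most
                best, s_max = candidate, s
        x = best
        # Extra safeguard (not in Algorithm 1): only accept eps-close noises.
        if predict(scores, x) != predict(scores, x0) and inf_norm(x, x0) <= eps:
            return x0, x
    return None


def toy_scores(noise):
    """Hypothetical stand-in for C∘G_8: scores for the digits 3 and 8."""
    s3 = 1.0 / (1.0 + math.exp(-4.0 * (noise[0] - 1.0)))
    return {3: s3, 8: 1.0 - s3}


rng = random.Random(0)
result = None
for _ in range(1000):  # "repeatedly run until an example is found"
    result = black_box_attack(toy_scores, noise_dim=2, target=3, eps=0.5, rng=rng)
    if result is not None:
        break
```

The attack only calls `scores`, never its internals, which is what makes it black-box; swapping `toy_scores` for a function that evaluates a trained generator followed by a classifier would leave the search loop unchanged.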
In other words, instead of choosing a direction at random, it follows the gradient to maximise the score. It is reminiscent of the projected gradient descent (PGD) attack, but performed on the composed network. The pseudocode is given in Algorithm 2.

Algorithm 2: The white-box attack for the digits 3 and 8.
  Data: A generative model $G_8$ and an image classifier $C$. A parameter $\varepsilon > 0$.
  $N_{\text{step}} \leftarrow 16$ (number of steps)
  $\alpha \leftarrow \varepsilon / N_{\text{step}}$ (step)
  $x_0 \sim \mathcal{N}(0, 1)$
  for $i = 0$ to $N_{\text{step}} - 1$ do
    $x_{i+1} \leftarrow x_i + \alpha \cdot \mathrm{Grad}\, C \circ G_8(x_i)[3]$
    if $C \circ G_8(x_0) \neq C \circ G_8(x_{i+1})$ then
      return $x_0$ ($\varepsilon$-realistic adversarial example)

Both attacks successfully find realistic adversarial examples in less than a minute. The adjective "realistic", which is subjective, is justified as follows: most attacks constructing adversarial examples create unrealistic images by adding noise or modifying pixels, while with our definition the realistic adversarial examples are images produced by the generative model, hence potentially more realistic. See Figure 3 for some examples.

On the Dependence on the Generative Model

Both global correctness and robustness notions are defined with respect to a generative model. This raises a question: how much do they depend on the choice of the generative model?

To answer this question we trained two GANs using the exact same training procedure but with two disjoint training sets, and used the two GANs to evaluate several image classifiers. The outcome is that the two GANs yield essentially the same results against all image classifiers. This suggests that the global correctness indeed does not depend dramatically on the choice of the generative model, provided that it is reasonably good and well-trained. We refer to the Github repository for a complete exposition of the results.
Since the training set of the MNIST dataset contains 6,000 images of each digit, splitting it in two would not yield two large enough training sets. Hence we used the extended MNIST (EMNIST) dataset (Cohen et al. (2017)), which provided us with (roughly) 34,000 images of each digit, hence two disjoint datasets of about 17,000 images.

On the Influence of Data Augmentation

Data augmentation is a classical technique for increasing the size of a training set; it consists in creating new training data by applying a set of mild transformations to the existing training set. In the case of digit images, common transformations include rotations, shear, and shifts.

Unsurprisingly, crossing the two training sets, e.g. using the standard training set for the image classifier and an augmented one for the generative model, yields worse results than when using the same training set. More interestingly, the robust networks (Mądry et al. (2018); Papernot et al. (2016)), which are trained using an improved procedure but based on the standard training set, perform well against generative models trained on the augmented training set. In other words, one outcome of the improved training procedure is to better capture the natural image transformations, even if they were never used in training.

5 Conclusions

We defined two notions: global correctness and global robustness, based on generative models, aiming at quantifying the usability of an image classifier. We performed some experiments on the MNIST dataset to understand the merits and limits of our definitions. An important challenge lies ahead: to make the verification of global robustness doable in a reasonable amount of time and computational effort.

Figure 3: Examples of realistic adversarial examples. On the left hand side, against the smallest net, and on the right hand side, against Mądry et al. (2018)

Bibliography

Dan Ciregan, Ueli Meier, and Juergen Schmidhuber. Multi-column deep neural networks for image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3642–3649, June 2012. doi: 10.1109/CVPR.2012.6248110. URL https://ieeexplore.ieee.org/document/6248110.

Gregory Cohen, Saeed Afshar, Jonathan Tapson, and André van Schaik. EMNIST: an extension of MNIST to handwritten letters. CoRR, abs/1702.05373, 2017.

Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin T. Vechev. AI2: safety and robustness certification of neural networks with abstract interpretation. In IEEE Symposium on Security and Privacy (SP), pages 3–18, 2018. doi: 10.1109/SP.2018.00058. URL https://doi.org/10.1109/SP.2018.00058.

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. Generative adversarial nets. In Conference on Neural Information Processing Systems (NIPS), pages 2672–2680, 2014. URL http://papers.nips.cc/paper/5423-generative-adversarial-nets.

Divya Gopinath, Guy Katz, Corina S. Pasareanu, and Clark Barrett. Deepsafe: A data-driven approach for assessing robustness of neural networks. In Symposium on Automated Technology for Verification and Analysis (ATVA), pages 3–19, 2018. doi: 10.1007/978-3-030-01090-4_1. URL https://doi.org/10.1007/978-3-030-01090-4_1.

Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. Safety verification of deep neural networks. In Computer-Aided Verification (CAV), pages 3–29, 2017. doi: 10.1007/978-3-319-63387-9_1. URL https://doi.org/10.1007/978-3-319-63387-9_1.

Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. CoRR, abs/1812.04948, 2018. URL http://arxiv.org/abs/1812.04948.

Guy Katz, Clark W. Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In Computer-Aided Verification (CAV), pages 97–117, 2017. doi: 10.1007/978-3-319-63387-9_5. URL https://doi.org/10.1007/978-3-319-63387-9_5.

Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljic, David L. Dill, Mykel J. Kochenderfer, and Clark W. Barrett. The marabou framework for verification and analysis of deep neural networks. In Computer-Aided Verification (CAV), pages 443–452, 2019. doi: 10.1007/978-3-030-25540-4_26. URL https://doi.org/10.1007/978-3-030-25540-4_26.

Aleksander Mądry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (ICLR), 2018. URL https://openreview.net/forum?id=rJzIBfZAb.

Matthew Mirman, Timon Gehr, and Martin T. Vechev. Differentiable abstract interpretation for provably robust neural networks. In International Conference on Machine Learning (ICML), pages 3575–3583, 2018. URL http://proceedings.mlr.press/v80/mirman18b.html.

Nicolas Papernot, Patrick D. McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE Symposium on Security and Privacy (SP), pages 582–597, 2016. doi: 10.1109/SP.2016.41. URL https://doi.org/10.1109/SP.2016.41.

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014. URL http://arxiv.org/abs/1312.6199.

Leslie G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984. doi: 10.1145/1968.1972. URL https://doi.org/10.1145/1968.1972.

Li Wan, Matthew Zeiler, Sixin Zhang, Yann Le Cun, and Rob Fergus. Regularization of neural networks using dropconnect. In International Conference on Machine Learning (ICML), volume 28, pages 1058–1066, 2013. URL http://proceedings.mlr.press/v28/wan13.html.

Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane S. Boning, and Inderjit S. Dhillon. Towards fast computation of certified robustness for relu networks. In International Conference on Machine Learning (ICML), pages 5273–5282, 2018. URL http://proceedings.mlr.press/v80/weng18a.html.

Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, and Dawn Song. Generating adversarial examples with adversarial networks. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI), pages 3905–3911, 2018. doi: 10.24963/ijcai.2018/543. URL https://doi.org/10.24963/ijcai.2018/543.

Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. CoRR, abs/1708.07747, 2017.
