IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness



Ziye Geng¹, Guang Yang², Yihang Chen¹, and Changqing Luo¹
¹University of Houston, ²Virginia Commonwealth University
{zgeng2, ychen165, cluo3}@uh.edu, yangg2@vcu.edu

Abstract

We propose IrisFP, a novel adversarial-example-based model fingerprinting framework that enhances both uniqueness and robustness by leveraging multi-boundary characteristics, multi-sample behaviors, and fingerprint discriminative power assessment to generate composite-sample fingerprints. Three key innovations make IrisFP outstanding: 1) It positions fingerprints near the intersection of all decision boundaries—unlike prior methods that target a single boundary—thus increasing the prediction margin without placing fingerprints deep inside target class regions, enhancing both robustness and uniqueness; 2) It constructs composite-sample fingerprints, each comprising multiple samples close to the multi-boundary intersection, to exploit collective behavior patterns and further boost uniqueness; and 3) It assesses the discriminative power of generated fingerprints using statistical separability metrics developed on two reference model sets, for pirated and independently-trained models respectively, retains the fingerprints with high discriminative power, and assigns fingerprint-specific thresholds to the retained fingerprints. Extensive experiments show that IrisFP consistently outperforms state-of-the-art methods, achieving reliable ownership verification by enhancing both robustness and uniqueness.

1. Introduction

Adversarial-example-based model fingerprinting has recently gained attention as a promising approach to protecting the intellectual property (IP) of deep neural networks (DNNs) developed for many tasks, such as image classification.
This type of model fingerprinting builds on the adversarial example technique to generate input-output pairs as fingerprints that exhibit distinctive model-specific behavior. More specifically, such a fingerprinting approach introduces subtle perturbations to clean inputs to elicit unique model-specific responses from a protected model: the protected model produces outputs that deviate from those induced by the original clean inputs, whereas independently-trained models tend to produce outputs consistent with the clean inputs and significantly different from those of the protected model [3, 15, 21, 23, 25, 38]. These crafted fingerprints are then utilized for ownership verification [5, 37]. So far, researchers have developed many adversarial-example-based model fingerprinting methods [3, 6, 19, 20, 23, 30, 32, 33], which typically generate individual fingerprints near a single decision boundary, without considering their relative distances to other boundaries. For example, IPGuard [3] crafts fingerprints close to the decision boundaries of the protected model to capture model-specific behaviors; UAP [23] leverages universal perturbations to induce model-dependent outputs over a broad set of inputs; ADV-TRA [30] constructs adversarial trajectories composed of samples that traverse boundary regions to capture rich model characteristics; and AKH [6] constructs fingerprints by generating adversarial examples that originate from naturally misclassified samples. However, adversarial-example-based model fingerprinting is inherently prone to model modification attacks such as fine-tuning and pruning, as these attacks can shift the decision boundaries of protected models, consequently circumventing ownership verification [4, 9, 18].
To address this issue, prior methods propose placing fingerprints deep inside the regions of target classes—far away from decision boundaries—to enhance robustness against model modifications [3, 20]. Nevertheless, this strategy may in turn undermine uniqueness, as such fingerprints are less sensitive to model-specific decision boundaries. Consequently, existing methods have weak uniqueness or weak robustness, or even both. This observation is corroborated by our experimental results in Figure 1. The experiments, performed on CIFAR-10 and Fashion-MNIST, assess uniqueness by measuring true negative rates (TNRs) under varying false negative rate (FNR) levels, and robustness by measuring true positive rates (TPRs) under different false positive rate (FPR) levels. A higher TNR indicates better uniqueness in distinguishing the protected model from independently-trained ones, while a higher TPR indicates stronger robustness to model modifications. The results show that conventional methods achieve either weak uniqueness or weak robustness, which highlights the inherent difficulty of generating effective fingerprints for ownership verification.

Figure 1. The uniqueness and robustness achieved by prior adversarial-example-based fingerprinting methods (IPGuard, UAP, ADV-TRA): (a)-(b) TNR at 1%-10% FNR; (c)-(d) TPR at 1%-10% FPR.

Main question. Based on the discussion so far, we pose the following research question: Is it possible to design a new adversarial-example-based model fingerprinting method that enhances both uniqueness and robustness?

1.1. Our Answer

In this paper, we give an affirmative answer to the above question.
Recent studies [1, 2] show that, compared to fingerprints positioned on a single decision boundary, those placed at the intersection of multi-class decision boundaries have high sensitivity, i.e., they achieve enhanced uniqueness. By leveraging this property, IBSF [1] and SDBF [2] were developed to detect model tampering. While fingerprints placed at the intersection of multi-class decision boundaries can enhance uniqueness, they are extremely vulnerable to model modifications, i.e., they struggle with weak robustness. Therefore, our question becomes: Can we exploit the decision boundary intersection to design a new fingerprinting method with enhanced uniqueness and robustness?

To positively answer this question, we first provide an intuitive illustration in Figure 2 to show that proximity to all decision boundaries is crucial for enhancing robustness. As illustrated in Figure 2(a), sample s1 lies within the region of class c2, positioned near the boundary between c2 and c4, but far from the other regions, while s2, which also belongs to c2, is located near the intersection of multiple boundaries, making it simultaneously close to all surrounding class regions. The corresponding output distributions for s1 and s2, shown in Figure 2(b) and Figure 2(c), reveal that s2 has a larger prediction margin—that is, a greater confidence gap between the top two predicted classes. This increased margin enhances robustness at no cost to uniqueness. This key observation motivates our design: to construct fingerprints that are not only model-specific but also strategically placed near the multi-boundary intersection to improve uniqueness and robustness.

Figure 2. (a): The placement of input samples s1 (near a single decision boundary) and s2 (near the multi-class decision boundary intersection) in a region; (b): The prediction margin pm1 of fingerprint s1; and (c): The prediction margin pm2 of fingerprint s2.

Our Design: We propose IrisFP, a novel model fingerprinting framework that generates unIque and robust composite-sample FingerPrints (IrisFP) by exploiting multi-boundary characteristics, multi-sample behaviors, and fingerprint quality assessment. Unlike conventional methods that place each fingerprint near a single decision boundary, IrisFP first generates fingerprint seeds by crafting adversarial samples that reside near the intersection of all decision boundaries of a protected model. This strategy increases the prediction margin without placing the sample deep inside the target class region, thus enhancing both robustness and uniqueness. To further boost uniqueness, IrisFP applies subtle, diverse perturbations to each fingerprint seed, generating multiple variants which, together with the seed, collectively form a composite-sample fingerprint. Compared to the original sample, the seed and its variants are crafted to elicit different responses from the protected model, while tending to produce consistent predictions on independently-trained ones, resulting in a clear behavioral gap that enhances the fingerprint's discriminative capability. Moreover, IrisFP addresses a key limitation of prior adversarial-example-based fingerprinting methods—their lack of consideration for model modifications and independently-trained models during fingerprint construction. To overcome this, IrisFP employs a fingerprint refinement strategy that evaluates the discriminative power of each fingerprint using statistical separability metrics, retains the fingerprints with high discriminative power, and assigns sample-specific thresholds to the retained fingerprints.
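As a concrete illustration of the prediction margin discussed above (the confidence gap between the top two predicted classes), the following sketch contrasts an s1-style output, whose residual probability mass concentrates on one neighboring class, with an s2-style output near a multi-boundary intersection. The probability vectors are hypothetical and only meant to mirror Figure 2:

```python
import numpy as np

def prediction_margin(probs):
    """Confidence gap between the top two predicted classes."""
    top2 = np.sort(probs)[-2:]
    return float(top2[1] - top2[0])

# s1-style output: residual mass concentrated on one neighboring class (c4)
p_s1 = np.array([0.05, 0.50, 0.03, 0.40, 0.02])
# s2-style output: same top confidence, residual mass spread over all classes
p_s2 = np.array([0.125, 0.50, 0.125, 0.125, 0.125])

print(prediction_margin(p_s1))   # small margin: 0.50 - 0.40
print(prediction_margin(p_s2))   # larger margin: 0.50 - 0.125
```

Both outputs assign the same top confidence to c2, yet the second enjoys a much larger margin, which is exactly the slack that model modifications must overcome to flip the prediction.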
Finally, IrisFP performs model ownership verification through a two-step process involving ownership matching and decision aggregation, specifically designed for composite-sample fingerprints.

2. Preliminary

2.1. Model Fingerprinting

An adversarial-example-based fingerprinting approach typically allows a model owner to craft inputs that elicit unique model-specific responses.

Figure 3. The overview of IrisFP, comprising fingerprint generation (Phase I: initializing fingerprint seeds; Phase II: generating composite-sample fingerprints; Phase III: refining fingerprints) and ownership verification.

Given a protected model f and a clean input x_i with ground-truth label y_i, a small perturbation δ_i, which is constrained to capture model-specific boundary characteristics, is added to produce a perturbed input x̂_i = x_i + δ_i, such that f outputs a target label ŷ_i ≠ y_i, resulting in fingerprint (x̂_i, ŷ_i). We mathematically formulate this process by

min ‖x_i − x̂_i‖, s.t. f(x̂_i) ≠ y_i, (1)

where ‖·‖ represents a distance metric (e.g., the ℓ2 or ℓ∞ norm). This process can generate a set of fingerprints whose inputs are typically located near the decision boundaries of model f, making the fingerprints' behavior highly model-specific. The resulting fingerprints are used to query a target model f_t, and the query results are compared to the expected outputs to determine ownership.

2.2. Model Modification Attacks

In real-world scenarios, an adversary may alter a protected model f through model modification techniques to create a pirated variant f_p in order to evade ownership verification. Common modification techniques include fine-tuning (FT) [17, 41], pruning (PR) [8, 16], adversarial training (AT) [24, 40], and knowledge distillation (KD) [27]. In practice, adversaries may also combine these techniques—such as injecting noise or pruning before fine-tuning—to amplify the degree of modification. Such alterations can shift the model's decision boundaries, potentially invalidating existing fingerprints and thereby enabling evasion of verification [42]. For example, given a fingerprint (x̂_i, ŷ_i) for the protected model f, querying f_p with x̂_i might yield an output differing from the expected one, i.e., f_p(x̂_i) ≠ ŷ_i = f(x̂_i), thus undermining the reliability of ownership verification.

3. Methodology

3.1. Overview

We propose IrisFP, a novel model fingerprinting framework that enhances both uniqueness and robustness. As shown in Figure 3, IrisFP comprises two main processes: fingerprint generation and ownership verification. The fingerprint generation process includes three phases: fingerprint seed initialization, composite-sample fingerprint generation, and fingerprint set refinement.
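The crafting objective in Eq. (1) is typically approximated by iterative gradient steps that push the input just past a decision boundary and stop as soon as the label flips. The following is a minimal sketch on a toy softmax-linear stand-in for the protected model f; the toy model and all names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a protected model f: softmax over linear logits, C classes.
C, d = 5, 8
W, b = rng.normal(size=(C, d)), rng.normal(size=C)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def f_out(x):
    """Softmax output of the toy model."""
    return softmax(W @ x + b)

def craft_fingerprint(x, y, step=0.05, max_iters=500):
    """Take small gradient steps that lower the probability of class y and
    stop as soon as the predicted label flips, keeping the perturbation
    small in the spirit of Eq. (1)."""
    x_hat = x.copy()
    for _ in range(max_iters):
        p = f_out(x_hat)
        if int(p.argmax()) != y:          # constraint f(x_hat) != y satisfied
            break
        # For this softmax-linear model, d(log p_y)/dx = W[y] - p @ W.
        x_hat -= step * (W[y] - p @ W)    # move against class y
    return x_hat

x = rng.normal(size=d)
y = int(f_out(x).argmax())                # label of the clean input
x_hat = craft_fingerprint(x, y)
print(int(f_out(x_hat).argmax()) != y)    # the fingerprint's label differs
```

Early stopping at the first label flip is what keeps ‖x − x̂‖ small; a real attack on a deep network would use autograd for the input gradient instead of the closed form above.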
The first phase places fingerprint seeds near the intersection of all decision boundaries to maximize prediction margins while keeping model-specific behavior intact. The second phase then minimally perturbs each seed to produce a set of variants with diverse outputs that, along with the seed, form a composite-sample fingerprint. The third phase further exploits statistical separability to select the most discriminative fingerprints from the generated ones to construct a fingerprint set, and assigns a specific threshold to each selected fingerprint.

The ownership verification process involves two steps: ownership matching and decision aggregation. In the first step, the matching rate across samples of each composite-sample fingerprint is computed and compared with its corresponding threshold to determine whether the fingerprint is matched. In the second step, the matching outcomes across all fingerprints are aggregated to make a final decision. If the proportion of matched fingerprints exceeds a predefined threshold, the target model is deemed pirated; otherwise, it is identified as independently-trained.

3.2. The Fingerprint Generation Process

3.2.1. Phase I: Initializing Fingerprint Seeds

Our proposed model fingerprinting framework starts by generating initial fingerprint seeds. Given a protected model f trained on a dataset D with C classes, we first randomly select N_f input-label pairs {(x_i^0, y_i^0)}_{i=1}^{N_f} from D, where x_i^0 is an input with ground-truth label y_i^0. For each x_i^0, our adversarial-example-based model fingerprinting method introduces a trainable perturbation δ_i^0 to produce a fingerprint seed x̂_i^0 = x_i^0 + δ_i^0. Unlike traditional fingerprinting methods that directly push a fingerprint toward a single decision boundary associated with a specific target class, our method instead guides x̂_i^0 to be located near the intersection of f's decision boundaries.
This design enhances robustness: positioning samples close to such an intersection enables f to predict the target class ŷ_i^0 with high confidence, while distributing the remaining probability almost evenly across the other classes. Thus, model-specific output behavior is preserved, and the prediction margin is increased, improving robustness to modifications without sacrificing uniqueness.

To obtain such seeds, we define a biased probability distribution p_i ∈ R^C for each sample with respect to a randomly chosen target class ŷ_i^0, with p_i(y) denoting the probability assigned to class y:

p_i(y) = 1/C + τ, if y = ŷ_i^0; (1 − (1/C + τ)) / (C − 1), otherwise, (2)

where 0 < τ < 1 is a tunable hyperparameter controlling the degree of bias toward ŷ_i^0. Then, we define f_o(x) as the softmax output of the model f, which produces a probability distribution over the C classes. The fingerprint seed x̂_i^0 is then optimized so that the model's output distribution f_o(x̂_i^0) aligns with the predefined p_i. This is achieved by minimizing the Kullback-Leibler (KL) divergence between f_o(x̂_i^0) and p_i, with an L1 regularization term applied to keep the perturbation small. The loss function is given by L_phase1 = KL(f_o(x̂_i^0) ‖ p_i) + λ_1 ‖δ_i^0‖_1, where λ_1 is a regularization coefficient. This optimization process yields a set of fingerprint seeds {(x̂_i^0, ŷ_i^0)}_{i=1}^{N_f}, each positioned near the intersection of multi-class decision boundaries.

3.2.2. Phase II: Generating Composite-sample Fingerprints

To enhance uniqueness, we extend each fingerprint seed into a composite-sample fingerprint comprising multiple minimally perturbed variants designed to elicit diverse responses from the protected model. Specifically, each fingerprint seed x̂_i^0 is perturbed by a set of small, trainable perturbations {δ_i^1, δ_i^2, ..., δ_i^T} to generate a set of variants {x̂_i^1, x̂_i^2, ..., x̂_i^T}, where x̂_i^t = x̂_i^0 + δ_i^t for 1 ≤ t ≤ T. Each variant is assigned a target class uniformly sampled from {1, 2, ..., C}. To position each variant near the intersection of multiple decision boundaries, we define a biased target probability distribution p_i^t for each variant toward a randomly chosen target class ŷ_i^t, where p_i^t(y) is the probability assigned to class y:

p_i^t(y) = 1/C + τ, if y = ŷ_i^t; (1 − (1/C + τ)) / (C − 1), otherwise. (3)

The perturbations {δ_i^1, δ_i^2, ..., δ_i^T} are jointly optimized by minimizing the average KL divergence between the model's output distribution and the biased targets. The loss function is formulated as L_phase2 = (1/T) Σ_{t=1}^{T} [KL(f_o(x̂_i^t) ‖ p_i^t) + λ_2 ‖δ_i^t‖_1], where λ_2 is a regularization coefficient. This process yields a fingerprint set with N_f composite-sample fingerprints {T_i}_{i=1}^{N_f}, where T_i = {(x̂_i^0, ŷ_i^0), (x̂_i^1, ŷ_i^1), ..., (x̂_i^T, ŷ_i^T)}.

By applying minimal, model-specific perturbations to each seed, IrisFP captures subtle variations in the model's decision landscape. These perturbations are optimized to induce diverse responses, leveraging the fact that adversarial perturbations are highly sensitive to a model's decision boundaries. Since independently-trained models—even those with architectural similarity—often exhibit different decision boundaries, their outputs across all samples in a fingerprint are typically inconsistent with those of the protected model, failing to replicate the behavioral pattern elicited by the protected model. This behavioral discrepancy enhances the discriminative capability of composite-sample fingerprints.
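The biased-target construction of Eqs. (2)-(3) and the KL-plus-L1 objective can be sketched on a toy softmax-linear model. The analytic gradient below holds only for this toy model (a deep network would use autograd), and all names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
C, d, tau = 5, 8, 0.3          # tau biases the target distribution (0 < tau < 1)

# Toy softmax-linear stand-in for the protected model f.
W, b = rng.normal(size=(C, d)), rng.normal(size=C)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def biased_target(target_class):
    """Eqs. (2)/(3): mass 1/C + tau on the target class, the remainder
    spread evenly over the other C - 1 classes."""
    p = np.full(C, (1.0 - (1.0 / C + tau)) / (C - 1))
    p[target_class] = 1.0 / C + tau
    return p

def loss_and_grad(x0, delta, p_target, lam=1e-3):
    """KL(f_o(x0 + delta) || p_target) + lam * ||delta||_1 and its gradient
    w.r.t. delta (analytic only for this toy model)."""
    q = softmax(W @ (x0 + delta) + b)
    g = np.log(q) - np.log(p_target)
    kl = float(q @ g)
    grad_z = q * (g - kl)                 # gradient of the KL term in logits
    grad = W.T @ grad_z + lam * np.sign(delta)
    return kl + lam * np.abs(delta).sum(), grad

x0, target = rng.normal(size=d), 2
p_t = biased_target(target)
delta = np.zeros(d)
for _ in range(3000):                     # plain gradient descent on delta
    _, grad = loss_and_grad(x0, delta, p_t)
    delta -= 0.02 * grad

q_final = softmax(W @ (x0 + delta) + b)
print(int(q_final.argmax()) == target)    # seed now favors the target class
```

Phase II would repeat the same optimization from the converged seed x0 + delta, once per variant with a freshly sampled target class, averaging the per-variant losses as in L_phase2.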
3.2.3. Phase III: Refining Fingerprints

Some of the generated composite-sample fingerprints may not exhibit strong and consistent discriminative behaviors, inevitably degrading ownership verification performance. Therefore, we propose a fingerprint refinement strategy comprising three steps: calculating the matching rate, selecting composite-sample fingerprints, and computing fingerprint-specific thresholds. Before describing these steps, we define two reference model sets used throughout the refinement process:

• Pirated Model Set (V_f): Models derived from the protected model via model modification techniques.
• Independent Model Set (I_f): Models trained independently from scratch using different initializations, architectures, or training data.

Both sets are used to evaluate whether a composite-sample fingerprint elicits distinguishable responses between pirated and independently-trained models.

Step 1: Calculating the matching rate. To support fingerprint refinement, we first compute the fingerprint-level matching rate MR(f, T_i) for each composite-sample fingerprint T_i across models in both V_f and I_f. Given T_i = {(x̂_i^0, ŷ_i^0), (x̂_i^1, ŷ_i^1), ..., (x̂_i^T, ŷ_i^T)} and a model f_j in V_f or I_f, the fingerprint-level matching rate is defined as MR(f_j, T_i) = (1/(T+1)) Σ_{t=0}^{T} I[f_j(x̂_i^t) = ŷ_i^t], where I[·] is an indicator function that returns 1 if f_j with input x̂_i^t outputs ŷ_i^t, and 0 otherwise. This metric quantifies the degree to which the model's behavior aligns with that of the protected model. After computing the matching rate for each fingerprint-model pair, we aggregate these results across the models to compute statistics that describe each fingerprint's matching behavior on the pirated model set V_f and the independent model set I_f.
For a given composite-sample fingerprint T_i, we compute the mean matching rate μ_i^V and standard deviation σ_i^V across all models in V_f, and likewise obtain μ_i^I and σ_i^I for I_f. These statistics are subsequently used to select fingerprints and compute the fingerprint-specific thresholds.

Step 2: Selecting composite-sample fingerprints. To refine the fingerprint set, we assess the discriminative capability of each fingerprint by quantifying the behavioral separation it induces between V_f and I_f. Thus, we compute Cohen's d effect size for each T_i by

d_i = (μ_i^V − μ_i^I) / sqrt((1/2)((σ_i^V)² + (σ_i^I)²)).

Cohen's d effect size quantifies the standardized separation between the two matching-rate distributions of pirated and independently-trained models, capturing both the magnitude of the behavioral difference and the statistical stability of that difference. A larger d_i indicates that a fingerprint T_i elicits more divergent behaviors between the pirated and independent model sets, while maintaining higher consistency within each group, making it a stronger candidate for reliable ownership verification. Based on d_i, we select the top-K fingerprints with the highest d_i values to construct a final fingerprint set T* = {T_i*}_{i=1}^{K}.

Step 3: Computing the fingerprint-specific threshold. To account for the varying discriminative power of individual fingerprints, we develop a fingerprint-specific thresholding strategy that assigns a specific threshold to each fingerprint. This design ensures that each fingerprint enforces a verification criterion consistent with its statistical separability between pirated and independently-trained models. The threshold defines the minimum matching rate required to assert that a target model f_t is pirated. Each threshold θ_i is computed based on the matching rate statistics of the fingerprint T_i* across V_f and I_f.
More specifically, θ_i is calculated as a weighted average of the two means, with weights inversely proportional to their standard deviations:

θ_i = (μ_i^V/σ_i^V + μ_i^I/σ_i^I) / (1/σ_i^V + 1/σ_i^I), if σ_i^V, σ_i^I > 0; (μ_i^V + μ_i^I)/2, otherwise. (4)

This thresholding strategy assigns each fingerprint an individually optimized threshold, jointly considering the mean and standard deviation of the matching-rate distributions to achieve maximal discriminative capability.

3.3. The Ownership Verification Process

To verify the ownership of a target model f_t, we assess how it responds to all samples in each composite-sample fingerprint within the fingerprint set T*. To this end, we design a two-step ownership verification scheme comprising ownership matching and decision aggregation. The detailed design is presented below.

Step 1: Ownership matching. For each composite-sample fingerprint T_i* ∈ T*, its matching rate is computed by MR(f_t, T_i*) = (1/(T+1)) Σ_{t=0}^{T} I[f_t(x̂_i^t) = ŷ_i^t]. The resulting matching rate is then compared with the corresponding fingerprint-specific threshold θ_i, i.e., S_MR(f_t, T_i*) = I[MR(f_t, T_i*) ≥ θ_i], where a result of 1 indicates that the target model's behavior on T_i* is consistent with the protected model f, and 0 otherwise.

Step 2: Decision aggregation. After obtaining the matching decision for each composite-sample fingerprint, a final ownership decision is made. Since pirated models inherit decision behavior from the protected model f, they are expected to match the majority of fingerprints, whereas independently-trained models tend to diverge. We aggregate these decisions to produce a final verification score by VS(f_t, T*) = (1/K) Σ_{i=1}^{K} S_MR(f_t, T_i*), which quantifies the overall behavioral alignment of the target model f_t with f.
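The refinement statistics (matching rate, Cohen's d, the threshold of Eq. (4)) and the verification score VS reduce to a few lines of arithmetic. In the sketch below, the matching-rate values and the toy model callable are hypothetical illustrations, not measured results:

```python
import numpy as np

def matching_rate(model, fingerprint):
    """MR(f, T_i): fraction of the composite fingerprint's (input, label)
    pairs that the queried model reproduces."""
    return float(np.mean([model(x) == y for x, y in fingerprint]))

def cohens_d(mr_pirated, mr_independent):
    """Standardized separation between the two matching-rate samples."""
    mu_v, mu_i = np.mean(mr_pirated), np.mean(mr_independent)
    pooled = np.sqrt(0.5 * (np.var(mr_pirated) + np.var(mr_independent)))
    return float((mu_v - mu_i) / pooled)

def fingerprint_threshold(mr_pirated, mr_independent):
    """Eq. (4): inverse-std-weighted average of the two mean matching rates."""
    mu_v, s_v = np.mean(mr_pirated), np.std(mr_pirated)
    mu_i, s_i = np.mean(mr_independent), np.std(mr_independent)
    if s_v > 0 and s_i > 0:
        return float((mu_v / s_v + mu_i / s_i) / (1 / s_v + 1 / s_i))
    return float((mu_v + mu_i) / 2)

def verification_score(mrs_target, thetas):
    """VS(f_t, T*): fraction of retained fingerprints whose matching rate on
    the target model reaches its fingerprint-specific threshold."""
    return float(np.mean([mr >= th for mr, th in zip(mrs_target, thetas)]))

# Hypothetical matching rates of one fingerprint over the two reference sets
mr_v = np.array([0.9, 1.0, 0.8, 0.9])    # pirated reference set V_f
mr_i = np.array([0.1, 0.2, 0.0, 0.1])    # independent reference set I_f
theta = fingerprint_threshold(mr_v, mr_i)   # equal stds here, so theta = 0.5
score = verification_score([0.8, 0.4], [theta, 0.6])
print(theta, cohens_d(mr_v, mr_i), score)
```

The inverse-std weighting pulls the threshold toward whichever reference distribution is tighter; when the two spreads are equal, as in this toy data, Eq. (4) degenerates to the midpoint of the two means.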
The target model is classified as pirated if VS(f_t, T*) ≥ α, where α is a predefined decision threshold; otherwise, it is deemed independently-trained.

4. Experiments

4.1. Experimental Setting

To evaluate the effectiveness of IrisFP, we conduct comprehensive experiments on different DNN models trained on five widely-used datasets: CIFAR-10 [11], CIFAR-100 [12], Fashion-MNIST [28], MNIST [14], and Tiny-ImageNet [13]. We compare IrisFP with four representative fingerprinting methods: IPGuard [3], UAP [23], ADV-TRA [30], and AKH [6]. We also evaluate the robustness of IrisFP against model modification attacks by considering six types of attacks: fine-tuning (FT), pruning (PR), knowledge distillation (KD), adversarial training (AT), prune-then-fine-tune (PFT), and noise-injection-then-fine-tune (NFT). To ensure fair comparisons, we issue 200 queries in each experiment, i.e., 40 composite fingerprints with 5 samples each for our proposed IrisFP and 200 fingerprints for all baseline methods. In the following, we summarize the protected models, reference model sets, testing model set, and evaluation metrics; their detailed settings can be found in Sec. 12 of the Supplementary Material.

Table 1. AUCs achieved by different fingerprinting approaches across datasets and protected model architectures.

Protected Model  Method    CIFAR-10       CIFAR-100      Fashion-MNIST  MNIST          Tiny-ImageNet
ResNet-18        IPGuard   0.675±0.095    0.654±0.097    0.721±0.061    0.471±0.089    0.726±0.099
                 UAP       0.732±0.014    0.761±0.049    0.721±0.061    0.789±0.036    0.812±0.045
                 ADV-TRA   0.799±0.003    0.806±0.005    0.845±0.010    0.753±0.019    0.767±0.073
                 AKH       0.710±0.052    0.785±0.086    0.765±0.042    0.820±0.026    0.823±0.043
                 IrisFP    0.893±0.015    0.916±0.009    0.940±0.031    0.854±0.024    0.874±0.052
MobileNet-V2     IPGuard   0.821±0.047    0.823±0.021    0.607±0.035    0.634±0.010    0.692±0.019
                 UAP       0.749±0.028    0.836±0.042    0.816±0.019    0.743±0.021    0.806±0.039
                 ADV-TRA   0.824±0.051    0.795±0.011    0.782±0.028    0.720±0.023    0.877±0.045
                 AKH       0.860±0.067    0.867±0.044    0.797±0.054    0.805±0.019    0.863±0.034
                 IrisFP    0.936±0.011    0.937±0.017    0.963±0.017    0.876±0.015    0.934±0.023
ViT-B/16         IPGuard   –              –              –              –              0.778±0.029
                 UAP       –              –              –              –              0.803±0.042
                 ADV-TRA   –              –              –              –              0.832±0.019
                 AKH       –              –              –              –              0.806±0.010
                 IrisFP    –              –              –              –              0.887±0.036

Protected Model. We consider three different architectures: ResNet-18, MobileNet-V2, and ViT-B/16. For ResNet-18 and MobileNet-V2, each protected model is trained from scratch on its corresponding dataset, while for ViT-B/16, the protected model is fine-tuned from a pretrained ViT-B/16 model on Tiny-ImageNet.

Reference Model Sets. We construct two reference model sets: one for pirated models and the other for independently-trained models. Specifically, for each protected model, its corresponding pirated model set is constructed by modifying the protected model through three model modification techniques: fine-tuning (FT), knowledge distillation (KD), and adversarial training (AT). Each attack type produces three pirated variants, yielding 9 pirated variants in the pirated model set. On the other hand, our independent model set is constructed using three architectures: ResNet-18, MobileNet-V2, and DenseNet-121. The models in this set are independently trained from scratch with seeds and hyperparameters different from those used for the protected and testing models. Each architecture is instantiated with 3 different random seeds to create 3 independently-trained models, resulting in a total of 9 models in the independent model set.

Testing Model Set. We evaluate IrisFP using a testing model set comprising both pirated and independently-trained models.
The pirated models are derived by applying the six attacks (i.e., FT, PR, KD, AT, PFT, and NFT) to each protected model. For each attack type, we generate 20 variants using different random seeds and hyperparameters, resulting in 120 pirated models. The independently-trained models are built from scratch using six architectures: ResNet-18, ResNet-50, MobileNet-V2, MobileNet-V3 Large, EfficientNet-B2, and DenseNet-121. For each architecture, 20 models are trained with distinct seeds and hyperparameters, without any access to the protected models or the models in the reference model sets, yielding 120 independently-trained models.

Metrics. We use the Area Under the ROC (Receiver Operating Characteristic) Curve (AUC) as the main metric to evaluate the performance of our proposed fingerprinting method. AUC quantifies the probability that a randomly selected pirated model receives a higher matching score than a randomly selected independently-trained model. Higher AUC values reflect stronger discriminative capability. All AUC results are reported as the mean and standard deviation over five independent runs.

4.2. Main Performance

4.2.1. The Effectiveness of IrisFP

Table 1 shows the superior effectiveness of IrisFP by reporting the AUCs achieved by IrisFP and four representative baselines (IPGuard, UAP, ADV-TRA, and AKH) across five benchmark datasets: CIFAR-10, CIFAR-100, Fashion-MNIST, MNIST, and Tiny-ImageNet. To ensure a comprehensive evaluation, we train two protected models using the ResNet-18 and MobileNet-V2 architectures on each of the five datasets. In addition, we incorporate a larger and more complex architecture by training a protected model using ViT-B/16 on Tiny-ImageNet. From the table, we observe that IrisFP consistently achieves the highest AUC across all models and datasets.
For example, on CIFAR-100, IrisFP achieves an AUC of 0.916 when fingerprinting ResNet-18 and an AUC of 0.937 when fingerprinting MobileNet-V2, improvements of 13.7% and 8.1%, respectively, over the best-performing baselines. It is noteworthy that IrisFP maintains the highest AUC of 0.887 when fingerprinting ViT-B/16 trained on Tiny-ImageNet, demonstrating that IrisFP can achieve reliable ownership verification on larger and more complex models.

Figure 4. (a) The ROC curve and (b) the distribution of verification scores on the protected model with ResNet-18 architecture trained on Fashion-MNIST.

To further validate IrisFP's discriminative power, Figure 4 visualizes the ROC curves and the distribution of verification scores. These plots provide intuitive insights into how well each method distinguishes pirated models from independently-trained ones. Figure 4 (a) shows the ROC curves for all five methods, illustrating the trade-off between TPR and FPR. IrisFP clearly demonstrates the strongest discriminative capability, with its curve close to the top-left corner, outperforming all baselines. In contrast, IPGuard's and UAP's curves lie near the diagonal, indicating weak uniqueness, while ADV-TRA and AKH perform better but still fall short of IrisFP. Figure 4 (b) presents the two distributions of fingerprint verification scores achieved by each fingerprinting method on pirated and independently-trained models. A wider separation between the two distributions reflects a stronger discriminative capability. As shown in this figure, IrisFP achieves a clear separation, while the baselines exhibit greater distributional overlap, reflecting their limited ability to differentiate pirated models from independently-trained ones.

Table 2. AUCs under six model modification attacks across datasets and methods on the protected model with ResNet-18 architecture.

| Dataset | Method | FT | PR | KD | AT | PFT | NFT |
|---|---|---|---|---|---|---|---|
| CIFAR-10 | IPGuard | 0.656 ± 0.101 | 0.997 ± 0.005 | 0.515 ± 0.082 | 0.511 ± 0.265 | 0.687 ± 0.144 | 0.724 ± 0.142 |
| | UAP | 0.809 ± 0.011 | 0.998 ± 0.005 | 0.722 ± 0.013 | 0.346 ± 0.017 | 0.856 ± 0.016 | 0.868 ± 0.018 |
| | ADV-TRA | 1.000 ± 0.000 | 1.000 ± 0.000 | 0.805 ± 0.009 | 0.025 ± 0.006 | 0.959 ± 0.012 | 0.962 ± 0.009 |
| | AKH | 0.921 ± 0.005 | 0.876 ± 0.020 | 0.621 ± 0.042 | 0.531 ± 0.108 | 0.701 ± 0.029 | 0.733 ± 0.066 |
| | IrisFP | 0.954 ± 0.046 | 1.000 ± 0.000 | 0.616 ± 0.068 | 0.929 ± 0.042 | 0.965 ± 0.009 | 0.968 ± 0.014 |
| CIFAR-100 | IPGuard | 0.617 ± 0.137 | 0.998 ± 0.003 | 0.593 ± 0.141 | 0.621 ± 0.106 | 0.576 ± 0.104 | 0.584 ± 0.129 |
| | UAP | 0.764 ± 0.069 | 0.900 ± 0.002 | 0.709 ± 0.071 | 0.677 ± 0.053 | 0.752 ± 0.052 | 0.770 ± 0.065 |
| | ADV-TRA | 0.863 ± 0.010 | 1.000 ± 0.000 | 0.899 ± 0.007 | 0.338 ± 0.008 | 0.864 ± 0.014 | 0.863 ± 0.012 |
| | AKH | 0.813 ± 0.076 | 0.670 ± 0.185 | 0.858 ± 0.061 | 0.221 ± 0.076 | 0.671 ± 0.050 | 0.775 ± 0.090 |
| | IrisFP | 0.953 ± 0.011 | 1.000 ± 0.000 | 0.927 ± 0.006 | 0.758 ± 0.031 | 0.933 ± 0.009 | 0.928 ± 0.019 |
| Fashion-MNIST | IPGuard | 0.688 ± 0.139 | 0.990 ± 0.001 | 0.592 ± 0.096 | 0.612 ± 0.115 | 0.632 ± 0.187 | 0.636 ± 0.216 |
| | UAP | 0.809 ± 0.055 | 0.869 ± 0.045 | 0.749 ± 0.012 | 0.716 ± 0.018 | 0.797 ± 0.061 | 0.781 ± 0.045 |
| | ADV-TRA | 0.980 ± 0.000 | 0.990 ± 0.000 | 0.830 ± 0.004 | 0.158 ± 0.008 | 0.981 ± 0.001 | 0.980 ± 0.000 |
| | AKH | 0.855 ± 0.118 | 0.990 ± 0.000 | 0.844 ± 0.006 | 0.203 ± 0.038 | 0.759 ± 0.078 | 0.737 ± 0.091 |
| | IrisFP | 0.982 ± 0.002 | 0.992 ± 0.000 | 0.853 ± 0.037 | 0.816 ± 0.190 | 0.982 ± 0.003 | 0.983 ± 0.002 |
| MNIST | IPGuard | 0.378 ± 0.022 | 0.733 ± 0.030 | 0.624 ± 0.108 | 0.067 ± 0.065 | 0.432 ± 0.045 | 0.456 ± 0.122 |
| | UAP | 0.966 ± 0.009 | 0.630 ± 0.011 | 0.455 ± 0.054 | 0.946 ± 0.088 | 0.892 ± 0.007 | 0.959 ± 0.005 |
| | ADV-TRA | 0.965 ± 0.002 | 0.753 ± 0.028 | 0.485 ± 0.023 | 0.475 ± 0.019 | 0.975 ± 0.003 | 0.976 ± 0.001 |
| | AKH | 0.980 ± 0.000 | 0.797 ± 0.013 | 0.745 ± 0.011 | 0.837 ± 0.069 | 0.915 ± 0.010 | 0.920 ± 0.030 |
| | IrisFP | 0.985 ± 0.001 | 0.966 ± 0.007 | 0.467 ± 0.091 | 0.704 ± 0.025 | 0.983 ± 0.002 | 0.985 ± 0.001 |
| Tiny-ImageNet | IPGuard | 0.958 ± 0.044 | 0.754 ± 0.129 | 0.525 ± 0.136 | 0.316 ± 0.118 | 0.867 ± 0.133 | 0.954 ± 0.034 |
| | UAP | 0.953 ± 0.038 | 0.981 ± 0.001 | 0.455 ± 0.089 | 0.411 ± 0.114 | 0.967 ± 0.012 | 0.952 ± 0.016 |
| | ADV-TRA | 0.970 ± 0.040 | 0.917 ± 0.056 | 0.466 ± 0.069 | 0.173 ± 0.143 | 0.980 ± 0.064 | 0.948 ± 0.036 |
| | AKH | 0.937 ± 0.040 | 0.875 ± 0.019 | 0.326 ± 0.055 | 0.488 ± 0.099 | 0.988 ± 0.024 | 0.944 ± 0.051 |
| | IrisFP | 0.986 ± 0.032 | 1.000 ± 0.000 | 0.661 ± 0.108 | 0.545 ± 0.106 | 0.992 ± 0.015 | 0.977 ± 0.027 |

Figure 5. TNR-FNR and TPR-FPR curves on the protected models with ResNet-18 architecture trained on CIFAR-10 ((a) and (f)), CIFAR-100 ((b) and (g)), Fashion-MNIST ((c) and (h)), MNIST ((d) and (i)), and Tiny-ImageNet ((e) and (j)).

4.2.2. Evaluation of IrisFP's Uniqueness and Robustness

Figure 5 demonstrates the uniqueness and robustness achieved by IrisFP. The evaluation is performed under varying FNR and FPR levels, specifically 1%, 2%, 5%, and 10%. We can observe from the figure that IrisFP consistently outperforms all baselines, achieving the highest TNR (as shown by (a)-(e)) and the highest TPR (as shown by (f)-(j)), thereby demonstrating superior uniqueness and robustness.
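The TNR-at-fixed-FNR operating points reported in Figure 5 can be computed from matching scores as in the following sketch; all scores here are hypothetical placeholders:

```python
def tnr_at_fnr(pirated_scores, independent_scores, fnr):
    """Pick the lowest threshold that misses at most a fraction `fnr` of
    the pirated models, then report the fraction of independently-trained
    models correctly rejected (TNR) at that threshold."""
    s = sorted(pirated_scores)          # ascending matching scores
    k = int(len(s) * fnr)               # number of pirated models allowed to be missed
    threshold = s[k]                    # models scoring >= threshold are flagged as pirated
    return sum(q < threshold for q in independent_scores) / len(independent_scores)

# Hypothetical scores for illustration only.
pirated = [0.95, 0.90, 0.85, 0.80, 0.40]
independent = [0.10, 0.20, 0.30, 0.70, 0.85]
print(tnr_at_fnr(pirated, independent, fnr=0.2))  # 0.8
```

The symmetric quantity TPR-at-fixed-FPR is computed the same way with the roles of the two score sets swapped.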
Among the baselines, ADV-TRA exhibits strong robustness, achieving high TPR by effectively identifying pirated models, but it suffers from low TNR, indicating poor uniqueness due to frequent misidentification of independently-trained models. In contrast, UAP maintains relatively stable TNR, indicating better uniqueness, but struggles with lower TPR, especially on Fashion-MNIST, revealing limited robustness. Moreover, AKH shows inconsistent behavior, performing well in either robustness or uniqueness but not both. Additionally, IPGuard performs the worst, failing to reliably distinguish pirated models from independently-trained ones under any setting. These results demonstrate that IrisFP is the only method achieving high robustness and uniqueness simultaneously, making it a reliable model fingerprinting solution.

4.2.3. Robustness to Model Modification Attacks

Table 2 shows the effectiveness of IrisFP against model modifications that aim to remove or distort the original fingerprints. Specifically, we compare IrisFP with four representative baselines using the protected model with ResNet-18 architecture trained on five datasets (CIFAR-10, CIFAR-100, Fashion-MNIST, MNIST, and Tiny-ImageNet) under six representative attacks: FT, PR, KD, AT, NFT, and PFT. The comparison is based on the AUCs achieved under each setting. We can observe that IrisFP consistently achieves the best performance on CIFAR-100, Fashion-MNIST, and Tiny-ImageNet across all six attacks. For instance, IrisFP attains AUCs of 0.953 (FT), 0.927 (KD), and 0.758 (AT) for CIFAR-100, and 0.982 (FT), 0.853 (KD), and 0.816 (AT) for Fashion-MNIST, substantially outperforming the baselines. For CIFAR-10, IrisFP achieves the highest AUCs under four attacks (PR, AT, PFT, NFT) and slightly lower AUCs under FT and KD. These observations demonstrate IrisFP's strong robustness to a wide range of model modification attacks.
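The two-step verification procedure behind these results (per-fingerprint ownership matching followed by decision aggregation) can be sketched as below. The specific matching rule, aggregation rule, and all values are our illustrative assumptions, not the paper's exact algorithm:

```python
def verify_ownership(target_outputs, fingerprints, final_threshold=0.5):
    """Hypothetical two-step verification. Step 1 (matching): a composite
    fingerprint matches when the fraction of its samples whose target-model
    outputs equal the expected labels reaches its own fingerprint-specific
    threshold tau. Step 2 (aggregation): the target model is flagged as
    pirated when enough fingerprints match."""
    matched = 0
    for fp in fingerprints:
        hits = sum(target_outputs[s] == y for s, y in fp["expected"].items())
        if hits / len(fp["expected"]) >= fp["tau"]:
            matched += 1
    return matched / len(fingerprints) >= final_threshold

# Toy example: two composite fingerprints of three samples each (all hypothetical).
fps = [
    {"expected": {"s1": 3, "s2": 7, "s3": 1}, "tau": 0.6},
    {"expected": {"s4": 0, "s5": 2, "s6": 9}, "tau": 0.6},
]
outputs = {"s1": 3, "s2": 7, "s3": 5, "s4": 0, "s5": 2, "s6": 9}
print(verify_ownership(outputs, fps))  # True
```

A pirated model that survives fine-tuning or distillation may flip a few individual sample predictions, which is why matching is done per composite fingerprint against its own threshold rather than per sample.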
Figure 6. AUCs under different fingerprint configurations on the protected model with ResNet-18 architecture.

4.3. Ablation Studies

We conduct ablation studies to evaluate the effectiveness of each component in IrisFP, including fingerprint seed initialization, composite-sample fingerprint generation, fingerprint selection, and fingerprint-specific thresholds:
• Seed: directly using initial fingerprint seeds without further processing.
• Seed_s: applying fingerprint selection to fingerprint seeds.
• Com_ft: extending fingerprint seeds into composite-sample fingerprints and using a fixed global threshold.
• Com_s_ft: extending fingerprint seeds into composite-sample fingerprints, using a fixed global threshold, and applying fingerprint selection.
• Com_t: extending fingerprint seeds into composite-sample fingerprints and applying fingerprint-specific thresholds, without fingerprint selection.

Figure 6 illustrates the AUCs across five datasets under these configurations. Specifically, Seed alone yields the worst verification performance across all datasets, as it only reflects the initial fingerprinting step without the other components. Seed_s leads to a clear improvement (e.g., from 0.691 to 0.748 on CIFAR-10), demonstrating the necessity of retaining only the most discriminative fingerprints. Furthermore, extending fingerprint seeds into composite-sample fingerprints (Com_ft) enhances uniqueness by capturing richer decision-region behavior through class-diverse outputs. Building upon this, applying fingerprint selection (Com_s_ft) further strengthens discriminative capability by filtering out less effective fingerprints.
More importantly, using fingerprint-specific thresholds also results in performance improvement, for example, raising the AUC from 0.812 (Com_s_ft) to 0.893 (IrisFP) on CIFAR-10, demonstrating the advantage of adapting a specific threshold to each composite-sample fingerprint. In addition, IrisFP consistently outperforms all ablated variants across datasets. These results highlight the complementary strengths of each component and the necessity of integrating all of them in our design for reliable ownership verification.

Further hyperparameter analysis and additional experimental results are presented in the supplementary material.

5. Conclusions

This paper studies the problem of enhancing both uniqueness and robustness for adversarial-example-based model fingerprinting. To address this issue, we propose IrisFP, a new fingerprinting method that leverages multi-boundary characteristics, multi-sample behavior, and fingerprint quality assessment. Specifically, IrisFP crafts fingerprint seeds positioned near the intersection of multiple decision boundaries to increase prediction margins. It then applies subtle and diverse perturbations to each seed to generate multiple variants, forming a composite-sample fingerprint. The resulting fingerprints are refined using a discriminative selection strategy to retain only those with the strongest separability. For ownership verification, IrisFP employs a two-step process comprising ownership matching and decision aggregation, tailored to the structure of composite-sample fingerprints. Extensive experiments validate the effectiveness of IrisFP, with results consistently showing superior performance over state-of-the-art methods.

References

[1] Xiaofan Bai, Chaoxiang He, Xiaojing Ma, Bin Benjamin Zhu, and Hai Jin. Intersecting-boundary-sensitive fingerprinting for tampering detection of DNN models.
In Proceedings of the Forty-first International Conference on Machine Learning (ICML'24), 2024.
[2] Xiaofan Bai, Shixin Li, Xiaojing Ma, Bin Benjamin Zhu, Dongmei Zhang, and Linchen Yu. SDBF: Steep-decision-boundary fingerprinting for hard-label tampering detection of DNN models. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR'25), pages 29278–29287, 2025.
[3] Xiaoyu Cao, Jinyuan Jia, and Neil Zhenqiang Gong. IPGuard: Protecting intellectual property of deep neural networks via fingerprinting the classification boundary. In Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security (AsiaCCS'21), 2021.
[4] Jialuo Chen, Jingyi Wang, Tinglan Peng, Youcheng Sun, Peng Cheng, Shouling Ji, Xingjun Ma, Bo Li, and Dawn Xiaodong Song. Copy, Right? A testing framework for copyright protection of deep learning models. In Proceedings of the IEEE Symposium on Security and Privacy (S&P'22), pages 824–841, 2022.
[5] Yufei Chen, Chao Shen, Cong Wang, and Yang Zhang. Teacher model fingerprinting attacks against transfer learning. In Proceedings of the 31st USENIX Security Symposium (USESec'22), pages 3593–3610, Boston, MA, 2022.
[6] Augustin Godinot, Erwan Le Merrer, Camilla Penzo, François Taïani, and Gilles Trédan. Queries, representation & detection: The next 100 model fingerprinting schemes. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'25), pages 16817–16825, 2025.
[7] Jiyang Guan, Jian Liang, and Ran He. Are you stealing my model? Sample correlation for fingerprinting deep neural networks. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS'22), 2022.
[8] Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In ICLR'16, 2016.
[9] Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, and Nicolas Papernot. High accuracy and high fidelity extraction of neural networks. In 29th USENIX Security Symposium (USESec'20), pages 1345–1362, 2020.
[10] Hengrui Jia, Hongyu Chen, Jonas Guan, Ali Shahin Shamsabadi, and Nicolas Papernot. A zest of lime: Towards architecture-independent model distances. In International Conference on Learning Representations (ICLR'21), 2021.
[11] Alex Krizhevsky, Vinod Nair, Geoffrey Hinton, et al. The CIFAR-10 dataset. Online: http://www.cs.toronto.edu/kriz/cifar.html, 55(5):2, 2014.
[12] Alex Krizhevsky, Vinod Nair, Geoffrey Hinton, et al. The CIFAR-100 dataset. Online: http://www.cs.toronto.edu/kriz/cifar.html, 55(5):2, 2014.
[13] Ya Le and Xuan S. Yang. Tiny ImageNet visual recognition challenge. 2015.
[14] Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010.
[15] Fangqi Li, Shilin Wang, and Lei Yang. Rethinking the fragility and robustness of fingerprints of deep neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'25), pages 1–5. IEEE, 2025.
[16] Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. In ICLR'17, 2017.
[17] Honglin Li, Chenglu Zhu, Yunlong Zhang, Yuxuan Sun, Zhongyi Shui, Wenwei Kuang, Sunyi Zheng, and Lin Yang. Task-specific fine-tuning via variational information bottleneck for weakly-supervised pathology whole slide image classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'23), pages 7454–7463, 2023.
[18] Jiacheng Liang, Ren Pang, Changjiang Li, and Ting Wang. Model extraction attacks revisited.
In Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (AsiaCCS'24), pages 1231–1245, 2024.
[19] Zhangting Lin, Mingfu Xue, Kewei Chen, Wenmao Liu, Xiang Gao, Leo Yu Zhang, Jian Wang, and Yushu Zhang. Adversarial example based fingerprinting for robust copyright protection in split learning. arXiv preprint arXiv:2503.04825, 2025.
[20] Weixing Liu and Shenghua Zhong. MarginFinger: Controlling generated fingerprint distance to classification boundary using conditional GANs. In Proceedings of the 2024 International Conference on Multimedia Retrieval (ICMR'24), 2024.
[21] Nils Lukas, Yuxuan Zhang, and Florian Kerschbaum. Deep neural network fingerprinting by conferrable adversarial examples. In International Conference on Learning Representations (ICLR'21), 2021.
[22] Thibault Maho, Teddy Furon, and Erwan Le Merrer. Fingerprinting classifiers with benign inputs. IEEE Transactions on Information Forensics and Security (TIFS), 18:5459–5472, 2023.
[23] Zirui Peng, Shaofeng Li, Guoxing Chen, Cheng Zhang, Haojin Zhu, and Minhui Xue. Fingerprinting deep neural networks globally via universal adversarial perturbations. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'22), pages 13420–13429, 2022.
[24] Ali Shafahi, Mahyar Najibi, Mohammad Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S Davis, Gavin Taylor, and Tom Goldstein. Adversarial training for free! Advances in Neural Information Processing Systems (NeurIPS'19), 32, 2019.
[25] Shuo Shao, Haozhe Zhu, Yiming Li, Hongwei Yao, Tianwei Zhang, and Zhan Qin. Fit-print: Towards false-claim-resistant model ownership verification via targeted fingerprint. arXiv preprint, 2025.
[26] Gowthami Somepalli, Liam Fowl, Arpit Bansal, Ping Yeh-Chiang, Yehuda Dar, Richard Baraniuk, Micah Goldblum, and Tom Goldstein.
Can neural nets learn the same model twice? Investigating reproducibility and double descent from the decision boundary perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'22), pages 13699–13708, 2022.
[27] Yuzheng Wang, Zhaoyu Chen, Dingkang Yang, Pinxue Guo, Kaixun Jiang, Wenqiang Zhang, and Lizhe Qi. Out of thin air: Exploring data-free adversarial robustness distillation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'24), pages 5776–5784, 2024.
[28] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. ArXiv, abs/1708.07747, 2017.
[29] Tianlong Xu, Chen Wang, Gaoyang Liu, Yang Yang, Kai Peng, and Wei Liu. United we stand, divided we fall: Fingerprinting deep neural networks via adversarial trajectories. In Proceedings of Neural Information Processing Systems (NeurIPS'24), 2024.
[30] Tianlong Xu, Chen Wang, Gaoyang Liu, Yang Yang, Kai Peng, and Wei Liu. United we stand, divided we fall: Fingerprinting deep neural networks via adversarial trajectories. In The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS'24), 2024.
[31] Anli Yan, Huali Ren, Kanghua Mo, Zhenxin Zhang, Shaowei Wang, and Jin Li. Enhancing model intellectual property protection with robustness fingerprint technology. IEEE Transactions on Information Forensics and Security (TIFS), 2025.
[32] Guang Yang, Ziye Geng, Yihang Chen, and Changqing Luo. Fingerprinting deep neural networks for ownership protection: An analytical approach. In The Fourteenth International Conference on Learning Representations (ICLR'26), 2026.
[33] Guang Yang, Ziye Geng, Yihang Chen, and Changqing Luo. LiteGuard: Efficient task-agnostic model fingerprinting with enhanced generalization.
In The Fourteenth International Conference on Learning Representations (ICLR'26), 2026.
[34] Kan Yang and Kunhao Lai. NaturalFinger: Generating natural fingerprint with generative adversarial networks. arXiv preprint arXiv:2305.17868, 2023.
[35] Xiaoyu You, Youhe Jiang, Jianwei Xu, Mi Zhang, and Min Yang. GNNFingers: A fingerprinting framework for verifying ownerships of graph neural networks. In Proceedings of the ACM on Web Conference (WWW'24), 2024.
[36] Zhuomeng Zhang, Fangqi Li, Hanyi Wang, and Shi-Lin Wang. Boosting the uniqueness of neural networks fingerprints with informative triggers. In Proceedings of The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.
[37] Boyao Zhao, Haozhe Chen, Jie Zhang, Weiming Zhang, and Nenghai Yu. Dual-verification-based model fingerprints against ambiguity attacks. Cybersecurity, 7(1):78, 2024.
[38] Jingjing Zhao, Qingyue Hu, Gaoyang Liu, Xiaoqiang Ma, Fei Chen, and Mohammad Mehedi Hassan. AFA: Adversarial fingerprinting authentication for deep neural networks. Computer Communications, 150:488–497, 2020.
[39] Yue Zheng, Si Wang, and Chip-Hong Chang. A DNN fingerprint for non-repudiable model ownership identification and piracy detection. IEEE Transactions on Information Forensics and Security (TIFS), 17:2977–2989, 2022.
[40] Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, and Shikun Zhang. Boosting model resilience via implicit adversarial data augmentation. arXiv preprint, 2024.
[41] Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, and Qing He. A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1):43–76, 2020.
[42] Wei Zong, Yang-Wai Chow, Willy Susilo, Joonsang Baek, Jongkil Kim, and Seyit Camtepe. IPRemover: A generative model inversion attack against deep neural network fingerprinting and watermarking.
In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'24), pages 7837–7845, 2024.

IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness (Supplementary Material)

6. Threat Model

We consider a typical model fingerprinting scenario involving two entities: a model owner and an adversary. Specifically, the model owner first produces a protected model f and then exploits adversarial example techniques to generate a set of fingerprints, which are kept confidential, to protect the IP of f. On the other hand, the adversary first obtains an unauthorized copy of f, for example, via white-box access or black-box extraction, and subsequently modifies it using model modification techniques, such as fine-tuning (FT), pruning (PR), adversarial training (AT), knowledge distillation (KD), etc., to produce a variant f_p. The adversary then deploys f_p in a black-box setting, making it accessible to the public through APIs. Suppose that the model owner identifies a black-box model f_t as a suspicious target. The owner's goal is to determine whether f_t is derived from the protected model f. To perform ownership verification, the owner queries f_t using the generated fingerprints and compares the returned outputs with the expected fingerprint outputs. If the number of matches exceeds a predefined threshold, the target model is deemed to be an infringing model.

7. Additional Results

Tables 3 and 4 present additional results on the robustness of the protected models using the MobileNet-V2 and ViT-B/16 architectures, respectively. For MobileNet-V2, we evaluate IrisFP against four representative baselines across five datasets (CIFAR-10, CIFAR-100, Fashion-MNIST, MNIST, and Tiny-ImageNet) under six model modification attacks: FT, PR, KD, AT, NFT, and PFT.
IrisFP remains consistently robust across datasets, achieving the highest AUCs on CIFAR-100, Fashion-MNIST, and MNIST under all attacks, and leading on Tiny-ImageNet for five out of six attacks (FT, KD, AT, PFT, NFT), with only a slight drop on PR. On CIFAR-10, IrisFP attains the highest AUCs under PR, AT, and PFT, while remaining competitive on the remaining attacks. For the protected model with ViT-B/16 architecture trained on Tiny-ImageNet, IrisFP achieves the highest AUCs under PR, KD, AT, and NFT and remains competitive under FT and PFT, demonstrating strong robustness on transformer-based architectures as well. Together with the results on ResNet-18 in the main text (i.e., Table 2), these findings confirm that IrisFP maintains high robustness against model modification attacks across diverse model architectures: lightweight and high-capacity convolutional models as well as modern vision transformers.

8. Impact of Hyperparameters

To ensure a comprehensive evaluation, we consider three protected models in the following experiments: a ResNet-18 trained on CIFAR-100, a MobileNet-V2 trained on Fashion-MNIST, and a ViT-B/16 trained on Tiny-ImageNet.

8.1. Impact of the combination of K and T

We examine how the allocation between the number of fingerprints K and the number of elements T per fingerprint affects the verification performance when the total number of queries K × T is fixed at 200. As shown in Figure 7, across the three protected models (ResNet-18 on CIFAR-100, MobileNet-V2 on Fashion-MNIST, and ViT-B/16 on Tiny-ImageNet), the AUC varies notably across different (K, T) combinations even though the total query budget remains constant. Specifically, the performance improves steadily as K increases from 10 to 40 while T decreases from 20 to 5, and then drops when K further increases to 100 with T = 2.
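The candidate splits of a fixed query budget into (K, T) pairs, including the four configurations evaluated here, are simply the divisor pairs of the budget. A trivial sketch:

```python
def budget_allocations(budget):
    """Enumerate every (K, T) split of a fixed query budget with K * T == budget."""
    return [(k, budget // k) for k in range(1, budget + 1) if budget % k == 0]

# The four configurations evaluated in this section all appear among the
# divisor pairs of a 200-query budget.
allocs = budget_allocations(200)
print([(k, t) for k, t in allocs
       if (k, t) in {(10, 20), (20, 10), (40, 5), (100, 2)}])
# [(10, 20), (20, 10), (40, 5), (100, 2)]
```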
For example, for ResNet-18 on CIFAR-100, the AUC rises from 0.842 at (10, 20) to 0.905 at (20, 10) and peaks at 0.916 for (40, 5), before declining to 0.872 at (100, 2). Similar trends are observed for MobileNet-V2 on Fashion-MNIST and ViT-B/16 on Tiny-ImageNet. Overall, configurations with larger K and smaller T consistently outperform those with smaller K and larger T under the same total query budget. The configuration (K = 40, T = 5) achieves the highest AUC across all settings under the fixed total query budget. This is because a larger K allows the verification process to rely on a broader set of composite-sample fingerprints, which enhances the overall statistical reliability of the verification results.

Figure 7. AUCs under different fingerprint numbers (K) and sample counts (T).

Table 3. AUCs under six attacks across datasets and methods on the protected model with MobileNet-V2 architecture.

| Dataset | Method | FT | PR | KD | AT | PFT | NFT |
|---|---|---|---|---|---|---|---|
| CIFAR-10 | IPGuard | 0.951 ± 0.022 | 0.994 ± 0.000 | 0.622 ± 0.048 | 0.481 ± 0.127 | 0.942 ± 0.029 | 0.934 ± 0.061 |
| | UAP | 0.891 ± 0.011 | 0.915 ± 0.003 | 0.526 ± 0.052 | 0.412 ± 0.098 | 0.892 ± 0.031 | 0.880 ± 0.009 |
| | ADV-TRA | 0.992 ± 0.025 | 0.995 ± 0.001 | 0.780 ± 0.068 | 0.214 ± 0.045 | 0.992 ± 0.040 | 0.992 ± 0.063 |
| | AKH | 0.896 ± 0.034 | 0.904 ± 0.010 | 0.885 ± 0.071 | 0.727 ± 0.052 | 0.935 ± 0.073 | 0.921 ± 0.084 |
| | IrisFP | 0.981 ± 0.006 | 0.997 ± 0.000 | 0.712 ± 0.055 | 0.978 ± 0.015 | 0.995 ± 0.003 | 0.982 ± 0.005 |
| CIFAR-100 | IPGuard | 0.981 ± 0.002 | 0.983 ± 0.005 | 0.741 ± 0.036 | 0.208 ± 0.087 | 0.982 ± 0.000 | 0.982 ± 0.001 |
| | UAP | 0.979 ± 0.010 | 0.987 ± 0.003 | 0.774 ± 0.079 | 0.723 ± 0.036 | 0.904 ± 0.056 | 0.929 ± 0.014 |
| | ADV-TRA | 0.963 ± 0.012 | 0.981 ± 0.001 | 0.701 ± 0.027 | 0.632 ± 0.018 | 0.885 ± 0.003 | 0.932 ± 0.009 |
| | AKH | 0.958 ± 0.007 | 0.979 ± 0.028 | 0.783 ± 0.068 | 0.656 ± 0.075 | 0.951 ± 0.057 | 0.923 ± 0.005 |
| | IrisFP | 0.983 ± 0.002 | 0.991 ± 0.001 | 0.811 ± 0.051 | 0.886 ± 0.055 | 0.983 ± 0.000 | 0.983 ± 0.000 |
| Fashion-MNIST | IPGuard | 0.587 ± 0.027 | 0.945 ± 0.046 | 0.518 ± 0.118 | 0.550 ± 0.090 | 0.555 ± 0.016 | 0.543 ± 0.103 |
| | UAP | 0.897 ± 0.008 | 0.977 ± 0.007 | 0.658 ± 0.076 | 0.735 ± 0.031 | 0.971 ± 0.005 | 0.886 ± 0.013 |
| | ADV-TRA | 0.963 ± 0.032 | 0.964 ± 0.012 | 0.721 ± 0.058 | 0.871 ± 0.036 | 0.898 ± 0.007 | 0.723 ± 0.037 |
| | AKH | 0.879 ± 0.051 | 0.884 ± 0.038 | 0.821 ± 0.079 | 0.832 ± 0.040 | 0.855 ± 0.078 | 0.796 ± 0.016 |
| | IrisFP | 0.975 ± 0.012 | 0.991 ± 0.000 | 0.860 ± 0.109 | 0.960 ± 0.017 | 0.979 ± 0.006 | 0.978 ± 0.010 |
| MNIST | IPGuard | 0.690 ± 0.007 | 0.710 ± 0.030 | 0.656 ± 0.020 | 0.560 ± 0.021 | 0.535 ± 0.003 | 0.757 ± 0.018 |
| | UAP | 0.767 ± 0.058 | 0.910 ± 0.000 | 0.668 ± 0.057 | 0.544 ± 0.011 | 0.672 ± 0.060 | 0.693 ± 0.022 |
| | ADV-TRA | 0.821 ± 0.039 | 0.896 ± 0.013 | 0.642 ± 0.033 | 0.557 ± 0.032 | 0.634 ± 0.014 | 0.703 ± 0.021 |
| | AKH | 0.871 ± 0.011 | 0.921 ± 0.015 | 0.569 ± 0.023 | 0.540 ± 0.052 | 0.885 ± 0.021 | 0.836 ± 0.009 |
| | IrisFP | 0.983 ± 0.000 | 0.960 ± 0.014 | 0.683 ± 0.041 | 0.577 ± 0.107 | 0.981 ± 0.001 | 0.981 ± 0.003 |
| Tiny-ImageNet | IPGuard | 0.466 ± 0.086 | 0.996 ± 0.005 | 0.786 ± 0.006 | 0.538 ± 0.237 | 0.478 ± 0.074 | 0.475 ± 0.056 |
| | UAP | 0.896 ± 0.014 | 0.985 ± 0.003 | 0.799 ± 0.035 | 0.633 ± 0.097 | 0.872 ± 0.036 | 0.843 ± 0.038 |
| | ADV-TRA | 0.923 ± 0.013 | 0.932 ± 0.011 | 0.805 ± 0.077 | 0.836 ± 0.040 | 0.840 ± 0.087 | 0.848 ± 0.029 |
| | AKH | 0.850 ± 0.037 | 0.901 ± 0.023 | 0.820 ± 0.052 | 0.809 ± 0.014 | 0.843 ± 0.038 | 0.859 ± 0.024 |
| | IrisFP | 0.979 ± 0.011 | 0.956 ± 0.043 | 0.883 ± 0.096 | 0.863 ± 0.160 | 0.995 ± 0.001 | 0.963 ± 0.009 |

Table 4. AUCs under six model modification attacks on the protected model with ViT-B/16 architecture.

| Dataset | Method | FT | PR | KD | AT | PFT | NFT |
|---|---|---|---|---|---|---|---|
| Tiny-ImageNet | IPGuard | 0.923 ± 0.006 | 0.692 ± 0.041 | 0.585 ± 0.153 | 0.606 ± 0.239 | 0.864 ± 0.036 | 0.901 ± 0.012 |
| | UAP | 0.979 ± 0.011 | 0.705 ± 0.030 | 0.593 ± 0.086 | 0.762 ± 0.071 | 0.731 ± 0.045 | 0.852 ± 0.022 |
| | ADV-TRA | 0.973 ± 0.022 | 0.765 ± 0.010 | 0.409 ± 0.023 | 0.901 ± 0.038 | 0.801 ± 0.008 | 0.915 ± 0.036 |
| | AKH | 0.938 ± 0.006 | 0.679 ± 0.007 | 0.539 ± 0.017 | 0.853 ± 0.014 | 0.738 ± 0.012 | 0.878 ± 0.010 |
| | IrisFP | 0.960 ± 0.007 | 0.829 ± 0.024 | 0.656 ± 0.045 | 0.939 ± 0.060 | 0.766 ± 0.114 | 0.948 ± 0.003 |

8.2. Impact of the number of queries

Figure 8. AUCs for different numbers of queries.

We evaluate how the total number of queries used for verification affects the overall performance. As shown in Figure 8, the AUC increases steadily as the number of queries grows from 20 to 800 across all settings, demonstrating that using more queries improves verification reliability. However, once the number of queries reaches around 200, the AUC values become largely stable, and further increasing the query count yields only marginal gains. For instance, for MobileNet-V2 on Fashion-MNIST, the AUC rises sharply from 0.838 at 20 queries to 0.963 at 200, while the improvement beyond 200 (up to 800 queries) is less than 0.8%. This result suggests that a sufficient number of queries is essential for stable verification, but excessive queries provide diminishing returns.
In practice, around 200 queries are suf ficient to achie ve near-optimal verification performance while maintaining computational efficienc y . 8.3. Impact of τ W e assess the impact of τ on verification performance. As shown in Figure 9 , all three settings exhibit a similar trend: both small and large values of τ degrade A UC slightly , while a moderate one yields the best performance. This is because a small τ leads to fingerprints that are ov erly sensitiv e to decision boundary shifts, whereas a large one reduces the fingerprint’ s uniqueness. The optimal τ val ue depends on the total number of classes of each task: the larger the total number of classes, the smaller the value of τ should relativ ely be, and vice v ersa. The purpose of this scaling is to ensure that the fingerprints maintain a certain distance from multiple decision boundaries. T o impro ve the discriminativ e po wer , we need to optimize τ to find an in- termediate value that achiev es a balance between robustness and uniqueness. 0.01 0.05 0.1 0.2 0.5 0.8 0.9 1.0 0.01 0.05 0.1 0.2 0.5 0.8 0.9 1.0 0.01 0.05 0.1 0.2 0.5 0.8 0.9 1.0 AUC (a) (b) (c) AUC AUC τ τ τ Figure 9. A UCs under dif ferent τ values for ResNet-18 on CIF AR- 100 (a), MobileNet-V2 on Fashion-MNIST (b), and V iT -B/16 on T iny-ImageNet (c). 8.4. Impact of α W e study the impact of the verification threshold α on fingerprint verification performance across three differ - ent settings—ResNet-18 on CIF AR-100, MobileNet-V2 on Fashion-MNIST , and V iT -B/16 on Tin y-ImageNet. The ev aluation metric is the overall accurac y defined by Accuracy = T P + T N T P + F P + T N + F N , (5) where TP , FP , TN, and FN denote true positi ve rate, false positiv e rate, true ne gati ve rate, and false negativ e rate, re- spectiv ely . As shown in Figure 10 , the verification accu- racy varies as α changes. 
For MobileNet-V2 on Fashion-MNIST, the accuracy increases steadily as the threshold moves from α = 0 toward the mid-range, reaches a peak of approximately 0.937 when α is in the range 0.45–0.50, and then gradually declines as α becomes too large. Likewise, ResNet-18 and ViT-B/16 follow a similar trend: accuracy improves rapidly when α < 0.20, remains near its optimum for α between 0.45 and 0.50, and deteriorates once α exceeds this range. Overall, these results demonstrate that moderate threshold values (around 0.45–0.50) provide the best verification performance. Extremely low or extremely high thresholds lead to suboptimal accuracy, highlighting the importance of selecting an appropriate decision threshold for stable and reliable fingerprint verification.

Figure 10. Accuracy for different α.

8.5. Impact of reference model set size

We study the impact of the reference model set size, i.e., the number of models in each of the reference model sets V_f and I_f, on the verification performance. As shown in Figure 11, enlarging these sets consistently improves the AUC across all settings: ResNet-18 on CIFAR-100, MobileNet-V2 on Fashion-MNIST, and ViT-B/16 on Tiny-ImageNet. For instance, for ResNet-18 on CIFAR-100, the AUC increases from 0.792 with one model per set to 0.918 with 12 models per set; for MobileNet-V2 on Fashion-MNIST, it increases from 0.830 to 0.969. Particularly, the most substantial performance gains occur when increasing the set size from one to six, beyond which the performance becomes stable. These results indicate that our method can converge and achieve stable verification performance with only moderately sized reference sets, underscoring its scalability and efficiency.
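The fingerprint-selection step that these reference sets support scores each fingerprint by how well it separates the pirated reference set V_f from the independent reference set I_f. The paper's exact separability metric is not reproduced here; the sketch below uses a standardized mean difference purely as one plausible illustration:

```python
import statistics

def separability(pirated_ref_scores, independent_ref_scores):
    """One possible separability statistic (illustrative assumption): the gap
    between the mean matching scores of a fingerprint on the two reference
    sets, scaled by the pooled standard deviation of all scores."""
    mu_p = statistics.mean(pirated_ref_scores)
    mu_i = statistics.mean(independent_ref_scores)
    pooled = statistics.pstdev(pirated_ref_scores + independent_ref_scores)
    return (mu_p - mu_i) / pooled if pooled > 0 else float("inf")

# Hypothetical per-fingerprint scores on the two reference sets: a fingerprint
# with clearly higher scores on V_f than on I_f would be retained.
good_fp = separability([0.90, 0.85, 0.95], [0.10, 0.20, 0.15])
weak_fp = separability([0.60, 0.50, 0.55], [0.45, 0.50, 0.60])
print(good_fp > weak_fp)  # True
```

Larger reference sets make these means and deviations more reliable, which is consistent with the AUC gains observed when growing each set from one model to six.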
Figure 11. AUCs for different model set sizes.

9. Time Overhead

IrisFP's time cost during the fingerprint generation process comes from two parts: (1) training the reference model sets, and (2) generating the fingerprints. We compare the time cost of different model fingerprinting methods. For each method, we generate 200 fingerprints in total. Note that IrisFP adopts 40 composite-sample fingerprints, each composed of 5 samples, ensuring a fair assessment across methods. We consider three protected models: a ResNet-18 trained on CIFAR-100, a MobileNet-V2 trained on Fashion-MNIST, and a ViT-B/16 trained on Tiny-ImageNet.

The experimental results are summarized in Table 5. As shown in the table, IrisFP requires 55 m for MobileNet-V2 on Fashion-MNIST, which is an acceptable cost among the compared methods. More importantly, even when switching to the substantially larger ViT model, the total cost only increases to 1 h 32 m, demonstrating that the computational overhead of IrisFP scales modestly with model size. This level of computational cost is practical and acceptable in real-world deployments. We omit a discussion of the time for performing ownership verification, because querying a target model is nearly real-time and methods with the same total number of fingerprints incur the same time cost.

Table 5. Time overheads of generating fingerprints across different methods.

Method        | IPGuard  | UAP      | ADV-TRA  | AKH     | IrisFP
ResNet-18     | 59 s     | 4 h 40 m | 32 m     | 11 s    | 1 h 2 m
MobileNet-V2  | 41 s     | 3 h 5 m  | 24 m     | 8 s     | 55 m
ViT           | 1 m 21 s | 7 h 30 m | 1 h 11 m | 1 m 6 s | 1 h 32 m

10. Stealthiness

Prior fingerprinting studies rarely report "stealthiness/imperceptibility," as enforcing extremely small visual perturbations often weakens the discriminative strength of the fingerprints.
Nevertheless, we add a qualitative evaluation of stealthiness in Figure 12. As shown in the figure, the generated fingerprints remain visually similar to the original inputs, with only subtle pixel-level differences.

Figure 12. Qualitative evaluation of stealthiness. Each row shows an original sample and its corresponding composite fingerprint, which consists of five samples.

11. Task Transferability

IrisFP targets the standard ownership-verification setting considered in prior work, where the target model is deployed for the same task and shares the same label set as the protected model, even after model-modification attacks. If the task changes and the label space differs, verification can still be conducted using the overlapping subset of labels, since our fingerprint set spans multiple labels. However, the verification performance is expected to degrade as the degree of label overlap decreases.

12. Experimental Setting Details

12.1. Protected Model

We consider three model architectures for the protected model: ResNet-18, MobileNet-V2, and ViT-B/16. For each of the five benchmark datasets (CIFAR-10, CIFAR-100, Fashion-MNIST, MNIST, and Tiny-ImageNet), we independently train two protected models: one using ResNet-18 and the other using MobileNet-V2. Besides, we also produce a protected model with the ViT-B/16 architecture, a larger and more complex architecture, by fine-tuning it on Tiny-ImageNet. All ResNet-18 and MobileNet-V2 models follow standard training configurations: SGD with momentum 0.9, weight decay of 5 × 10⁻⁴, an initial learning rate of 0.1, cosine-annealing scheduling, Xavier initialization, and a batch size of 128 for 600 epochs.
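The standard training configuration above maps to PyTorch roughly as follows. This is a sketch of the stated hyperparameters only; the model is a placeholder stand-in, not the actual ResNet-18 or MobileNet-V2 training script.

```python
import torch

# Placeholder model standing in for ResNet-18 / MobileNet-V2.
model = torch.nn.Linear(10, 2)
torch.nn.init.xavier_uniform_(model.weight)  # Xavier initialization

# SGD with momentum 0.9, weight decay 5e-4, initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

# Cosine-annealing schedule over the 600 training epochs
# (batch size 128 would be set on the DataLoader).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=600)
```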
For ViT-B/16, which is generally more suitable for higher-resolution data, we initialize the model from pretrained weights and fine-tune it on Tiny-ImageNet under the same settings, except that we use a smaller initial learning rate of 0.01 and train it for 50 epochs.

12.2. Reference Model Sets

Two reference model sets, the pirated model set and the independently-trained model set, serve as the foundation for experimentally assessing the quality of fingerprints during the refinement process. For each protected model, its corresponding pirated model set is constructed by modifying the protected model through three model-modification techniques, chosen from six available attack types (FT, PR, KD, AT, PFT, and NFT). These modifications invalidate the original fingerprints, allowing the pirated model to evade ownership verification. We adopt only three of them to balance computational cost and to emulate realistic conditions where reference models may not encompass all possible removal attempts. In our experiments, we select fine-tuning (FT), knowledge distillation (KD), and adversarial training (AT). Specifically:
• Fine-tuning (FT): Training the protected model for a specified number of epochs.
• Knowledge Distillation (KD): Training a student model, with either the same or a different architecture, under the supervision of the protected (teacher) model by minimizing the KL divergence between their soft outputs, with temperature set to 1.
• Adversarial Training (AT): Crafting adversarial examples via a PGD attack (using Foolbox's "LinfPGD" tool), concatenating them with clean inputs, and then training the model on the mixed dataset.
For each attack type, we produce three pirated variants instantiated with three different random seeds to ensure model diversity, leading to 9 pirated variants in the pirated model set.
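The KD objective described above (KL divergence between soft outputs, temperature 1) can be sketched numerically. This is an illustrative stdlib-only computation, not the paper's implementation; `softmax` and `kd_loss` are hypothetical helpers.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student): the distillation objective, here with T = 1.

    Zero when the student's soft outputs match the teacher's exactly,
    and positive otherwise.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
```

Minimizing this loss over a transfer set drives the student toward the teacher's decision behavior, which is what makes KD an effective fingerprint-removal attack.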
On the other hand, an easy way to construct the independent model set would be to use public models from an open-source platform, e.g., Hugging Face. In our experiments, however, each independent model is trained from scratch with seeds and hyperparameters different from those used for the protected models, following standard training configurations: SGD with momentum 0.9, weight decay of 5 × 10⁻⁴, an initial learning rate of 0.1, cosine-annealing scheduling, Xavier initialization, and a batch size of 128 for 600 epochs. Specifically, our independent models are constructed using three diverse architectures available in the torchvision library (ResNet-18, MobileNet-V2, and DenseNet-121), each instantiated with three different random seeds to create three independently-trained models, resulting in a total of 9 models in the independent model set. To guarantee independence, all ResNet-18 and MobileNet-V2 instances are initialized with seeds distinct from that of the protected model.

12.3. Testing Model Set

The testing model set consists of pirated and independently-trained models, which are independently trained and have no overlap with the models in the reference model sets. The pirated models in the testing set are constructed using six representative attacks:
• Fine-tuning (FT): Training the protected model for a specified number of epochs.
• Pruning (PR): Applying unstructured global pruning with sparsity levels ranging from 10% to 90% in increments of 10%, without retraining.
• Prune-then-tune (PFT): Applying pruning at sparsity levels of 30%, 60%, and 90%, followed by fine-tuning.
• Noise-regularized Fine-tuning (NFT): Perturbing each trainable parameter tensor with Gaussian noise scaled by the standard deviation of that tensor, i.e., param += α · std(param) · N(0, 1), followed by fine-tuning.
• Knowledge Distillation (KD): Training a student model, of the same or a different architecture, using the KL divergence between its outputs and those of the protected model (temperature = 1).
• Adversarial Training (AT): Crafting adversarial examples via a PGD attack (using Foolbox's "LinfPGD" tool), concatenating them with clean inputs, and then training the model on the mixed dataset.
For each attack type, we generate 20 variants with different random seeds, resulting in a total of 120 pirated models. The independently-trained models in the testing set are built from scratch using six architectures: ResNet-18, ResNet-50, MobileNet-V2, MobileNet-V3 Large, EfficientNet-B2, and DenseNet-121. For each architecture, 20 models are trained with distinct seeds, without any access to the protected models or the models in the reference model sets, yielding 120 independently-trained models.

12.4. Hyperparameters

We use K = 40 composite-sample fingerprints, each consisting of T = 5 samples, for all tasks. The bias parameter τ is 0.2 for CIFAR-10, Fashion-MNIST, and MNIST; 0.1 for CIFAR-100; and 0.05 for Tiny-ImageNet. We set the regularization coefficients λ₁ and λ₂ to 0.05. The training process consists of 800 iterations in Phase I and 200 iterations in Phase II.

12.5. Computing Infrastructure

All experiments were run on a Linux server equipped with an NVIDIA A100 GPU and 200 GB of system memory. GPU acceleration was provided by CUDA 12.1, and all models and training pipelines were implemented in PyTorch 2.5.1.

13. Related Work

Model fingerprinting has emerged as a promising non-intrusive technique for protecting the intellectual property (IP) of deep neural networks (DNNs), which are highly susceptible to unauthorized use [3, 7, 21, 26, 31, 35, 36, 38]. By exploiting the distinctive behavioral patterns of a protected model, fingerprinting derives verifiable ownership evidence [5, 37].
Researchers have developed various model fingerprinting methods. For example, several methods assess model similarity through internal representations or model parameters instead of adversarial queries. Maho et al. [22] propose a greedy scheme that generates fingerprints for a model by analyzing the statistical similarity of intermediate activations from benign inputs using Shannon's information theory. Guan et al. [7] identify suspicious models by selecting inputs that yield inconsistent predictions across two sets of reference models and examining their pairwise relationships. Other methods treat the model parameters themselves as intrinsic fingerprints. For instance, Jia et al. [10] train linear proxy models on a reference dataset and compare their learned weights via cosine similarity, while Zheng et al. [39] project front-layer weights into a random subspace associated with the model owner's identity to achieve non-repudiable ownership verification.

In recent years, adversarial-example-based fingerprinting has attracted great attention, and many methods have been developed for it. This type of fingerprinting leverages adversarial examples to craft input-output pairs that exhibit model-specific behavioral patterns. For instance, Cao et al. [3] reveal that a model's decision boundary reflects its unique identity and generate data points (adversarial samples) near a decision boundary to serve as distinctive fingerprints. Lukas et al. [21] optimize adversarial examples through multi-model training to maximize their transferability to pirated models. However, such transferability often leads to false positives. To address this, Peng et al. [23] employ Universal Adversarial Perturbations to modify clean inputs into fingerprint samples, and then use a learned encoder to map model logits into embedding vectors for similarity-based ownership verification.
Yang et al. [34] train a GAN to synthesize natural fingerprint samples by optimizing inputs that yield divergent predictions between pirated and independent models in decision-difference regions, which are then used as black-box queries for verification. Liu et al. [20] leverage a conditional GAN with margin loss to generate samples positioned at controlled distances from the classification boundary, enabling robust and distinctive fingerprinting without relying on surrogate models. Instead of isolated adversarial samples, Xu et al. [29] construct adversarial trajectories by iteratively perturbing clean inputs to traverse decision-boundary regions, capturing richer and more comprehensive model-specific behaviors. Godinot et al. [6] introduce a decomposition-based analysis framework that separates fingerprinting into query construction, representation, and detection, and show that one effective instantiation constructs fingerprints using adversarial examples originating from naturally misclassified samples.

Despite these advancements, a fundamental challenge remains: designing fingerprints that simultaneously achieve uniqueness (distinguishing the protected model from independently trained ones) and robustness (maintaining validity under model-modification or removal attacks). Different from conventional approaches that typically construct each fingerprint near a single decision boundary, we investigate the region near the intersection of multiple decision boundaries and propose a composite-sample fingerprinting framework that jointly enhances both uniqueness and robustness.
