A Critical Look into Threshold Homomorphic Encryption for Private Average Aggregation
Threshold Homomorphic Encryption (Threshold HE) is a good fit for implementing private federated average aggregation, a key operation in Federated Learning (FL). Despite its potential, recent studies have shown that threshold schemes available in mai…
Authors: Miguel Morona-Mínguez, Alberto Pedrouzo-Ulloa, Fern
Miguel Morona-Mínguez, Alberto Pedrouzo-Ulloa and Fernando Pérez-González
atlanTTic, Universidade de Vigo
{mmorona, apedrouzo, fperez}@gts.uvigo.es

Abstract: Threshold Homomorphic Encryption (Threshold HE) is a good fit for implementing private federated average aggregation, a key operation in Federated Learning (FL). Despite its potential, recent studies have shown that threshold schemes available in mainstream HE libraries can introduce unexpected security vulnerabilities if an adversary has access to a restricted decryption oracle. This oracle reflects the FL clients' capacity to collaboratively decrypt the aggregated result without knowing the secret key. This work surveys the use of threshold RLWE-based HE for federated average aggregation and examines the performance impact of using smudging noise with a large variance as a countermeasure. We provide a detailed comparison of threshold variants of BFV and CKKS, finding that CKKS-based aggregations perform comparably to BFV-based solutions.

Index Terms: Machine Learning, Federated Learning, Private Aggregation, Homomorphic Encryption, Multiple Keys.

I. INTRODUCTION

Federated Learning (FL) was introduced in [1], [2] as a Machine Learning (ML) setting where, under the coordination of a central server, multiple clients collaborate with the aim of solving a learning task. One distinguishing feature of FL is that the data used for training is distributed among the different clients, in such a way that datasets remain separate and stored locally on each client's premises. The learning process (see Fig.
1) is as follows: Starting from an initial common ML model $w^*$, an FL protocol typically consists of a series of rounds in which (1) the clients locally update the common model by partially training with only their own individual datasets, each obtaining a local update $w_i$, and (2) the central server, or aggregator, combines all the local updates received from the clients to obtain a new common ML model, which is done by computing an aggregation function $w^* = f_\Sigma(w_1, \ldots, w_L)$. This process can be repeated several times until a pre-established convergence criterion, which measures the performance of the trained ML model, is fulfilled.

This is the author-submitted version (preprint) of a paper published in the Proceedings of the 2nd IEEE International Conference on Federated Learning Technologies and Applications (FLTA 2024). The final version is available in IEEE Xplore: https://doi.org/10.1109/FLTA63145.2024.10840167.

Fig. 1: High-level description of the FL protocol.

Privacy issues: FL successfully avoids having all the training data gathered by the same centralized entity, thereby reinforcing data control and privacy protection by isolating the local datasets on each client's premises. Unfortunately, even though local data is not openly shared, if no further protection measures are applied, FL protocols still present significant privacy and security issues [1]. Several works have shown how the exchanged clients' local ML updates can sometimes retain information about the data used during training, thus making them susceptible to various types of inference attacks, especially when adversaries have white-box access to both the exchanged local and aggregated updates [3], [4].
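To make the aggregation step of a round concrete, the following minimal Python sketch computes an average aggregation in the clear. The function name `federated_average`, the equal weighting of all clients, and the representation of an update as a list of floats are illustrative assumptions, not details taken from the paper. The point of the sketch is that, without further protection, the aggregator handles every individual update $w_i$ in plaintext.

```python
def federated_average(updates):
    """Return the coordinate-wise mean of equally weighted local updates.

    `updates` is a list with one entry per client; each entry is a list of
    floats holding that client's local update w_i (hypothetical layout).
    """
    if not updates:
        raise ValueError("at least one local update is required")
    dim = len(updates[0])
    if any(len(w) != dim for w in updates):
        raise ValueError("all local updates must have the same length")
    num_clients = len(updates)
    # The aggregator reads each w_i directly: this is the exposure that
    # private aggregation is meant to remove.
    return [sum(w[k] for w in updates) / num_clients for k in range(dim)]


if __name__ == "__main__":
    local_updates = [[1.0, 2.0], [3.0, 4.0]]
    print(federated_average(local_updates))
```

In a private aggregation protocol, the same sum would be computed over ciphertexts, so the aggregator never sees the individual entries of `local_updates`.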
By taking a closer look at the FL protocol, we observe that, even under a common security model such as honest-but-curious with collusions, and in contrast to the clients, who only receive the aggregated ML update, the aggregator has direct access to all the individual updates sent by the clients. This additional information accessible to the aggregator aggravates the potential privacy risks posed by inference attacks, especially if the aggregator is compromised by an adversary. Consequently, this situation underlines the importance of incorporating adequate private aggregation solutions into the FL pipeline. As surveyed in [5], a diverse range of technologies can be used for this purpose, although there is no definitive perfect candidate. One prominent example of those technologies is Homomorphic Encryption (HE) which, when extended to operate with multiple keys [6], [7], seems to cover precisely many of the requirements of private aggregation protocols.

Scope of our work: Recent research in the cryptographic community [8], [9] has shown that mainstream HE schemes paired with multiple keys can introduce some unexpected security vulnerabilities. In view of this privacy weakness, we focus here on how to correctly use modern HE schemes (BFV [10], [11] for exact arithmetic and CKKS [12] for approximate arithmetic) with multiple keys for the execution of private average aggregation. Our main interest lies in cross-silo FL settings, in which the ML model is built from stateful clients, who are always available and have enough computational power. Even so, we note that HE could also be adapted to work reasonably well in the cross-device setting if the clients, who are intermittently available, still have enough computational and memory resources for encryption/decryption.

A. Main contributions

We propose private aggregation methods based on the use of HE under multiple keys.
Our solutions follow the blueprint of [6], making use of additive secret sharing to deal with secret keys. In particular, our main contribution is threefold:
• We survey the state of the art in threshold HE and its application for implementing efficient private aggregation protocols. In particular, we explore in detail the limitations of the commonly used IND-CPA security model in characterizing the privacy vulnerabilities present in threshold HE when applied to FL protocols. We also discuss the main countermeasures to securely instantiate these schemes under the IND-CPA$^D$ security model [13].
• Contrary to what was previously established in [8], we show how, for the federated average rule, the CKKS scheme can still outperform BFV in terms of cipher expansion when a high smudging noise is used to guarantee privacy. We study and compare the range of parameters for which one scheme behaves better than the other.
• We briefly discuss how, for the FL setting, the approximate nature of CKKS not only does not introduce additional cryptographic weaknesses as stated in [8], but can also bring some extra privacy protection benefits by naturally adding noise to the aggregated result.

B. Notation and structure

Polynomials are denoted with regular lowercase letters, omitting the polynomial variable (e.g., $a$ instead of $a(x)$) whenever there is no ambiguity. We also represent polynomials as column vectors of their coefficients, $\mathbf{a}$. Let $R_q = \mathbb{Z}_q[x]/(1 + x^n)$, with $q \in \mathbb{N}$ and $n$ a power of two, be the polynomial ring in the variable $x$ reduced modulo $1 + x^n$ with coefficients belonging to $\mathbb{Z}_q$, using the set of representatives $(-q/2, q/2]$. Let $\chi$ be the error distribution over $R_q$, whose coefficients are independently sampled from a discrete Gaussian with standard deviation $\sigma$ and truncated support over $[-B, B]$.
Then, for a finite set $S$, $x \leftarrow S$ denotes sampling $x$ uniformly at random from $S$, while $x \leftarrow \chi$ denotes sampling from the distribution $\chi$. Finally, both $\|a\|$ and $\|\mathbf{a}\|$ refer to the infinity norm of $a$.

The rest of the document is organized as follows: Section II briefly reviews the state of the art of HE and its application for private aggregation. We pay special attention to the privacy vulnerabilities present in threshold HE when the adversary has access to a restricted decryption oracle. Section III details the baseline BFV and CKKS schemes. It also introduces threshold HE as a natural extension dealing with multiple keys inside the FL setting. We focus on describing the use of the smudging noise as the main countermeasure against the attacks discussed in [8], [9]. We study its effects and we provide analytical comparisons among both threshold BFV and threshold CKKS. We also discuss possible advantages that the approximate nature of CKKS can bring into the FL setting. Finally, Section IV concludes with some future work lines.

II. PRELIMINARIES: STATE OF THE ART FOR HE

Even though the application of HE for private aggregation has already been demonstrated in several works [14]–[17], the use of more modern lattice-based HE schemes, such as BFV, BGV, and CKKS [18], is more recent. Specifically, these schemes are based on the hardness of the Ring Learning with Errors (RLWE) problem and offer significant efficiency advantages by naturally supporting SIMD (Single-Instruction Multiple-Data) operations, which allow packing several values together under the same ciphertext [19]. These features have motivated research into the combination of RLWE-based HE with other complementary privacy-related tools.
For instance, some works explore their integration with verification techniques to address malicious servers [20], [21], and with Differential Privacy (DP) to better manage a larger number of colluding dishonest clients in FL [22], among others.

A. Limitations for FL deployment with HE

An advantage of HE-based aggregations is that they naturally align with the communication flow of the described FL protocol between clients and the aggregator. First, the clients encrypt their updates; then, the aggregator homomorphically computes the aggregation function $f_\Sigma(w_1, \ldots, w_L)$; and finally, the clients decrypt the aggregated update. However, these works assume a very simple deployment scenario in which all clients share the same secret and public keys, implicitly imposing a strong non-collusion assumption between the aggregator and those parties with access to the secret key.

HE and multiple keys: To overcome this limitation, we can upgrade HE schemes to handle ciphertexts under multiple keys [7]. Several works propose methods to homomorphically aggregate ciphertexts encrypted under different keys [23], [24], but the most versatile option corresponds to threshold variants of single-key HE, such as the MHE (Multiparty HE) scheme proposed in [6]. In this scheme, the secret key associated with the public key of a single-key HE scheme is divided into several additive shares, which are distributed among the clients. The secret shares are generated during a setup phase that is run offline before starting the FL protocol. Note that MHE enables encryption of the local updates under a common secret key that is unknown to any single party, requiring decryption to be done collaboratively by all clients. In return for the need for a setup phase, threshold HE schemes scale very well with the number of clients and allow for easy reuse of results from single-key HE.
For example, in [23], [24], efficient homomorphic aggregations are limited to average aggregation. In contrast, with threshold schemes, we could utilize the results of [25], where the authors studied the homomorphic evaluation of several Byzantine-robust aggregation rules (e.g., the median). Unfortunately, in [8], [9], the authors demonstrate how threshold HE schemes are particularly vulnerable when there is a non-negligible probability of incorrect decryption. Both works exemplify how this vulnerability can be exploited to implement effective key-recovery attacks against several mainstream HE libraries.

B. Security models for HE: From IND-CPA to IND-CPA$^D$

Previous works [6] using threshold HE discussed how these schemes could be proven secure under the IND-CPA (Indistinguishability under Chosen-Plaintext Attack) security model by relying on the difficulty of breaking the RLWE problem. Under this passive security model, the adversary only has access to encryptions by means of an encryption oracle. Note that this security model is sufficient from the perspective of the aggregator, who only sees RLWE ciphertexts and the public key. However, the absence of a decryption oracle does not fully capture the peculiarities of the aggregation functionality, especially from the perspective of the clients, who can collaboratively decrypt ciphertexts without knowing the secret key. The consequences of having a decryption oracle for HE schemes are precisely an issue that has been explored since the introduction of the IND-CPA$^D$ security model in [13], [26].

III. AN ANALYSIS OF PRIVATE AGGREGATION WITH HE

This section surveys the available algorithms to homomorphically evaluate $f_\Sigma(\cdot)$ in an FL protocol. We start by giving a brief overview of how HE fits inside the FL protocol. Next, we introduce additive HE and describe how a threshold variant can be built on top of the single-key counterparts.
We then focus on the IND-CPA$^D$ security model and the use of smudging noise as a countermeasure against the key-recovery attacks proposed in [8], [9]. Finally, we analyze the performance degradation caused by the large smudging noise, and provide analytical comparisons among different solutions for private aggregation based on threshold BFV and threshold CKKS. We indicate for which protocol parameters one scheme outperforms the other.

General Overview: The incorporation of HE to securely compute the aggregation function inside the FL protocol obeys the following two principles: (1) unprotected local data is isolated in different silos and remains on the clients' premises, and (2) all local updates leaving a silo are previously encrypted with HE. This work follows the blueprint of the threshold Multiparty Homomorphic Encryption (MHE) scheme proposed in [6] (see Section III-B for further details). Figure 2 sketches the process followed to train a model. Note that steps 2-5 are typically repeated several times until convergence is achieved:
1) Setup step: All involved clients generate individual secret keys and a collective public key. An initial common ML model is agreed among the Clients and the Aggregator.
2) Local training step: Each Client locally trains an ML update with its own data.
3) Input step: Each Client encrypts its local ML update under the collective public key before outsourcing it.
4) Evaluation step: The Aggregator is in charge of homomorphically computing the functionality $f_\Sigma$.
5) Output step: Finally, all involved parties collaboratively decrypt the aggregated ML update.

Fig. 2: Protocol for private average aggregation.

A. Background: Single-Key HE

As our interest in this work lies in the computation of average aggregations $w^* = f_\Sigma(w_1, \ldots, w_L)$, we restrict ourselves to additive HE schemes. For ease of exposition, we focus on the case of addition, $f_\Sigma(w_1, \ldots, w_L) = \sum_i w_i$.
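Restricting the functionality to a sum is enough to obtain an average, because the division by $L$ can be applied after decryption. The Python sketch below illustrates this on plaintext residues only: real-valued updates are mapped to $\mathbb{Z}_t$ with a fixed-point scale, added modulo $t$, and decoded through the centered representatives $(-t/2, t/2]$. The helper names, the fixed-point encoding and the parameters `scale` and `t` are assumptions made for illustration; the paper does not prescribe a particular encoding, and no encryption is performed here.

```python
def encode(x, scale, t):
    """Map a real number to Z_t with a fixed-point scale (assumed encoding)."""
    return round(x * scale) % t


def decode(v, scale, t):
    """Map a residue back to a real number via the representatives (-t/2, t/2]."""
    centered = v - t if v > t // 2 else v
    return centered / scale


def average_via_modular_sum(updates, scale, t):
    """Average scalars by summing their encodings modulo t, then dividing by L.

    The modular addition stands in for the homomorphic Add; the result is
    only meaningful if the scaled sum fits inside (-t/2, t/2].
    """
    total = 0
    for w in updates:
        total = (total + encode(w, scale, t)) % t
    return decode(total, scale, t) / len(updates)


if __name__ == "__main__":
    print(average_via_modular_sum([0.5, -1.25, 2.0], scale=1 << 10, t=1 << 20))
```

If the scaled sum leaves $(-t/2, t/2]$, the decoded value wraps around, which is why the plaintext modulus has to be sized for the aggregated result rather than for a single update.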
Definition 1 (Additive Homomorphic Encryption). Let $E = \{\mathrm{Setup}, \mathrm{SecKeyGen}, \mathrm{PubKeyGen}, \mathrm{Enc}, \mathrm{Dec}\}$ be an asymmetric encryption scheme, whose security is parameterized by $\lambda$. An additive homomorphic encryption scheme extends $E$ with the Add procedure, having $E_{\mathrm{ahe}} = \{E.\mathrm{Setup}, E.\mathrm{SecKeyGen}, E.\mathrm{PubKeyGen}, E.\mathrm{Enc}, E.\mathrm{Dec}, \mathrm{Add}\}$, where:
• Let $c_v$ and $c_w$ be two ciphertexts encrypting, respectively, $v$ and $w$. The procedure Add satisfies:

$E_{\mathrm{ahe}}.\mathrm{Dec}(E_{\mathrm{ahe}}.\mathrm{Add}(c_v, c_w)) = v + w.$  (1)

We have chosen two state-of-the-art RLWE-based cryptosystems to instantiate the $E_{\mathrm{ahe}}$ scheme, which is informally introduced in Definition 1. Specifically, they correspond to BFV [10], [11] and CKKS [12], which are, respectively, representatives of exact and approximate HE schemes. We refer the reader to Table I for more details on their primitives and parameters.

Both schemes allow us to perform SIMD-style additions by packing a total of $n$ values.$^{1}$ However, they still present some significant differences regarding correctness [27]. While for BFV we can choose adequate parameters which satisfy Eq. (1) for all the homomorphic additions needed to compute $f_\Sigma(\cdot)$, capturing the behavior of approximate HE requires relaxing the exact correctness of $E$. This can be done by considering $E_{\mathrm{ahe}}.\mathrm{Dec}(E_{\mathrm{ahe}}.\mathrm{Add}(c_v, c_w)) \approx v + w$ instead. Now, given a certain error margin $\epsilon > 0$, we say that the CKKS scheme is approximately correct if the following holds:

$\| f(w_1, \ldots, w_L) - E_{\mathrm{ahe}}.\mathrm{Dec}(f(c_{w_1}, \ldots, c_{w_L})) \| < \epsilon.$

$^{1}$ For the more general case of SIMD-style additions and multiplications, we can pack in each separate ciphertext a total of, respectively, $n$ and $n/2$ values for BFV and CKKS.

TABLE I: Summary of BFV & CKKS HE Schemes

Parameters remarks
• BFV & CKKS: $R_q$ is the ciphertext ring.
• BFV: $\Delta = \lfloor q/t \rfloor$. $R_t$ is the plaintext ring, with $q > t$.
• CKKS: Given an arbitrary target margin error $\epsilon$ and $B$, we can set the scale $\Delta$.

Cryptographic primitives
$E_{\mathrm{ahe}}.\mathrm{Setup}() = pp$ | BFV: select $pp \leftarrow (t, n, q, \sigma, B, \Delta)$. CKKS: select $pp \leftarrow (\epsilon, n, q, \sigma, B, \Delta)$.
$E_{\mathrm{ahe}}.\mathrm{SecKeyGen}(1^\lambda) = sk$ | Given the security parameter $\lambda$, sample $s \leftarrow R_3$. Output $sk = s$.
$E_{\mathrm{ahe}}.\mathrm{PubKeyGen}(sk) = pk$ | Sample $p_1 \leftarrow R_q$ and $e \leftarrow \chi$. $pk = (p_0, p_1) = (-sk \cdot p_1 + e, p_1)$.
$E_{\mathrm{ahe}}.\mathrm{Enc}(pk, m) = ct$ | Sample $u \leftarrow R_3$ and $e_0, e_1 \leftarrow \chi$. $ct = (c_0, c_1) = (\Delta m + u \cdot p_0 + e_0, u \cdot p_1 + e_1)$.
$E_{\mathrm{ahe}}.\mathrm{Add}(ct, ct') = ct_{\mathrm{add}}$ | Given $ct = (c_0, c_1)$ and $ct' = (c'_0, c'_1)$, $ct_{\mathrm{add}} = (c_0 + c'_0, c_1 + c'_1)$.
$E_{\mathrm{ahe}}.\mathrm{Dec}(sk, ct) = m$ | BFV: $m = [\lfloor \frac{t}{q} [c_0 + c_1 \cdot s]_q \rceil]_t$. CKKS: $m \approx \frac{[c_0 + c_1 \cdot s]_q}{\Delta}$.

Decryption remarks
• BFV: Decryption correctness holds if $(2n + 1)B < \frac{q}{2t} - \frac{t}{2}$ (see Lemma 1), with $n > \delta_R$ being an upper bound for the expansion factor $\delta_R$ of the ring $R$ [28].
• CKKS: After decryption, we obtain $\tilde{m} = m + e_{ct} \approx m$, where $e_{ct}$ is the internal error of the ciphertext, with $\|e_{ct}\| < \frac{(2n+1) \cdot B}{\Delta}$.

Some details on the HE schemes: To ensure a fair comparison between BFV and CKKS, we have made some minor modifications to the original CKKS scheme described in [12]. These changes primarily involve sampling the randomness in both BFV and CKKS from the same distributions. Additionally, CKKS and BFV make use of different slot packing methods [12], [19], which produce some variations in the operations performed before and after encryption and decryption. However, because our interest in this work lies in homomorphic additions, for which SIMD-type operations hold without packing, we have removed this step in both schemes. Consequently, some of the procedures described in Table I become identical for both schemes.

B. A Protocol for Private Aggregation with Multiparty HE

TABLE II: Multiparty Private Aggregation Protocol

Public input: ideal aggregation functionality $f_\Sigma$ to be computed
Private input: $w_i$ for each $C_i \in \mathcal{C}$
Output for all clients in $\mathcal{C}$: $w^* = f_\Sigma(w_1, w_2, \ldots, w_L)$
Setup | All clients $C_i$ instantiate the multiparty homomorphic scheme $E_{\mathrm{mhe}}$: $sk_i = E_{\mathrm{mhe}}.\pi_{i,\mathrm{SecKeyGen}}(\lambda, \kappa)$ (*), $cpk = E_{\mathrm{mhe}}.\pi_{\mathrm{PubKeyGen}}(\kappa, sk_1, \ldots, sk_L)$
Input | Each $C_i$ encrypts its input $w_i$ and provides it to the Aggregator (+): $c_{w_i} = E_{\mathrm{mhe}}.\mathrm{Enc}(cpk, w_i)$
Evaluation | The Aggregator computes the encrypted output for the ideal functionality $f_\Sigma$ relying on $E_{\mathrm{mhe}}.\mathrm{Add}$: $c_{w^*} = f_\Sigma(c_{w_1}, c_{w_2}, \ldots, c_{w_L})$
Output | The parties in $\mathcal{C}$ execute the decryption protocol: $w^* = E_{\mathrm{mhe}}.\pi_{\mathrm{Dec}}(sk_1, \ldots, sk_L, c_{w^*})$
(*) $\kappa$ parameterizes the homomorphic capacity of the scheme $E_{\mathrm{ahe}}$.
(+) The Aggregator is in charge of performing the aggregation.

To securely evaluate the aggregation functionality $f_\Sigma(\cdot)$, we resort to the extension of the single-key HE schemes from Section III-A into their threshold variants. In particular, we make use of the Multiparty Homomorphic Encryption (MHE) scheme introduced in [6] as a solution to the following multiparty aggregation problem:

Definition 2 (Adapted from Def. 1 in [29] to our FL setting). Let $\mathcal{C} = \{C_1, C_2, \ldots, C_L\}$ be a set of $L$ clients, where each client $C_i$ holds an input $w_i$ (input and receiver parties). Let $f_\Sigma(w_1, w_2, \ldots, w_L) = w^*$ be the average aggregation function ($f_\Sigma$ is called ideal functionality) over the input parties. Let $\mathcal{A}$ be a static semi-honest adversary that can corrupt up to $L - 1$ clients in $\mathcal{C}$ and let $\mathcal{C}_\mathcal{A}$ be the subset of clients corrupted by $\mathcal{A}$.
Then, the secure multiparty aggregation problem consists in providing $\mathcal{C}$ with $w^*$, yet $\mathcal{A}$ must learn nothing more about $\{w_i\}_{C_i \notin \mathcal{C}_\mathcal{A}}$ than what can be deduced from the inputs $\{w_i\}_{C_i \in \mathcal{C}_\mathcal{A}}$ and the output $w^*$ it controls (this property is called input privacy).

Consequently, a solution to the secure multiparty aggregation problem consists of a protocol $\pi_{f_\Sigma}$ which realizes the ideal functionality $f_\Sigma$ while also preserving input privacy [30]. We can make use of the general MHE-based solution proposed in [6], which is proved in the Common Reference String (CRS) model$^{2}$ and also assumes that all parties are connected through authenticated channels. Our particularized protocol for private aggregation is described in Table II. It relies on the existence of the following augmented multiparty encryption scheme:

$^{2}$ It assumes that all parties have access to a common random string.

Definition 3 (Multiparty Additive HE). Let $E_{\mathrm{ahe}}$ be the asymmetric and additive homomorphic encryption scheme from Definition 1, whose security is parameterized by $\lambda$ (see Table I for two concrete examples with BFV and CKKS), and let $S = (S.\mathrm{Share}, S.\mathrm{Combine})$ be an $L$-party secret sharing scheme. The associated multiparty homomorphic encryption scheme ($E_{\mathrm{mhe}}$) is obtained by applying the secret-sharing scheme $S$ to $E_{\mathrm{ahe}}$'s secret key $sk$ (ideal secret key) and is defined as the tuple $E_{\mathrm{mhe}} = (E_{\mathrm{ahe}}.\mathrm{Enc}, E_{\mathrm{ahe}}.\mathrm{Dec}, E_{\mathrm{ahe}}.\mathrm{Add}, E_{\mathrm{ahe}}^{S})$, where $E_{\mathrm{ahe}}^{S} = (\pi_{\mathrm{SecKeyGen}}, \pi_{\mathrm{PubKeyGen}}, \pi_{\mathrm{Dec}})$ is a set of multiparty protocols executed among the clients in the set $\mathcal{C}$, and having the following private ideal functionalities for each client $C_i$:
• Ideal secret-key generation: $f_{i,\pi_{\mathrm{SecKeyGen}}}(\lambda) = S.\mathrm{Share}_i(E_{\mathrm{ahe}}.\mathrm{SecKeyGen}(\lambda)) = sk_i$.
• Collective public-key generation: $f_{\pi_{\mathrm{PubKeyGen}}}(sk_1, sk_2, \ldots, sk_L) = E_{\mathrm{ahe}}.\mathrm{PubKeyGen}(S.\mathrm{Combine}(sk_1, sk_2, \ldots, sk_L))$.
• Collective decryption: $f_{\pi_{\mathrm{Dec}}}(sk_1, sk_2, \ldots, sk_L, ct) = E_{\mathrm{ahe}}.\mathrm{Dec}(S.\mathrm{Combine}(sk_1, sk_2, \ldots, sk_L), ct)$.

Definition 3 particularizes Def. 2 from [29] to the case in which the encryption scheme $E_{\mathrm{ahe}}$ is additive homomorphic.

Concrete instantiations for Multiparty HE: The private ideal functionalities introduced in Definition 3 can be implemented by the clients with concrete protocols:
• $f_{i,\pi_{\mathrm{SecKeyGen}}}(\lambda)$: Each client $C_i$ independently runs the procedure $E_{\mathrm{ahe}}.\mathrm{SecKeyGen}(1^\lambda) = sk_i$.
• $f_{\pi_{\mathrm{PubKeyGen}}}(sk_1, sk_2, \ldots, sk_L)$: Given a common random polynomial $p_1$, each client $C_i \in \mathcal{C}$ samples $e_{\mathrm{pk},i} \leftarrow \chi$ and discloses to the other clients $p_{0,i} = -p_1 \cdot sk_i + e_{\mathrm{pk},i}$. The collective public key is computed as:

$cpk = \left( \sum_{C_i \in \mathcal{C}} p_{0,i},\; p_1 \right) = \left( -p_1 \underbrace{\sum_i sk_i}_{sk} + \sum_i e_{\mathrm{pk},i},\; p_1 \right).$

• $f_{\pi_{\mathrm{Dec}}}(sk_1, sk_2, \ldots, sk_L, ct)$: The corresponding collaborative decryption protocol can be divided into two phases.
  – Given a ciphertext $ct = (c_0, c_1)$ encrypted under the ideal secret key $sk$, each client computes its partial decryption of $ct$. Each client $C_i$ samples $e_{\mathrm{smg},i} \leftarrow \chi$ and discloses: $h_i = sk_i \cdot c_1 + e_{\mathrm{smg},i}$.
  – Given all the decryption shares $h_i$ from the clients, the protocol outputs:

$d = \left[ c_0 + \sum_{C_i \in \mathcal{C}} h_i \right]_q = \Delta m + \underbrace{e \cdot u + e_0 + s \cdot e_1}_{e_{ct}} + \underbrace{\sum_i e_{\mathrm{smg},i}}_{e_{\mathrm{smg}}},$

where $e = \sum_i e_{\mathrm{pk},i}$ is the error of the collective public key, $s = sk$, and $u$ is the encryption randomness from Table I.

The final step for decryption depends on whether we use BFV or CKKS as the baseline single-key HE scheme:
• For BFV, the obtained decryption is $m = [\lfloor \frac{t}{q} d \rceil]_t$.
• For CKKS, the obtained decryption is $m + \frac{e_{ct} + e_{\mathrm{smg}}}{\Delta} = \frac{d}{\Delta}$.
We can obtain lower bounds for $q$, similar to the single-key counterparts, by taking into account the larger noise terms present in the threshold versions (see Section III-D).

C. Multiparty HE with Countermeasures for IND-CPA$^D$

The new notion IND-CPA$^D$ extends IND-CPA by allowing the adversary to have access to a very restricted decryption oracle, which can only be used for genuine ciphertexts, or for ciphertexts obtained from genuine ciphertexts through valid homomorphic operations. By adapting the definition to our specific case with the aggregation function $f_\Sigma$, the idea is that an adversary knowing $\{w_1, \ldots, w_L\}$ and $f_\Sigma$ can also obtain $f_\Sigma(w_1, \ldots, w_L)$. Then, if this adversary has access to $\{\mathrm{Enc}(w_1), \ldots, \mathrm{Enc}(w_L)\}$ and homomorphically evaluates the aggregation function as $f_\Sigma(\mathrm{Enc}(w_1), \ldots, \mathrm{Enc}(w_L))$, she should not gain more information from $\mathrm{Dec}(f_\Sigma(\mathrm{Enc}(w_1), \ldots, \mathrm{Enc}(w_L)))$ than what she can already obtain from $f_\Sigma(w_1, \ldots, w_L)$.

Unfortunately, in [13] the authors showed how some HE schemes, which are IND-CPA secure, leak information through the difference between these two results and, hence, are not IND-CPA$^D$ secure. An example is the case of approximate HE schemes like CKKS [12], for which the difference $\mathrm{Dec}(\mathrm{Enc}(w_i)) - w_i$ directly leaks the internal noise of the underlying RLWE sample, thus allowing the adversary to break the RLWE indistinguishability assumption.

Until very recently, the cryptographic community believed that only approximate HE presented this vulnerability and that exact HE schemes like BFV [10], [11], BGV [31] or CGGI [32] were invulnerable to this type of attack, hence being naturally safe under IND-CPA$^D$. However, a couple of new works [8], [9] have shown that this vulnerability could also be present in practice for exact schemes, as soon as they present a non-negligible probability of incorrect decryption. Both works exemplify how this vulnerability can be exploited to implement effective key-recovery attacks against several mainstream HE libraries.
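The leak described above for approximate schemes can be illustrated with a toy experiment. The Python sketch below follows the shape of the primitives in Table I over a tiny ring and assumes that the adversary learns the raw decryption value $d = [c_0 + c_1 \cdot s]_q$ with its noise included, which is what an unprotected approximate decryption exposes up to the scale $\Delta$. Since $c_1 \cdot s = d - c_0$ is then a linear system in the coefficients of $s$, Gaussian elimination over $\mathbb{Z}_q$ recovers the key. The parameters `N`, `Q`, `DELTA`, the uniform small-noise sampler and all function names are hypothetical choices for illustration; they are neither secure nor taken from [13] or from any HE library.

```python
import random

N = 8                  # toy ring degree (power of two), far too small to be secure
Q = 2_147_483_647      # toy prime modulus (2^31 - 1)
DELTA = 1 << 10        # toy scale


def polymul(a, b):
    """Multiply two polynomials in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:  # x^N = -1, so wrapped terms change sign
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out


def polyadd(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def small_poly(rng, bound):
    """Stand-in for the ternary and truncated-Gaussian samplers of Table I."""
    return [rng.randint(-bound, bound) for _ in range(N)]


def keygen(rng):
    s = small_poly(rng, 1)
    p1 = [rng.randrange(Q) for _ in range(N)]
    p0 = polyadd([-v for v in polymul(s, p1)], small_poly(rng, 3))
    return s, (p0, p1)


def encrypt(rng, pk, m):
    p0, p1 = pk
    u = small_poly(rng, 1)
    c0 = polyadd(polyadd([DELTA * v for v in m], polymul(u, p0)), small_poly(rng, 3))
    c1 = polyadd(polymul(u, p1), small_poly(rng, 3))
    return c0, c1


def raw_decrypt(s, ct):
    """Return d = c0 + c1*s = DELTA*m + e_ct (mod Q), with the noise included."""
    c0, c1 = ct
    return polyadd(c0, polymul(c1, s))


def recover_secret(ct, d):
    """Solve c1 * s = d - c0 over Z_Q with Gauss-Jordan elimination."""
    c0, c1 = ct
    # Column j of the matrix holds the coefficients of c1 * x^j.
    cols = [polymul(c1, [int(k == j) for k in range(N)]) for j in range(N)]
    rows = [[cols[j][i] for j in range(N)] + [(d[i] - c0[i]) % Q] for i in range(N)]
    for col in range(N):
        piv = next(r for r in range(col, N) if rows[r][col])
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], -1, Q)
        rows[col] = [v * inv % Q for v in rows[col]]
        for r in range(N):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % Q for a, b in zip(rows[r], rows[col])]
    # Map the solution back to centered representatives.
    return [v - Q if v > Q // 2 else v for v in (row[N] for row in rows)]


if __name__ == "__main__":
    rng = random.Random(2024)
    s, pk = keygen(rng)
    ct = encrypt(rng, pk, [rng.randrange(100) for _ in range(N)])
    print(recover_secret(ct, raw_decrypt(s, ct)) == s)
```

An exact scheme that rounds the noise away before releasing the result does not hand out $d$ directly, which is why attacks on exact schemes have to work through decryption failures instead. The sketch needs Python 3.8 or later for the modular inverse computed with `pow(x, -1, Q)`.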
In both works [8], [9], the authors illustrate how these types of attacks are especially relevant when dealing with threshold HE schemes. Fortunately, they also propose several countermeasures to achieve IND-CPA$^D$ security when using threshold HE. In particular, the addition during decryption of an adequate smudging noise with $\lambda$-independent variance can do the job. It is worth highlighting that [8] showcases how this countermeasure significantly impacts the cryptosystem's parameters, subsequently reducing its efficiency. According to the authors, this effect is even more pronounced for CKKS, for which using a large-variance smudging noise is likely to severely reduce the precision of the decrypted result.

1) Impact of the smudging noise: In the end, given a ciphertext $ct$, the key-recovery attacks mentioned above aim at extracting its noise component $e_{ct}$. If this extraction is feasible, the secret key can be recovered via linear algebra techniques. For approximate HE schemes, this process is relatively easy once we have access to the decryption. However, for exact HE schemes, these attacks attempt to force decryption failures as a mechanism to estimate the corresponding $e_{ct}$ component. Therefore, even in extreme cases where the post-processed decryption can become unusable, adding a large enough smudging noise after decryption is required to hide any information leakage regarding the original error $e_{ct}$.

In relation to this countermeasure, recent work [27] discusses how the application-agnostic nature of IND-CPA$^D$ often leads to impractically large parameters when adding a large smudging noise. This is due to the fact that $e_{\mathrm{smg}}$ must statistically hide $e_{ct}$ for all possible homomorphic circuits which satisfy correctness for the cryptosystem parameters initially chosen with the $E_{\mathrm{ahe}}.\mathrm{Setup}()$ procedure.
The authors propose a relaxation of this definition, termed application-aware IND-CPA$^D$, which is more suitable for the FL setting. In our work, we follow their guidelines and assume that both the Clients and the Aggregator behave semi-honestly, which here means that they are expected to follow exactly the prescribed instructions to compute $c_{w^*} = f_\Sigma(c_{w_1}, c_{w_2}, \ldots, c_{w_L})$ and encrypt $\{c_{w_1}, c_{w_2}, \ldots, c_{w_L}\}$ (see Tables I and II).

Collaborative decryption protocol: Given again $ct$, once all partial decryptions $h_i = sk_i \cdot c_1 + e_{\mathrm{smg},i}$ are gathered, the adversary has direct access to $d = \Delta \cdot m + e_{ct} + e_{\mathrm{smg}}$. Our objective is to obtain $e_{\mathrm{smg}}$ such that $e_{ct} + e_{\mathrm{smg}}$ is statistically indistinguishable from fresh noise $\tilde{e}_{\mathrm{smg}}$ (see the Smudging Lemma in [33]). To ensure IND-CPA$^D$ security, we adhere to the practical guidelines provided in [8], which recommend a Gaussian smudging noise with variance $\sigma^2_{\mathrm{smg}} = 2^\lambda \sigma^2_{ct}$. By applying an upper bound $B = O(\sigma)$,$^{3}$ this results in $B_{\mathrm{smg}} = 2^{\lambda/2} B_{ct}$ (e.g., $\lambda = 128$ is suggested in [6]).

D. Comparison analysis between BFV and CKKS

In [8], the authors discuss how the fact that CKKS is particularly vulnerable to the exposure of decryptions under IND-CPA$^D$ makes working with its threshold variant, by following the blueprint of [6] (see Section III-B), less useful compared to other exact schemes such as BFV. The main argument provided is that the high variance of the required smudging noise is likely to jeopardize the precision of the finally decrypted results. Our aim in this section is to elaborate more precisely on this comparison in the context of FL. In particular, we compare the bit precision achieved by the threshold variants of both BFV and CKKS for the homomorphic evaluation of the average aggregation rule.
Choosing multiparty cryptosystems' parameters: We can upper-bound the noise of a fresh single-key ciphertext by $(2n + 1)B$, relying on Lemma 1 (see Appendix). To extend this result to the multiparty HE schemes from Section III-B, we must consider the following: (1) The secret key and error for the $cpk$ are $L$ times larger, which implies that the noise of fresh ciphertexts is now upper-bounded by $B(2nL + 1)$. (2) During the homomorphic execution of the $f_\Sigma$ functionality, the underlying noise is further increased by a factor of $L$. (3) During the collective decryption of the resulting $ct$, every client adds to $e_{ct}$ a smudging noise satisfying $\|e_{\mathrm{smg},i}\| \leq B_{\mathrm{smg}}$. As a result, the final noise $e_{ct} + e_{\mathrm{smg}}$ present in the collective decryption of $ct$ is upper-bounded by $B^{\mathrm{MP}}_{ct} = B_{ct} + L B_{\mathrm{smg}} = (1 + L 2^{\lambda/2}) B_{ct}$, with $B_{ct} = LB(2nL + 1)$. As $q = O(B^{\mathrm{MP}}_{ct})$ (see Prop. 1) for both BFV and CKKS, we can see how the smudging noise significantly impacts ciphertext size and, subsequently, the efficiency of the private aggregation protocol.

More specifically, the remarks for decryption with single-key HE given in Table I can be adapted to their threshold variants by considering $B^{\mathrm{MP}}_{ct}$ instead of $B_{ct}$. This results in the expressions: (1) $B^{\mathrm{MP}}_{ct} < \frac{q}{2t} - \frac{t}{2}$ for Multiparty BFV (MBFV) and (2) $\Delta B_m + B^{\mathrm{MP}}_{ct} < q/2$, with $\|m\| < B_m$ and $m = \sum_{i=1}^{L} m_i$, for Multiparty CKKS (MCKKS).

$^{3}$ $B = 6\sigma$ is commonly used by HE libraries.

A comparison in terms of cipher expansion: From the previous expressions, we can obtain lower bounds for $q$ which, by following Prop. 1 (see Appendix), can be used to compare the cipher expansion of both schemes. In particular, MCKKS has a smaller $q$ than MBFV if:

$B^{\mathrm{MP}}_{ct} > \frac{2 \Delta_{\mathrm{CKKS}} B_m - t^2}{2(t - 1)},$  (2)

where we use $\Delta_{\mathrm{CKKS}}$ to make explicit that we refer to the $\Delta$ from CKKS in Table I. Note that $\Delta_{\mathrm{CKKS}}$ and $t$ do not play the same role in MCKKS and MBFV, respectively.
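The bounds above can be evaluated numerically. The following Python sketch computes $B_{ct}$ and $B^{\mathrm{MP}}_{ct}$ as stated in the text, derives the lower bounds on $q$ implied by the two decryption conditions ($q > 2t B^{\mathrm{MP}}_{ct} + t^2$ for MBFV and $q > 2(\Delta_{\mathrm{CKKS}} B_m + B^{\mathrm{MP}}_{ct})$ for MCKKS), and checks Eq. (2). The function names, the use of exact rational arithmetic, the restriction to even $\lambda$, and the example value chosen for $\Delta_{\mathrm{CKKS}}$ are assumptions made for illustration.

```python
import math
from fractions import Fraction


def multiparty_noise_bound(n, L, B, lam):
    """Return (B_ct, B_MP) with B_ct = L*B*(2nL+1) and B_MP = (1 + L*2^(lam/2))*B_ct."""
    if lam % 2:
        raise ValueError("this sketch assumes an even lambda so 2^(lam/2) is an integer")
    b_ct = L * Fraction(B) * (2 * n * L + 1)
    b_smg = 2 ** (lam // 2) * b_ct          # per-client smudging bound
    return b_ct, b_ct + L * b_smg


def min_q_mbfv(b_mp, t):
    """Bound from B_MP < q/(2t) - t/2, i.e. q > 2*t*B_MP + t^2."""
    return 2 * t * b_mp + t * t


def min_q_mckks(b_mp, delta, b_m):
    """Bound from Delta*B_m + B_MP < q/2, i.e. q > 2*(Delta*B_m + B_MP)."""
    return 2 * (delta * b_m + b_mp)


def mckks_has_smaller_q(b_mp, delta, b_m, t):
    """Evaluate the condition of Eq. (2)."""
    return b_mp > Fraction(2 * delta * b_m - t * t) / (2 * (t - 1))


if __name__ == "__main__":
    # n, L, B and lambda follow the values quoted for the figures; the
    # plaintext modulus and scale below are arbitrary illustrative picks.
    b_ct, b_mp = multiparty_noise_bound(n=8192, L=10, B=Fraction(96, 5), lam=128)
    t = 2 ** 45
    delta = b_mp * 2 ** 40
    print("log2 B_MP       :", math.log2(float(b_mp)))
    print("log2 q (MBFV)   :", math.log2(float(min_q_mbfv(b_mp, t))))
    print("log2 q (MCKKS)  :", math.log2(float(min_q_mckks(b_mp, delta, 1))))
    print("Eq. (2) holds   :", mckks_has_smaller_q(b_mp, delta, 1, t))
```

Exact fractions are used so that the comparison in Eq. (2) is not affected by floating-point rounding when both sides are close; only the printed logarithms are converted to floats.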
To ensure a fair comparison between both schemes, we express $\Delta_{CKKS}$ in terms of the error margin $\epsilon = B^{MP}_{ct} / \Delta_{CKKS}$ introduced in Section III-A. This allows us to compare the bit precision achieved by MCKKS, represented by $\log_2(\epsilon^{-1} B_m)$, directly with the bit precision $\log_2 t$ achieved by MBFV. For simplicity, we also assume that the input plaintexts used in MCKKS are normalized beforehand so that the aggregation satisfies $B_m < 1$. Substituting $\Delta_{CKKS} = \epsilon^{-1} B^{MP}_{ct}$ and bounding $B_m$ by 1 in Eq. (2) leads to the following inequality:

$$\frac{t^2}{2 B^{MP}_{ct}} + t - 1 > \epsilon^{-1}. \tag{3}$$

In Figures 3 and 4, we plot the expression from Eq. (3) on a logarithmic scale, using the bit precisions $\log_2 t$ and $\log_2 \epsilon^{-1}$. If a point $(\log_2 t, \log_2 \epsilon^{-1})$ belongs to the colored area, then the modulus $q$ required for MCKKS is smaller than that of MBFV. In Figure 3, we set the parameters to $L = 10$, $n = 8192$ and $B = 19.2$, while $\lambda$ ranges over the set $\{32, 64, 96, 128\}$. In Figure 4, we fix the parameters at $n = 8192$, $B = 19.2$ and $\lambda = 128$, while $L$ takes values in $\{8, 128\}$. These parameter values are practical enough for the cross-silo setting, and ensure a bit security of at least 128 for all the represented bit precisions.

[Fig. 3: Comparison of $q$ for MBFV and MCKKS varying $\lambda$. Panels: (a) $\lambda = 32$, (b) $\lambda = 64$, (c) $\lambda = 96$, (d) $\lambda = 128$. Horizontal axis: $\log_2(t)$; vertical axis: $\log_2(\epsilon^{-1})$.]

[Fig. 4: Comparison of $q$ for MBFV and MCKKS varying $L$. Panels: (a) $L = 8$, (b) $L = 128$. Horizontal axis: $\log_2(t)$; vertical axis: $\log_2(\epsilon^{-1})$.]

We observe that, while increasing $\lambda$ and $L$ up to 128 does reduce the range of parameters for which MCKKS outperforms MBFV (the effect is more significant when increasing $\lambda$), the approximate MCKKS performs comparably to MBFV and improves on it for very high bit precisions. Finally, a closer look at the figures shows that Eq. (3) approximately follows a piecewise linear function with two intervals:

• In the first interval, the addend $t - 1$ dominates the left-hand side of Eq. (3), allowing us to simplify the inequality to approximately $\log_2(t - 1) > \log_2 \epsilon^{-1}$. Here, both schemes behave similarly for the same bit precision.

• In the second interval, the addend $\frac{t^2}{2 B^{MP}_{ct}}$ dominates, simplifying the inequality to approximately $2 \log_2 t - \log_2 B^{MP}_{ct} - 1 > \log_2 \epsilon^{-1}$. In this case, the bit precision of MCKKS grows twice as fast as that of MBFV.

Runtime comparisons: The runtimes presented in this section were measured single-threaded on an Intel Core i7-10750H CPU @ 2.60GHz × 12 with 31.1 GB of RAM. We considered two aggregation protocols with bit precision $\log_2 t$ and $\log_2 \epsilon^{-1}$ for MBFV and MCKKS, respectively. Table III includes three different parameter sets, and Table IV presents their corresponding implementation runtimes in Lattigo [6].

TABLE III: Example parameter sets for the private aggregation protocol ($\sigma = 3.2$, $\lambda = 128$ and bit security $= 128$).

Param.                             Set 1 (BFV & CKKS)   Set 2 (BFV)    Set 3 (CKKS)
{n, L}                             {16384, 16}          {16384, 32}    {16384, 32}
{t, \epsilon^{-1}} [bits]          {45, 45}             {60, -}        {-, 60}
{# Limbs, q [bits]}                {4, 240}             {10, 300}      {9, 270}
{q_BFV, q_CKKS} [bits]             {232, 238}           {280, -}       {-, 259}

TABLE IV: Implementation runtimes for the private aggregation protocol ($N_{ModelParameters} = 1638400$).

Protocol step     Par. set 1          Par. set 2          Par. set 3
Col. Key Gen.     Client: 4.4 ms      Client: 10.3 ms     Client: 10.1 ms
Encryption        Client: 1.1 s       Client: 2.3 s       Client: 2.1 s
Aggregation       Agg: 244.3 ms       Agg: 1.2 s          Agg: 1.1 s
Col. Dec.         Client: 2.5 s       Client: 5.2 s       Client: 4.8 s
Total runtime     3.9 s               8.7 s               8 s

Further clarifications on approximate HE: There are a few points that we must clarify regarding our comparison. First, we have considered a multiplication by $t/q$ during decryption, following the BFV description as originally reported in [11]. The rounding error introduced when $t$ does not divide $q$ is precisely the source of the extra addend $\frac{t^2}{2 B^{MP}_{ct}}$ in Eq. (3). If we instead modify the decryption to compute a division by $\Delta$, Eq. (3) would resemble the left interval of Figures 3 and 4. Consequently, the bit precision $\log_2 \epsilon^{-1}$ of MCKKS grows similarly to $\log_2 t$ for this modified MBFV.

Still, CKKS presents a significant advantage when used for private aggregation because it naturally introduces noise into the decrypted result. In contrast, exact HE protects the privacy of clients' inputs during homomorphic computation, but does not provide any additional privacy guarantees on the decrypted aggregated models against inference attacks [5]. One possibility is to incorporate DP along with the exact HE computation [22]. This can be done by homomorphically adding noise to the encryptions, which requires reserving part of the $\log_2 t$ available bits. By contrast, following our previous analysis, the bit precision $\log_2 \epsilon^{-1}$ considered in CKKS does not change, as the effect of the noise is already accounted for in its definition.

The question of whether the noise present in CKKS could be reused to provide theoretical DP guarantees has been very recently explored in [34]. The authors are able to provide a certain privacy budget for the harder case in which the homomorphic evaluation makes the internal noise dependent on the input messages. Note that, for the average aggregation rule used in this work, the situation is simpler, as the noise added with CKKS is independent of the input local updates.

IV. CONCLUSIONS AND FUTURE WORK

This work surveys the use of threshold variants of exact and approximate RLWE-based HE for the efficient implementation of private average aggregation. While these types of schemes seem to be a perfect fit for executing the aggregation primitive in FL protocols, recent work has demonstrated the existence of some unexpected security vulnerabilities under the IND-CPA$^D$ security model if these HE schemes are not correctly instantiated. We detail the use of smudging noise with a large variance as the main defense proposed by recent works and discuss its impact on performance. We provide a detailed analysis comparing the effective bit precision that can be achieved when applying this defense to two concrete threshold HE schemes, namely BFV and CKKS. Our analysis indicates that CKKS-based average aggregations compare well with BFV-based solutions. Moreover, we discuss how the approximate nature of CKKS can provide additional privacy guarantees for the decrypted aggregated models against inference attacks.

As future work, we intend to explore the use of more complex aggregation rules, optimize the variance required for the smudging noise by taking into account the peculiarities of the FL scenario compared to the more general case of IND-CPA$^D$, and provide a more exhaustive performance comparison between both schemes by deploying them within a practical FL use case.

ACKNOWLEDGMENT

GPSC is partially supported by the European Union's Horizon Europe Framework Programme for Research and Innovation Action under project TRUMPET (proj. no. 101070038), by FEDER and Xunta de Galicia under project “Grupos de Referencia Competitiva” (ED431C 2021/47), by FEDER and MCIN/AEI under project FELDSPAR (TED2021-130624B-C21), and by the “NextGenerationEU/PRTR” under TRUFFLES and a Margarita Salas grant of the Universidade de Vigo.

REFERENCES

[1] B. McMahan et al., “Communication-efficient learning of deep networks from decentralized data,” in AISTATS, ser. PMLR, vol. 54. PMLR, 2017, pp. 1273–1282.
[2] P. Kairouz et al., “Advances and open problems in federated learning,” Found. Trends Mach. Learn., vol. 14, no. 1-2, pp. 1–210, 2021.
[3] M. Nasr et al., “Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning,” in SP. IEEE, 2019, pp. 739–753.
[4] K. Leino and M. Fredrikson, “Stolen memories: Leveraging model memorization for calibrated white-box membership inference,” in USENIX. USENIX Association, 2020, pp. 1605–1622.
[5] M. Mansouri et al., “Sok: Secure aggregation based on cryptographic schemes for federated learning,” Proc. Priv. Enhancing Technol., vol. 2023, no. 1, pp. 140–157, 2023.
[6] C. Mouchet et al., “Multiparty homomorphic encryption from ring-learning-with-errors,” Proc. Priv. Enhancing Technol., vol. 2021, no. 4, pp. 291–311, 2021.
[7] A. Aloufi et al., “Computing blindfolded on data homomorphically encrypted under multiple keys: A survey,” ACM Comput. Surv., vol. 54, no. 9, pp. 195:1–195:37, 2022.
[8] M. Checri et al., “On the practical CPAD security of ‘exact’ and threshold FHE schemes and libraries,” IACR Cryptol. ePrint Arch., p. 116, 2024.
[9] J. H. Cheon et al., “Attacks against the INDCPA-D security of exact FHE schemes,” IACR Cryptol. ePrint Arch., p. 127, 2024.
[10] Z. Brakerski, “Fully homomorphic encryption without modulus switching from classical gapsvp,” in CRYPTO, ser. LNCS, vol. 7417. Springer, 2012, pp. 868–886.
[11] J. Fan and F. Vercauteren, “Somewhat practical fully homomorphic encryption,” IACR Cryptol. ePrint Arch., p. 144, 2012.
[12] J. H. Cheon et al., “Homomorphic encryption for arithmetic of approximate numbers,” in ASIACRYPT, ser. LNCS, vol. 10624. Springer, 2017, pp. 409–437.
[13] B. Li and D. Micciancio, “On the security of homomorphic encryption on approximate numbers,” in EUROCRYPT, ser. LNCS, vol. 12696. Springer, 2021, pp. 648–677.
[14] E. Shi et al., “Privacy-preserving aggregation of time-series data,” in NDSS. The Internet Society, 2011.
[15] F. Benhamouda et al., “A new framework for privacy-preserving aggregation of time-series data,” ACM Trans. Inf. Syst. Secur., vol. 18, no. 3, pp. 10:1–10:21, 2016.
[16] C. Zhang et al., “Batchcrypt: Efficient homomorphic encryption for cross-silo federated learning,” in USENIX. USENIX Association, 2020, pp. 493–506.
[17] A. Madi et al., “A secure federated learning framework using homomorphic encryption and verifiable computing,” in RDAAPS, 2021, pp. 1–8.
[18] M. R. Albrecht et al., “Homomorphic encryption standard,” IACR Cryptol. ePrint Arch., p. 939, 2019.
[19] N. P. Smart and F. Vercauteren, “Fully homomorphic SIMD operations,” Des. Codes Cryptogr., vol. 71, no. 1, pp. 57–81, 2014.
[20] D. Fiore et al., “Efficiently verifiable computation on encrypted data,” in CCS. ACM, 2014, pp. 844–855.
[21] D. F. Aranha et al., “HELIOPOLIS: verifiable computation over homomorphically encrypted data from interactive oracle proofs is practical,” IACR Cryptol. ePrint Arch., p. 1949, 2023.
[22] A. G. Sébert et al., “Combining homomorphic encryption and differential privacy in federated learning,” in PST. IEEE, 2023, pp. 1–7.
[23] J. Ma et al., “Privacy-preserving federated learning based on multi-key homomorphic encryption,” Int. J. Intell. Syst., vol. 37, no. 9, pp. 5880–5901, 2022.
[24] A. Pedrouzo-Ulloa et al., “Practical multi-key homomorphic encryption for more flexible and efficient secure federated average aggregation,” in IEEE CSR. IEEE, 2023, pp. 612–617.
[25] A. Choffrut et al., “Practical homomorphic aggregation for byzantine ML,” CoRR, vol. abs/2309.05395, 2023.
[26] B. Li et al., “Securing approximate homomorphic encryption using differential privacy,” in CRYPTO, ser. LNCS, vol. 13507. Springer, 2022, pp. 560–589.
[27] A. Alexandru et al., “Application-aware approximate homomorphic encryption: Configuring FHE for practical use,” IACR Cryptol. ePrint Arch., p. 203, 2024.
[28] L. de Castro et al., “Fast vector oblivious linear evaluation from ring learning with errors,” in WAHC@CCS, 2021, pp. 29–41.
[29] C. Mouchet et al., “Computing across trust boundaries using distributed homomorphic cryptography,” IACR Cryptol. ePrint Arch., p. 961, 2019.
[30] Y. Lindell, “How to Simulate It - A Tutorial on the Simulation Proof Technique,” in Tutorials on the Foundations of Cryptography. Springer, 2017, pp. 277–346.
[31] Z. Brakerski et al., “(leveled) fully homomorphic encryption without bootstrapping,” ACM Trans. Comput. Theory, vol. 6, no. 3, pp. 13:1–13:36, 2014.
[32] I. Chillotti et al., “Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds,” in ASIACRYPT, ser. LNCS, vol. 10031, 2016, pp. 3–33.
[33] G. Asharov et al., “Multiparty computation with low communication, computation and interaction via threshold fhe,” in EUROCRYPT. Springer, 2012, pp. 483–501.
[34] T. Ogilvie, “Differential privacy for free? harnessing the noise in approximate homomorphic encryption,” in CT-RSA, ser. LNCS, vol. 14643. Springer, 2024, pp. 292–315.

APPENDIX

Lemma 1. We follow the notation for BFV indicated in Table I, and assume that any $e \leftarrow \chi$ satisfies $\|e\| \le B$. For a fresh ciphertext $ct = (c_0, c_1)$, we have $[c_0 + c_1 \cdot s]_q = \Delta m + e_{ct}$ with $\|e_{ct}\| \le (2n + 1)B$. This implies that whenever $(2n + 1)B < \frac{q}{2t} - \frac{t}{2}$, decryption works correctly.

Proof. We start by computing

$$[c_0 + c_1 \cdot s]_q = \Delta m + \underbrace{u \cdot e + e_0 + e_1 \cdot s}_{e_{ct}},$$

from which we can directly upper-bound the error polynomial $e_{ct}$ as:

\begin{align}
\|e_{ct}\| &= \|u \cdot e + e_0 + e_1 \cdot s\| \tag{4} \\
&\le \|u \cdot e\| + \|e_0\| + \|e_1 \cdot s\| \tag{5} \\
&\le \delta_R B + B + \delta_R B \tag{6} \\
&\le nB + B + nB \tag{7} \\
&= (2n + 1)B. \tag{8}
\end{align}

Now, the decryption process requires multiplying by $\frac{t}{q}$ and applying a final coefficient-wise rounding:

$$\left\lfloor \frac{t}{q} [c_0 + c_1 \cdot s]_q \right\rceil = \left\lfloor \frac{t}{q} (\Delta m + e_{ct}) \right\rceil,$$

which must output $m$ for correct decryption. If we define $\Delta = \frac{q}{t} - r$ with $0 \le r < 1$, the condition for correct decryption can be expressed equivalently as $\left\| \frac{t}{q} (-r m + e_{ct}) \right\| < \frac{1}{2}$. By upper-bounding the expression on the left, we obtain:

$$\left\| \frac{t}{q} (-r m + e_{ct}) \right\| \le \frac{t}{q} \left( \|-r m\| + \|e_{ct}\| \right) < \frac{t}{q} \left( \frac{t}{2} + \|e_{ct}\| \right) < \frac{1}{2}, \tag{9}$$

where the last inequality is the condition that must be enforced. Therefore, combining the expressions (8) and (9), we see that decryption correctness holds if $(2n + 1)B < \frac{q}{2t} - \frac{t}{2}$.

Proposition 1. Following the descriptions of the primitives from Tables I and II, the ciphertext modulus $q$ for MBFV is larger than the $q$ for MCKKS if the following condition is satisfied:

$$B^{MP}_{ct} > \frac{2\Delta B_m - t^2}{2(t - 1)}.$$

In Section III-D, we use this expression as Eq. (2) to compare the cipher expansion of both schemes.

Proof. For decryption correctness, both schemes must satisfy:

• $q_{MBFV}$: $B^{MP}_{ct} < \frac{q}{2t} - \frac{t}{2}$.

• $q_{MCKKS}$: $\Delta B_m + B^{MP}_{ct} < \frac{q}{2}$.

Comparing both expressions, we have that $q_{BFV} > q_{CKKS}$ if:

\begin{align*}
2t B^{MP}_{ct} + t^2 &> 2\Delta B_m + 2 B^{MP}_{ct} \\
2(t - 1) B^{MP}_{ct} &> 2\Delta B_m - t^2 \\
B^{MP}_{ct} &> \frac{2\Delta B_m - t^2}{2(t - 1)}.
\end{align*}
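As a numerical sanity check of Proposition 1, the sketch below evaluates its condition after the substitution $\Delta = \epsilon^{-1} B^{MP}_{ct}$, $B_m = 1$ used in Section III-D, and compares it with Eq. (3) and with the two linear regimes described there. The helper names and the evaluation points are illustrative assumptions; the parameters are those reported for Figure 3 with $\lambda = 128$.

```python
import math

# Parameters reported for Fig. 3 with lambda = 128 (L = 10, n = 8192, B = 19.2).
L, N, B, LAM = 10, 8192, 19.2, 128
B_CT = L * B * (2 * N * L + 1)
B_MP = (1 + L * 2.0 ** (LAM / 2)) * B_CT


def prop1_holds(t, eps_inv, b_mp=B_MP, b_m=1.0):
    """Condition of Proposition 1 with Delta = eps_inv * b_mp (hypothetical helper)."""
    delta = eps_inv * b_mp
    return b_mp > (2 * delta * b_m - t * t) / (2 * (t - 1))


def eq3_holds(t, eps_inv, b_mp=B_MP):
    """Eq. (3): t^2 / (2 * B_MP) + t - 1 > eps^{-1}."""
    return t * t / (2 * b_mp) + t - 1 > eps_inv


def eq3_lhs_bits(log2_t, b_mp=B_MP):
    """log2 of the left-hand side of Eq. (3)."""
    t = 2.0 ** log2_t
    return math.log2(t * t / (2 * b_mp) + t - 1)


def piecewise_bits(log2_t, b_mp=B_MP):
    """Maximum of the two linear regimes: log2(t - 1) and 2*log2(t) - log2(B_MP) - 1."""
    t = 2.0 ** log2_t
    return max(math.log2(t - 1), 2 * log2_t - math.log2(b_mp) - 1)


# Print the exact boundary and its piecewise-linear approximation, in bits.
for log2_t in (16, 64, 96, 128):
    print(log2_t, round(eq3_lhs_bits(log2_t), 2), round(piecewise_bits(log2_t), 2))
```

Since $\log_2(a + b) \le \max(\log_2 a, \log_2 b) + 1$ for positive $a$ and $b$, the piecewise-linear approximation stays within one bit of the exact boundary of Eq. (3).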