On Secure Distributed Data Storage Under Repair Dynamics
We address the problem of securing distributed storage systems against passive eavesdroppers that can observe a limited number of storage nodes. An important aspect of these systems is node failures over time, which demand a repair mechanism aimed at…
Authors: Sameer Pawar, Salim El Rouayheb, Kannan Ramch
Data storage devices have evolved significantly since the days of punched cards. Nevertheless, storage devices, such as hard disks or flash drives, are still bound to fail after long periods of usage, risking the loss of valuable data. To solve this problem and to increase the reliability of the stored data, multiple storage nodes can be networked together to redundantly store the data, thus forming a distributed data storage system. Applications of such systems are innumerable and include large data centers and peer-to-peer storage systems, such as OceanStore [1], that use a large number of nodes spread widely across the Internet to store files.
Codes for protecting data from erasures have been well studied in classical channel coding theory, and can be used here to increase the reliability of distributed storage systems. Fig. 1 illustrates an example where a maximum distance separable (MDS) code is used to store a file $F$ of 4 symbols, $(a_1, a_2, b_1, b_2) \in \mathbb{F}_5^4$, distributively on 4 nodes, $v_1, \dots, v_4$, each of capacity 2 symbols. The MDS code implemented here ensures that any user, also called a data collector, connecting to any 2 storage nodes can obtain the whole file $F$. However, what distinguishes this scenario from its erasure channel counterpart is that when a storage node fails, it needs to be repaired or replaced by a new node in order to maintain a desired level of system reliability. A straightforward repair mechanism would be to add a new replacement node of capacity 2, and make it act as a data collector by connecting to 2 surviving nodes. The new node can then download the whole file (4 symbols) to construct the lost part of the data and store it. Another repair scheme that consumes less bandwidth is depicted in Fig. 1, where node $v_1$ fails and is replaced by node $v_5$. When node $v_5$ connects to 3 nodes instead of 2, it is possible to decrease the total repair bandwidth from 4 to 3 symbols. Note that $v_5$ does not need to store the exact data that was on $v_1$; the only required property is that the data stored on all the active nodes $v_2$, $v_3$, $v_4$ and $v_5$ form an MDS code.
This research was funded in part by an AFOSR grant (FA9550-09-1-0120), a DTRA grant (HDTRA1-09-1-0032), and an NSF grant (CCF-0830788).
The above important observations were the basis of the original work of [2] where the authors showed that there exists a fundamental tradeoff between the storage capacity of each node and the repair bandwidth. They also introduced and constructed "regenerating codes" as a new class of codes that generalize classical erasure codes and permit the operation of a distributed storage system at any point on the tradeoff curve. When a distributed data storage system is formed using nodes widely spread across the Internet, e.g., Internet based peer-to-peer systems, individual nodes may not be secure and can become susceptible to eavesdropping. This paper focuses on such scenarios where an eavesdropper can gain access to a certain number of the storage nodes. The compromised distributed storage system is always assumed to be dynamic with nodes continually failing and being repaired. Thus, the compromised nodes can belong to the original set of storage nodes that the system starts with, or even include some of the replacement nodes added to the system to repair it from failures. Under this setting, we are interested in determining how much data can still be stored in the system without revealing any information to any of the eavesdroppers.
To answer this question, we follow the approach of [2] and model the distributed storage system as a multicast network that uses network coding. Under this model, the eavesdropper is an intruder that can access a fixed number of the network nodes of her choice. This eavesdropper model is natural for distributed storage systems and comes in contrast with the wiretapper model studied in the network coding literature [3], [4], [5] where the intruder can observe network edges, instead of nodes. We derive a general upper bound on the secrecy capacity as a function of the node storage capacity and the repair bandwidth. Motivated by system considerations, we define an important operating regime, that we call the bandwidth-limited regime, where the repair bandwidth is constrained not to exceed a given upper bound, while no limitation is imposed on the storage capacity of the nodes. For this important operating regime, we show that our upper bound is tight and present capacity-achieving codes.
This paper is organized as follows. In Section II we describe the system and security model. We define the problem and give a summary of our results in Section III. In Section IV we illustrate two special cases of distributed storage systems that are instructive in understanding the general problem. In Section V, we derive an upper bound on the secrecy capacity, and in Section VI, we present a scheme that achieves this upper bound in the bandwidth-limited regime. We conclude in Section VII.
A distributed storage system (DSS) is a collection of storage nodes that includes a source node $s$, which has an incompressible data file $F$ of $R$ symbols, or units, each belonging to a finite field $\mathbb{F}$. The source node is connected to $n$ storage nodes $v_1, \dots, v_n$, each with a storage capacity of $\alpha$ symbols, which may be utilized to save coded parts of the file $F$. The storage nodes are individually unreliable and may fail over time. To guarantee a certain desired level of reliability, we assume that the DSS is required to always have $n$ active, i.e., non-failed, storage nodes that are in service. Therefore, when a storage node fails, it is immediately replaced by a new node with the same storage capacity $\alpha$. The DSS should be designed to allow any legitimate user, also called a data collector, that connects to any $k$ out of the $n$ active storage nodes available at any given time, to reconstruct the original file $F$. We term this condition the "reconstruction property" of distributed storage systems.
We assume that nodes fail one at a time, and we denote by $v_{n+i}$ the new replacement node added to the system to repair the $i$-th failure. The new replacement node then connects to some $d$ nodes, chosen randomly out of the remaining $n-1$ active nodes, and downloads $\gamma$ units from them in total, which corresponds to the repair bandwidth of the system. The repair degree $d$ is a system parameter satisfying $k \le d \le n-1$. In this work, we focus on the case of symmetrical repair where the new node downloads an equal amount of data, say $\beta$ units, from each of the $d$ nodes it connects to, i.e., $\gamma = d\beta$. The process of replenishing redundancy to maintain the reliability of a DSS is referred to as the "regeneration" or "repair" process. Note that a new replacement node may download more data than what it actually stores. Moreover, the stored data can differ from the data that was stored on the failed node, as long as the "reconstruction property" of the DSS is retained. A distributed storage system $\mathcal{D}$ is thus characterized as $\mathcal{D}(n, k)$. For instance, the DSS depicted in Fig. 1 corresponds to $\mathcal{D}(4, 2)$ operating at $(\alpha, \gamma) = (2, 3)$.
We adopt the flow graph model introduced in [2], which we describe here for completeness. In this model, the distributed storage system is represented by an information flow graph $G$. The graph $G$ is a directed acyclic graph with capacity-constrained edges that consists of three kinds of nodes: a single source node $s$, storage nodes, each split into an input storage node $x_{\text{in}}^{i}$ and an output storage node $x_{\text{out}}^{i}$, and data collectors $DC_j$, for $i, j \in \{1, 2, \dots\}$. The source node $s$ has an information $S$ of which a specific realization is the file $F$. Each storage node $v_i$ in the DSS is represented by two nodes $x_{\text{in}}^{i}$ and $x_{\text{out}}^{i}$ joined by a directed edge of capacity $\alpha$ (see Fig. 2), to account for the node storage constraint.
The repair process is initiated every time a failure occurs. As a result, the DSS, and consequently the flow graph, are dynamic and evolve with time. At any given time, each node in the graph is either active or inactive depending on whether it has failed or not. The graph $G$ starts with only the source node $s$ being active and connected to the storage input nodes $x_{\text{in}}^{1}, \dots, x_{\text{in}}^{n}$ by outgoing edges of infinite capacity. From this point onwards, the source node $s$ becomes and remains inactive and the $n$ input and output storage nodes become active. When a node $v_i$ fails in a DSS, the corresponding nodes $x_{\text{in}}^{i}$ and $x_{\text{out}}^{i}$ become inactive in $G$. If a replacement node $v_j$ joins the DSS in the process of repairing a failure and connects to $d$ active nodes $v_{i_1}, \dots, v_{i_d}$, the corresponding nodes $x_{\text{in}}^{j}$ and $x_{\text{out}}^{j}$, with the edge $(x_{\text{in}}^{j}, x_{\text{out}}^{j})$, are added to the flow graph $G$, and node $x_{\text{in}}^{j}$ is connected to the nodes $x_{\text{out}}^{i_1}, \dots, x_{\text{out}}^{i_d}$ by incoming edges of capacity $\beta$ each. A data collector is represented by a node connected to $k$ active storage output nodes through infinite-capacity links enabling it to reconstruct the file $F$. The graph $G$ constitutes a multicast network with the data collectors as destinations. An underlying assumption here is that the flow graph corresponding to a distributed storage system depends on the sequence of failed nodes. As an example, we depict in Fig. 2 the flow graph corresponding to the DSS $\mathcal{D}(4, 2)$ of Fig. 1, when node $v_1$ fails.
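The flow graph described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it builds the graph for the $\mathcal{D}(4, 2)$ example with $(\alpha, \beta) = (2, 1)$ after $v_1$ fails and $v_5$ connects to $v_2, v_3, v_4$, and computes the max flow from $s$ to each possible data collector. The node labels (`"in1"`, `"out1"`, `"dc"`) and the helper names `max_flow` and `flow_graph` are hypothetical, and the Edmonds-Karp routine is a standard textbook algorithm chosen here for brevity.

```python
from collections import deque
from itertools import combinations
from math import inf


def max_flow(cap, s, t):
    """Edmonds-Karp max flow; `cap` is a dict-of-dicts of residual capacities."""
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Walk back from t to s and push the bottleneck capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap.setdefault(v, {})
            cap[v][u] = cap[v].get(u, 0) + bottleneck
        flow += bottleneck


def flow_graph(n, alpha, beta, repairs, contacted):
    """Build the information flow graph (labels are illustrative).

    repairs:   list of (new_node, helper_nodes) in order of occurrence
    contacted: storage nodes whose output nodes feed the data collector
    """
    cap = {"s": {}, "dc": {}}
    for i in range(1, n + 1):
        cap["s"][f"in{i}"] = inf                  # source edge
        cap[f"in{i}"] = {f"out{i}": alpha}        # storage constraint
        cap[f"out{i}"] = {}
    for new, helpers in repairs:
        cap[f"in{new}"] = {f"out{new}": alpha}
        cap[f"out{new}"] = {}
        for h in helpers:
            cap[f"out{h}"][f"in{new}"] = beta     # repair download edge
    for i in contacted:
        cap[f"out{i}"]["dc"] = inf                # data collector edge
    return cap


repairs = [(5, (2, 3, 4))]        # v1 failed, v5 repaired from v2, v3, v4
active = (2, 3, 4, 5)
cuts = {
    pair: max_flow(flow_graph(4, 2, 1, repairs, pair), "s", "dc")
    for pair in combinations(active, 2)
}
print(cuts)
```

In this small instance every data collector sees a max flow of 4, which matches the 4-symbol file of the Fig. 1 example.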
We assume the presence of an intruder "Eve" in the DSS, who can observe up to $\ell < k$ nodes of her choice among all the storage nodes $v_1, v_2, \dots$, possibly at different time instances as the system evolves. In the flow graph model, Eve is an eavesdropper who can access a fixed number $\ell$ of nodes chosen from the storage input nodes $x_{\text{in}}^{1}, x_{\text{in}}^{2}, \dots$. Notice that while a data collector observes output storage nodes, i.e., the data stored on the nodes it connects to, Eve has access to input storage nodes, and thus can observe, in addition to the stored data, all incoming messages to these nodes. We also assume that Eve has complete knowledge of the storage and repair schemes implemented in the DSS. Thus, she can choose some of the $\ell$ nodes to be among the initial $n$ storage nodes, or, if she deems it more profitable, she can choose to wait for failures and eavesdrop on a replacement node by observing its downloaded data. Eve is assumed to be passive, and only observes the data without modifying it.
Let $S$ be a random vector uniformly distributed over $\mathbb{F}_q^{R}$, representing the incompressible data file at the source node with $H(S) = R$. Let $V_{\text{in}} := \{x_{\text{in}}^{1}, x_{\text{in}}^{2}, \dots\}$ and $V_{\text{out}} := \{x_{\text{out}}^{1}, x_{\text{out}}^{2}, \dots\}$ be the sets of input and output storage nodes in $G$ respectively. For a storage node $v_i$, let $D_i$ and $C_i$ be the random variables representing its downloaded messages and stored content respectively. Thus, $C_i$ represents the data that can be downloaded by a data collector when contacting node $v_i$, while $D_i$, with $H(D_i) \le \gamma$, represents the total data revealed to Eve when she accesses node $v_i$. The stored data $C_i$ is a function of the downloaded data $D_i$.
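Let $V_{\text{out}}^{a}$ be the collection of all subsets of $V_{\text{out}}$ of cardinality $k$ consisting of nodes that are simultaneously active at some instant in time. For any subset $B$ of $V_{\text{out}}$, define $C_B := \{C_i : x_{\text{out}}^{i} \in B\}$. Similarly, for any subset $E$ of $V_{\text{in}}$, define $D_E := \{D_i : x_{\text{in}}^{i} \in E\}$.
The reconstruction property, then, can be written as
$$H(S \mid C_B) = 0, \quad \forall B \in V_{\text{out}}^{a}, \tag{1}$$
and the perfect secrecy condition implies
$$H(S \mid D_E) = H(S), \quad \forall E \subset V_{\text{in}},\ |E| \le \ell. \tag{2}$$
Given a DSS $\mathcal{D}(n, k)$ with $\ell$ compromised nodes, its secrecy capacity, denoted by $C_s(\alpha, \gamma)$, is then defined to be the maximum amount of data that can be stored in this system such that the reconstruction property and the perfect secrecy condition are simultaneously satisfied for all possible data collectors and eavesdroppers, i.e.,
$$C_s(\alpha, \gamma) := \sup_{\substack{H(S \mid C_B) = 0\ \forall B \\ H(S \mid D_E) = H(S)\ \forall E}} H(S), \tag{3}$$
where $B \in V_{\text{out}}^{a}$, $E \subset V_{\text{in}}$ and $|E| \le \ell$.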
First, we give the following general upper bound on the secrecy capacity of a DSS:
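Theorem 1: [Upper Bound] For a distributed data storage system $\mathcal{D}(n, k)$, with a repair degree $d$ and $\ell < k$ compromised nodes, the secrecy capacity is upper bounded as
$$C_s(\alpha, \gamma) \le \sum_{i=\ell+1}^{k} \min\{(d - i + 1)\beta,\ \alpha\},$$
where $\gamma = d\beta$.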
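As a quick numerical illustration, the sketch below evaluates the upper bound under the assumption that it takes the form $\sum_{i=\ell+1}^{k} \min\{(d-i+1)\beta, \alpha\}$, which is the form consistent with the proof in Section V and with the $\mathcal{D}(4, 3)$ example discussed later in the paper. The function name `secrecy_upper_bound` is hypothetical.

```python
def secrecy_upper_bound(k, d, ell, alpha, beta):
    """Evaluate sum_{i=ell+1}^{k} min((d - i + 1) * beta, alpha).

    Assumed form of the Theorem 1 bound; requires ell < k <= d.
    """
    if not (0 <= ell < k <= d):
        raise ValueError("parameters must satisfy 0 <= ell < k <= d")
    return sum(min((d - i + 1) * beta, alpha) for i in range(ell + 1, k + 1))


# D(4, 3) with d = 3, beta = 1, alpha = 3 and ell = 2 compromised nodes.
bound_with_eve = secrecy_upper_bound(k=3, d=3, ell=2, alpha=3, beta=1)

# With no eavesdropper (ell = 0) the same sum gives the 6 symbols that
# this system stores in the random-coding example of the paper.
bound_without_eve = secrecy_upper_bound(k=3, d=3, ell=0, alpha=3, beta=1)

print(bound_with_eve, bound_without_eve)
```

The two values, 1 and 6, agree with the figures quoted for this system elsewhere in the text.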
Next, we consider an important operational regime, namely the bandwidth-limited regime, where the repair bandwidth $\gamma$ is constrained to a maximum amount $\Gamma$, i.e., $\gamma \le \Gamma$, while no constraint is imposed on the storage capacity $\alpha$ at each node. The secrecy capacity in this regime is defined as
$$C_s^{BL}(\Gamma) := \sup_{\gamma \le \Gamma,\ \alpha \ge 0} C_s(\alpha, \gamma).$$
For a fixed Γ, when the parameter d is a system design choice, the upper bound of Theorem 1 on the secrecy capacity can be further optimized, and attains a maximum for d = n -1.
In Section VI, we demonstrate that this upper bound can be achieved for $d = n-1$ in the bandwidth-limited regime, thus establishing the following theorem.
Theorem 2: [Bandwidth-Limited Regime] For a distributed data storage system $\mathcal{D}(n, k)$ with $\ell < k$ compromised nodes, the secrecy capacity in the bandwidth-limited regime, for $d = n-1$, is
$$C_s^{BL}(\Gamma) = \sum_{i=\ell+1}^{k} (n - i)\,\frac{\Gamma}{n-1},$$
and is achieved with a storage capacity of $\alpha = \Gamma$.
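The claim that the bound is maximized at $d = n-1$ can be checked numerically. The sketch below assumes the bound has the form $\sum_{i=\ell+1}^{k} \min\{(d-i+1)\beta, \alpha\}$ and that the bandwidth-limited capacity is $\sum_{i=\ell+1}^{k} (n-i)\,\Gamma/(n-1)$, as the achievability argument in Section VI indicates. The parameter values $(n, k, \ell, \Gamma) = (6, 4, 1, 5)$ and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction


def upper_bound(k, d, ell, alpha, beta):
    """Assumed Theorem 1 bound: sum of min((d - i + 1) * beta, alpha)."""
    return sum(min((d - i + 1) * beta, alpha) for i in range(ell + 1, k + 1))


def bandwidth_limited_capacity(n, k, ell, Gamma):
    """Assumed Theorem 2 expression, evaluated exactly with fractions."""
    return sum(n - i for i in range(ell + 1, k + 1)) * Fraction(Gamma) / (n - 1)


n, k, ell, Gamma = 6, 4, 1, 5

# Spend the whole repair budget (beta = Gamma / d) and leave storage
# unconstrained (alpha = Gamma is enough, since a node never needs to
# store more than it downloads).
by_degree = {
    d: upper_bound(k, d, ell, alpha=Gamma, beta=Fraction(Gamma, d))
    for d in range(k, n)
}
best_d = max(by_degree, key=by_degree.get)

print(by_degree)
print(best_d, bandwidth_limited_capacity(n, k, ell, Gamma))
```

For these parameters the bound grows from 15/2 at $d = 4$ to 9 at $d = 5 = n-1$, where it coincides with the closed-form expression.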
A static version of the problem studied here corresponds to a DSS with ideal storage nodes that do not fail. Hence there is no need for any repair in the system. The flow graph of this system is then the combination network studied in network coding theory (see, e.g., [6, Chap. 4]). Therefore, the static storage problem can be regarded as a special case of wiretap networks [3], [4], or equivalently, as the erasure-erasure wiretap channel-II studied in [7]. The secrecy capacity for such systems is $(k - \ell)\alpha$, and can be achieved using either nested MDS codes [7], or the coset codes of [8], [4].
Even though the above proposed solution is optimal for the static case, it can have a very poor secrecy performance when applied directly to dynamic storage systems with failures. For instance, a straightforward way to repair a failed node would be to download the whole file on the new replacement node, and then generate the specific lost data. In this case, if Eve accesses the new replacement node while it is downloading the whole file, she will be able to reconstruct the entire original data. Hence, the secrecy rate for this scheme would be zero. However, Theorem 2 suggests that for some systems we can achieve a positive secrecy capacity. This example highlights the fact that dynamical repair of the DSS renders it intrinsically different from the static counterpart, and one should be careful in designing the repair scheme in order to safeguard the whole stored data.
Using the flow graph model, the authors of [2] showed that random linear network codes over a large finite field can achieve any point $(\alpha, \gamma)$ on the optimal storage-repair bandwidth tradeoff curve with high probability. Consider an example of a random linear network code used in a compromised DSS $\mathcal{D}(4, 3)$, which stores $R = 6$ symbols and operates at $d = 3$, $\beta = 1$, and $\alpha = 3$. In this case, each of the initial nodes $v_1, \dots, v_4$ stores 3 independently generated random linear combinations of these $R = 6$ symbols. Assume now that node $v_4$ fails and is replaced by a new node $v_5$ that connects to $v_1$, $v_2$, and $v_3$, and downloads from each one of them $\beta = 1$ random linear combination of their stored data. Assume that after some time, node $v_5$ fails and is replaced by node $v_6$ in a similar fashion. Now, if $\ell = 2$, and Eve accesses nodes $v_5$ and $v_6$ while they are being repaired, she will observe 6 linear combinations of the original data symbols, which, with high probability, are linearly independent. Therefore, she will be able to reconstruct the whole file.
The above analysis shows that, when random network coding is used, it is not possible to achieve a positive secrecy rate for this system, even with pre-processing at the source, using for example Maximum Rank Distance (MRD) codes [5]. But according to Theorem 2, which we prove in Section VI, the secrecy capacity of the above DSS $\mathcal{D}(4, 3)$ is equal to one unit when $\ell = 2$. This is also in contrast with the case of multicast networks with compromised edges instead of nodes [3], wherein random network coding can perform as well as any deterministic secure code [5].
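The random-coding example of this section can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions, not the setup of [2]: the prime field size `P`, the fixed seed, and the helper names `rank_mod_p` and `random_combination` are arbitrary choices. Each stored or downloaded symbol is represented by its coefficient vector over the $R = 6$ file symbols, so the rank of Eve's observations measures how much of the file she can solve for.

```python
import random

P = 2**31 - 1  # a large prime, chosen only for illustration


def rank_mod_p(rows, p=P):
    """Rank of a matrix over the prime field F_p via Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)      # Fermat inverse
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def random_combination(vectors, rng, p=P):
    """One random linear combination of the given coefficient vectors."""
    coeffs = [rng.randrange(p) for _ in vectors]
    width = len(vectors[0])
    return [sum(c * v[j] for c, v in zip(coeffs, vectors)) % p for j in range(width)]


rng = random.Random(0)
R, alpha = 6, 3

# v1..v4 each store alpha random combinations of the R file symbols.
store = {
    i: [[rng.randrange(P) for _ in range(R)] for _ in range(alpha)]
    for i in range(1, 5)
}

# v4 fails: v5 downloads beta = 1 combination from each of v1, v2, v3.
download_v5 = [random_combination(store[i], rng) for i in (1, 2, 3)]
# v5 fails: v6 is repaired from v1, v2, v3 in the same way.
download_v6 = [random_combination(store[i], rng) for i in (1, 2, 3)]

# Eve (ell = 2) observes both repairs.
eve_rank = rank_mod_p(download_v5 + download_v6)
print(eve_rank)
```

A rank of 6 means Eve's six observations are linearly independent, so she can solve for all six file symbols, as argued above.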
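In this section we derive the upper bound of Theorem 1. Consider a DSS $\mathcal{D}(n, k)$ with $\ell < k$. Assume that the nodes $v_1, v_2, \dots, v_k$ have failed consecutively, and were replaced during the repair process by the nodes $v_{n+1}, v_{n+2}, \dots, v_{n+k}$ respectively, as shown in Fig. 3. Now suppose that Eve accesses the nodes in $E = \{v_{n+1}, v_{n+2}, \dots, v_{n+\ell}\}$ while they are being repaired, and consider a data collector connected to the nodes in $B = \{v_{n+1}, v_{n+2}, \dots, v_{n+k}\}$. The reconstruction property implies $H(S \mid C_B) = 0$ by Eq. (1), and the perfect secrecy condition implies $H(S \mid D_E) = H(S)$ by Eq. (2). We can therefore write
$$\begin{aligned}
H(S) &= H(S \mid D_E) - H(S \mid C_B) \\
&\overset{(1)}{\le} H(S \mid C_E) - H(S \mid C_B) \\
&\overset{(2)}{=} H(S \mid C_E) - H(S \mid C_E, C_{B \setminus E}) \\
&= I(S; C_{B \setminus E} \mid C_E) \\
&\le H(C_{B \setminus E} \mid C_E) \\
&= \sum_{i=\ell+1}^{k} H(C_{n+i} \mid C_{n+1}, \dots, C_{n+i-1}) \\
&\overset{(3)}{\le} \sum_{i=\ell+1}^{k} \min\{(d - i + 1)\beta,\ \alpha\}.
\end{aligned}$$
Inequality (1) follows from the fact that the stored data $C_E$ is a function of the downloaded data $D_E$; (2) follows from the definition $C_{B \setminus E} := \{C_{n+\ell+1}, \dots, C_{n+k}\}$; and (3) follows from the fact that each node can store at most $\alpha$ units, that each replacement node satisfies $H(C_i) \le H(D_i) \le d\beta$, and from the topology of the network (see Fig. 3): each node $x_{\text{in}}^{n+i}$ is connected to each of the nodes $x_{\text{out}}^{n+1}, \dots, x_{\text{out}}^{n+i-1}$ by an edge of capacity $\beta$, so it receives at most $(d - i + 1)\beta$ units from nodes outside this set. The upper bound of Theorem 1 then follows directly from the definition in Eq. (3).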
Consider again the DSS $\mathcal{D}(4, 3)$ with $\alpha = 3$, $d = 3$, $\beta = 1$, and $\ell = 2$ of Section IV-B, for which the secrecy rate using random linear network coding was shown to be 0. The upper bound on the secrecy capacity of this system given by Theorem 1 is 1. We provide a scheme that achieves this upper bound. The proposed code is depicted in Fig. 4 and consists of the concatenation of an MDS coset code [8] with a special repetition code that was introduced in [9] by Rashmi et al. for constructing exact regeneration codes. Let $S \in \mathbb{F}_q$ denote the information symbol to be securely stored on the system. $S$ is encoded using the outer MDS code into a codeword $(Z, K_1, K_2, \dots, K_5)$, where $K_1, \dots, K_5$ are independent random keys uniformly distributed over $\mathbb{F}_q$ and $Z = S + \sum_{i=1}^{5} K_i$. The encoded symbols $Z, K_1, \dots, K_5$ are then stored on the nodes $v_1, \dots, v_4$ as shown in Fig. 4, following the special repetition code of [9]. It is easy to verify that any data collector connecting to 3 nodes observes all the symbols $Z, K_1, \dots, K_5$, and can therefore decode $S = Z - \sum_{i=1}^{5} K_i$. However, an eavesdropper accessing any two nodes will only observe 5 symbols out of 6, and cannot gain any information about $S$. Next, we generalize this construction to obtain a capacity-achieving code for the bandwidth-limited regime.
[Fig. 4 caption (partially recovered): An MDS coset code takes the information symbol $S$ and five independent random keys $K_1, \dots, K_5$ as input and outputs a parity check symbol $Z = S + \sum_{i=1}^{5} K_i$, along with the random keys in systematic form. These symbols are then stored on the DSS using the code structure of [9].]
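The reconstruction and secrecy claims for this $\mathcal{D}(4, 3)$ scheme can be checked by brute force over a small field. The sketch below is illustrative only: the field size $q = 3$ and the particular assignment of $Z, K_1, \dots, K_5$ to the six node pairs are assumptions (the exact layout of Fig. 4 is not reproduced here), and the function names are hypothetical. Perfect secrecy is tested by comparing the distribution of Eve's view, over all key choices, for each value of $S$.

```python
from collections import Counter
from itertools import combinations, product

q, n = 3, 4  # small prime field and number of nodes (q is illustrative)

# Symbol j of the codeword (Z, K1, ..., K5) is stored on both endpoints
# of edge j of the complete graph on the n nodes (assumed assignment).
edges = list(combinations(range(1, n + 1), 2))


def encode(S, keys):
    """Outer coset code: Z = S + K1 + ... + K5 over F_q."""
    return [(S + sum(keys)) % q, *keys]


def observe(codeword, nodes):
    """Indices and values of the symbols stored on the given nodes."""
    seen = [j for j, e in enumerate(edges) if set(e) & set(nodes)]
    return seen, tuple(codeword[j] for j in seen)


def collector_decodes(nodes):
    """True if these nodes always reveal all six symbols and hence S."""
    for S in range(q):
        for keys in product(range(q), repeat=5):
            seen, values = observe(encode(S, keys), nodes)
            if len(seen) != 6 or (values[0] - sum(values[1:])) % q != S:
                return False
    return True


def eve_learns_nothing(nodes):
    """True if the distribution of Eve's view is the same for every S."""
    views = [
        Counter(
            observe(encode(S, keys), nodes)[1]
            for keys in product(range(q), repeat=5)
        )
        for S in range(q)
    ]
    return all(view == views[0] for view in views)


all_collectors_ok = all(
    collector_decodes(t) for t in combinations(range(1, n + 1), 3)
)
all_pairs_secret = all(
    eve_learns_nothing(p) for p in combinations(range(1, n + 1), 2)
)
print(all_collectors_ok, all_pairs_secret)
```

Both checks pass for every choice of three (respectively two) nodes, whereas an observer of three nodes would fail the secrecy check, as expected.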
Our approach builds on the results of [9] where the authors constructed a family of exact regenerating codes for d = n-1.
The "exact" property of these codes allows any repair node to reconstruct and store an identical copy of the data lost upon a failure. For simplicity, we will explain the construction for $\beta = 1$, i.e., $\Gamma = n-1$. For any larger values of $\Gamma$, and in turn of $\beta$, the file can be split into chunks, each of which can be separately encoded using the construction corresponding to $\beta = 1$. Choose $\alpha = \Gamma$. From [2] we know that $M = \sum_{i=1}^{k} (n - i)$ is the capacity of the above DSS in the absence of any adversary ($\ell = 0$). Let $R := \sum_{i=\ell+1}^{k} (n - i)$ be the number of information symbols that we would like to store securely on the DSS, and $\theta := \frac{n(n-1)}{2}$. Let $S = (s_1, \dots, s_R) \in \mathbb{F}_q^{R}$ denote the information file and $K = (K_1, \dots, K_{M-R}) \in \mathbb{F}_q^{M-R}$ denote $M - R$ independent random keys, each uniformly distributed over $\mathbb{F}_q$. Then, the proposed code consists of an outer nested $(\theta, M)$ MDS coset code [7] which takes $S$ and $K$ as input, and outputs $X = (x_1, \dots, x_\theta)$, such that $X = K G_K + S G_S$, where $G = \begin{bmatrix} G_K \\ G_S \end{bmatrix}$ is a generator matrix of a $(\theta, M)$ MDS code, and $G_K$ is itself a generator matrix of a $(\theta, M-R)$ MDS code. The information vector $S$ effectively selects the coset of the MDS code generated by $G_K$. This outer $(\theta, M)$ MDS code is then followed by the special repetition code introduced in [9], which stores the codeword $X$ on the DSS. The procedure of constructing this inner code can be described using an auxiliary complete graph over $n$ vertices $u_1, \dots, u_n$ that consists of $\theta$ edges. Suppose the edges are indexed by the coded symbols $x_1, \dots, x_\theta$. The code then consists of storing on node $v_i$ the symbols indexing the edges adjacent to vertex $u_i$ in the complete graph. Consequently, every coded symbol $x_i$ is stored on exactly two storage nodes, and any two storage nodes have exactly one coded symbol in common (see, e.g., the code in Fig. 4 for $n = 4$).
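The combinatorial properties of the complete-graph repetition code can be verified directly. The sketch below builds the layout for arbitrary $n$ and checks the symbol counts used in the argument, taking $M = \sum_{i=1}^{k}(n-i)$ for a data collector and $\sum_{i=1}^{\ell}(n-i)$ for an eavesdropper. The parameter values $(n, k, \ell) = (6, 4, 2)$ and the function names are illustrative assumptions, and symbols are identified by 0-based edge indices rather than by $x_1, \dots, x_\theta$.

```python
from itertools import combinations


def repetition_layout(n):
    """Map each node to the indices of the coded symbols it stores.

    Symbol j labels edge j of the complete graph on n vertices, and
    node v stores the symbols on the edges incident to vertex v.
    """
    edges = list(combinations(range(1, n + 1), 2))
    layout = {
        v: {j for j, e in enumerate(edges) if v in e} for v in range(1, n + 1)
    }
    return layout, len(edges)


def symbols_seen(layout, nodes):
    """Number of distinct coded symbols stored on a set of nodes."""
    return len(set().union(*(layout[v] for v in nodes)))


n, k, ell = 6, 4, 2
layout, theta = repetition_layout(n)
nodes = range(1, n + 1)

M = sum(n - i for i in range(1, k + 1))      # symbols seen by a collector
mu = sum(n - i for i in range(1, ell + 1))   # symbols seen by Eve

stores_alpha = all(len(layout[v]) == n - 1 for v in nodes)
pairs_share_one = all(
    len(layout[a] & layout[b]) == 1 for a, b in combinations(nodes, 2)
)
collectors_see_M = all(
    symbols_seen(layout, c) == M for c in combinations(nodes, k)
)
eve_sees_mu = all(
    symbols_seen(layout, e) == mu for e in combinations(nodes, ell)
)
print(theta, M, mu, M - mu)
```

For these parameters $\theta = 15$, $M = 14$ and Eve sees 9 symbols, leaving $R = M - 9 = 5 = \sum_{i=\ell+1}^{k}(n-i)$ symbols that can be stored securely.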
This inner code transforms the dynamic storage system into an equivalent static point-to-point channel. First notice that $\alpha = \Gamma$, hence all the data downloaded during the repair process, i.e., $d\beta = \Gamma$, is stored on the new replacement node without any further compression. Thus, accessing a node during the repair process, i.e., observing its downloaded data, is equivalent to accessing it after the repair process, i.e., observing its stored data. Second, the exact regeneration codes restore a failed node with the exact lost data. So, even though there are failures and repairs, the data storage system looks exactly the same at any point in time. Any data collector downloads $M$ symbols out of $x_1, \dots, x_\theta$ by connecting to $k$ nodes. Moreover, any eavesdropper can observe $\mu = \sum_{i=1}^{\ell} (n - i) = M - R$ symbols. Thus, the system becomes similar to the erasure-erasure wiretap channel-II of parameters $(\theta, M, \mu)$.¹ Therefore, since the outer code is a nested MDS code, from [7] we know that it can achieve the secrecy capacity $M - \mu = M - (M - R) = R = \sum_{i=\ell+1}^{k} (n - i)$ of the corresponding erasure-erasure wiretap channel. This rate is achieved for every 1 unit of $\beta$. Thus, the total secrecy rate achieved for $\beta = \Gamma/(n-1)$ is $\sum_{i=\ell+1}^{k} (n - i)\,\frac{\Gamma}{n-1}$.
VII. CONCLUSION
In this paper we considered dynamic distributed data storage systems that are subject to eavesdropping. Our main objective was to determine the secrecy capacity of such systems, i.e., the maximum amount of data that these systems can store and deliver to data collectors, without revealing any information to the eavesdropper. Modeling such systems as multicast networks with compromised nodes, we gave an upper bound on the secrecy capacity, and showed that it can be achieved in the important bandwidth-limited regime where the nodes have sufficient storage capacity.
Finding the general expression of the secrecy capacity of distributed storage systems, and more generally of multicast networks with a fixed number of compromised nodes, remains an open problem that we hope to address in future work.
¹ In the erasure-erasure wiretap channel-II of parameters $(\theta, M, \mu)$, the transmitter sends $\theta$ symbols. A legitimate receiver and an eavesdropper receive $M$ and $\mu$ symbols respectively through independent erasure channels [7].