
In this paper, we propose a network coding based cloud storage scheme. The storage system takes the form of an *m* × *n* data array. The *n* columns stand for *n* storage nodes, comprising systematic nodes that store source symbols and nonsystematic nodes that store parity symbols. Every row of the data array is an (*n*, *k*) systematic Maximum Distance Separable (MDS) code. A source symbol is involved only in the encoding of the single row in which it is located and is not used by other rows. Such a design significantly decreases the complexity of encoding and decoding. Moreover, in the case of a single node failure, we use interference alignment to further reduce repair bandwidth. Compared to some existing cloud storage schemes, our scheme significantly reduces resource consumption in storage, update bandwidth and repair bandwidth.

Cloud storage services, such as Dropbox, Microsoft OneDrive, and Amazon S3, greatly help users manage data at any time and from anywhere. The basic requirement of a cloud storage system is reliability, which is generally achieved by adding data redundancy. Although the simplest form of redundancy is to store replicas of data on multiple storage nodes, coding has been proved to be more storage efficient and has long played a prominent role in distributed storage. Storage coding techniques include redundant array of independent disks (RAID), erasure coding, and network coding.

A cloud storage system inevitably consumes a variety of network resources. Among them, storage, update bandwidth and repair bandwidth are three important performance metrics for evaluating a cloud. Storage measures the space on drives occupied by a file. In a coded storage system, changing even a single block of original data outdates all coding blocks, so the system needs to initiate an update procedure to recalculate, retransmit, and replace the outdated data on drives. The number of symbols transported during this procedure is defined as update bandwidth. Similarly, when one or more storage nodes or drives malfunction, go down, or leave a cloud, the system needs to initiate a repair procedure that requests data from surviving nodes to restore the lost data on newly built nodes. The number of symbols shipped for the repair procedure is defined as repair bandwidth. Thus, update bandwidth and repair bandwidth measure the computation and communication cost of a cloud system for file update and data restoration.

Building an efficient cloud storage system with low consumption of storage, update bandwidth and/or repair bandwidth is a perennial goal for cloud storage system designers. Dimakis et al. [

Besides the theoretical studies of [

Papailiopoulos et al. [_{1}, d_{2}, and s_{3} could be restored by accessing d_{1}, s_{1}, c_{2}, s_{2}, c_{3}, and d_{3} from other nodes. An evident advantage of this scheme is that the encoding and decoding of a source symbol are confined within a single row. In comparison, recall the scheme of [

Mostly motivated by [

lightweight because only intra-row encoding is permitted. Moreover, it fixes the above problems of [

Our scheme extends

$$
c_{ij} = \sum_{h=1}^{k} l_{ijh}\,\beta_{ih} \tag{1}
$$

The generator matrix of the ith MDS encoder, i.e., the ith row in

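$$
G_i = \begin{bmatrix}
1 & \cdots & 0 & l_{i11} & \cdots & l_{i(n-k)1} \\
\vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & \cdots & 1 & l_{i1k} & \cdots & l_{i(n-k)k}
\end{bmatrix} \tag{2}
$$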
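As a rough illustration of (1) and (2), the sketch below encodes one row and checks by brute force the MDS condition that any k columns of the generator matrix are linearly independent. The prime modulus `P = 7`, the helper names, and the reuse of the parity coefficients of G_{1} from the (5, 3) example given later are assumptions made for illustration; the field is not fixed at this point in the text.

```python
from itertools import combinations

P = 7  # assumed prime field size (illustrative; not fixed by the text)

def generator_matrix(parity, k):
    """Build the k x n matrix [I_k | parity] of Eq. (2); parity[h][j] plays the role of l_{i(j+1)(h+1)}."""
    return [[1 if c == h else 0 for c in range(k)] + list(parity[h]) for h in range(k)]

def encode_row(beta, G, p=P):
    """Codeword = beta * G (mod p); the first k entries reproduce beta, the rest follow Eq. (1)."""
    n = len(G[0])
    return [sum(beta[h] * G[h][c] for h in range(len(beta))) % p for c in range(n)]

def det_mod(M, p=P):
    """Determinant of a square matrix modulo p by Laplace expansion (adequate for small k)."""
    if len(M) == 1:
        return M[0][0] % p
    total = 0
    for c in range(len(M)):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det_mod(minor, p)
    return total % p

def is_mds(G, p=P):
    """True if every choice of k columns of G is linearly independent modulo p."""
    k, n = len(G), len(G[0])
    return all(
        det_mod([[G[r][c] for c in cols] for r in range(k)], p) != 0
        for cols in combinations(range(n), k)
    )

parity_1 = [[1, 1], [1, 2], [1, 3]]   # parity part of G_1 in the (5, 3) example
G1 = generator_matrix(parity_1, k=3)
codeword = encode_row([2, 5, 1], G1)  # arbitrary source symbols
```

The brute-force check enumerates all k-column subsets, which is only practical for small n and k; it is meant to make the condition concrete rather than to suggest how coefficients should be chosen in practice.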

The MDS property requires that any k columns of G_{i} be linearly independent. The arrangement of code words is illustrated in _{ij} is confined within the ith row. Such a structure benefits users in two ways. On the one hand, it significantly decreases the complexity of encoding and decoding. On the other hand, the file updating of _{ij}, only β_{ij} and the (n − k) nonsystematic symbols {c_{i1}, ⋯, c_{i(n−k)}} on the ith row are to be updated. All other symbols are kept intact. Thus, the update bandwidth for a single source symbol is (n − k + 1) symbols. Moreover, since the coding coefficients are invariant in our scheme, the method of [
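The row-local update described here can be sketched as a delta update. This is a minimal illustration under assumed conventions (arithmetic modulo a prime `P = 7`, a row stored as a Python list of k source symbols followed by n − k parity symbols, and a hypothetical function name `update_symbol`); it is not taken from the paper.

```python
P = 7  # assumed prime field size (illustrative)

def update_symbol(row, parity, h, new_value, p=P):
    """Overwrite source symbol h of one stored row and patch its parity symbols.

    row    : k source symbols followed by n - k parity symbols (modified in place)
    parity : parity[h][j] is the coefficient of source symbol h in parity symbol j
    Returns the number of stored symbols rewritten, i.e. the update bandwidth.
    """
    k = len(parity)
    delta = (new_value - row[h]) % p
    row[h] = new_value % p
    for j, coeff in enumerate(parity[h]):
        # Linearity of Eq. (1): each parity symbol shifts by coeff * delta.
        row[k + j] = (row[k + j] + coeff * delta) % p
    return 1 + len(parity[h])

parity_1 = [[1, 1], [1, 2], [1, 3]]  # parity coefficients of G_1 in the (5, 3) example
row_1 = [2, 5, 1, 1, 1]              # a valid row: source (2, 5, 1) and its parity
touched = update_symbol(row_1, parity_1, h=0, new_value=4)
```

Only the affected row is passed in, which mirrors the fact that no other row of the array takes part in the update.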

Next, consider the repair procedure of

Furthermore, the most frequent failures in practice involve a single node, so it is more meaningful to discuss single-node failures. Within this category, one can implement linear operations on the systematic MDS codes in

$$
\Lambda = \{ (\beta_{12}, \cdots, \beta_{m2}), \cdots, (\beta_{1(k-1)}, \cdots, \beta_{m(k-1)}) \} \tag{3}
$$

From node k, a sum symbol θ is obtained as follows:

$$
\theta = \sum_{i=1}^{m} \beta_{ik} \tag{4}
$$

Moreover, the symbols in a nonsystematic node can be combined and transformed into a linear function of Λ, θ and {β_{11}, ⋯, β_{m1}}. Taking the jth nonsystematic node as an example, we can build the equation

$$
\sum_{i=1}^{m} \alpha_{ij}\, c_{ij} = f_j(\beta_{11}, \cdots, \beta_{m1}, \Lambda, \theta) \tag{5}
$$

where α_{ij} is a coefficient assigned to c_{ij} such that {β_{1k}, ⋯, β_{mk}} are combined into θ. In total, we get a set of (n − k) linear equations for all nonsystematic nodes:

$$
\Xi = \{ f_1(\beta_{11}, \cdots, \beta_{m1}, \Lambda, \theta), \cdots, f_{n-k}(\beta_{11}, \cdots, \beta_{m1}, \Lambda, \theta) \} \tag{6}
$$

By collecting the symbols of Λ, θ and Ξ, the lost symbols {β_{11}, ⋯, β_{m1}} can be restored by solving the equations in (6), given that m ≤ (n − k). Hence, the repair bandwidth equals 1 + k(m − 1) − 2m + n symbols. Note that Ξ may contain linearly dependent equations that do not contribute to the solution, so the coding coefficients {l_{ijh}} should be chosen carefully to guarantee that Ξ contains at least m independent equations.
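The stated count can be checked by adding the three downloads, using only the quantities defined in (3), (4) and (6): Λ contributes m symbols from each of the k − 2 surviving systematic nodes 2, …, k − 1, θ contributes one symbol, and Ξ contributes one symbol per nonsystematic node.

```latex
\begin{aligned}
|\Lambda| + 1 + |\Xi| &= m(k-2) + 1 + (n-k) \\
                      &= 1 + k(m-1) - 2m + n .
\end{aligned}
```

For n = 5, k = 3 and m = 2, the parameters of the example that follows, this evaluates to 1 + 3 − 4 + 5 = 5 symbols.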

Next, consider the failure of a nonsystematic node. For the output of an (n, k)-MDS encoder, i.e., a row in

Finally, an example based on a (5, 3) MDS code is given to illustrate the scheme.

Example:

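$$
G_1 = \begin{pmatrix}
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 2 \\
0 & 0 & 1 & 1 & 3
\end{pmatrix}, \quad
G_2 = \begin{pmatrix}
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 3 \\
0 & 0 & 1 & 3 & 4
\end{pmatrix}
$$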

First, apply the method of interference alignment to repair the failure of a systematic node. Take node 1 as an example. In this case, the lost symbols are β_{11} and β_{21}. So, Λ = {β_{12}, β_{22}}, θ = β_{13} + β_{23}, and

$$
\begin{cases}
f_1 = 3c_{11} + c_{21} = 3\beta_{11} + \beta_{21} + 3\beta_{12} + \beta_{22} + 3\theta \\
f_2 = 4c_{12} + 3c_{22} = 4\beta_{11} + 3\beta_{21} + \beta_{12} + 2\beta_{22} + 5\theta
\end{cases} \tag{7}
$$

With the values of β_{12}, β_{22}, θ, f_{1} and f_{2}, one can restore β_{11} and β_{21} by solving (7).
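A minimal sketch of this repair is given below. It assumes arithmetic modulo 7: the field is not stated explicitly here, but the coefficients in (7) agree with reducing the products of G_{1} and G_{2} modulo 7. The source values are arbitrary and the function names are hypothetical.

```python
P = 7  # assumed prime field size, consistent with the coefficients in (7)

def encode(beta, parity, p=P):
    """Parity symbols of one row: c_j = sum_h parity[h][j] * beta[h] (mod p)."""
    return [sum(b * row[j] for b, row in zip(beta, parity)) % p
            for j in range(len(parity[0]))]

def repair_node_1(lam, theta, f1, f2, p=P):
    """Recover (beta_11, beta_21) from the 5 downloaded symbols by solving (7)."""
    b12, b22 = lam
    # Strip the known terms, leaving 3*b11 + b21 = r1 and 4*b11 + 3*b21 = r2.
    r1 = (f1 - 3 * b12 - b22 - 3 * theta) % p
    r2 = (f2 - b12 - 2 * b22 - 5 * theta) % p
    det_inv = pow(3 * 3 - 1 * 4, -1, p)   # inverse of the 2x2 determinant mod p
    b11 = (det_inv * (3 * r1 - r2)) % p   # Cramer's rule
    b21 = (det_inv * (3 * r2 - 4 * r1)) % p
    return b11, b21

# Parity parts of G_1 and G_2 from the example; the source values are arbitrary.
parity_1 = [[1, 1], [1, 2], [1, 3]]
parity_2 = [[1, 1], [1, 3], [3, 4]]
beta_1, beta_2 = [6, 2, 4], [3, 5, 1]
c11, c12 = encode(beta_1, parity_1)
c21, c22 = encode(beta_2, parity_2)

# Downloads: Lambda from node 2, theta from node 3, f1 from node 4, f2 from node 5.
lam = (beta_1[1], beta_2[1])
theta = (beta_1[2] + beta_2[2]) % P
f1 = (3 * c11 + c21) % P
f2 = (4 * c12 + 3 * c22) % P

recovered = repair_node_1(lam, theta, f1, f2)
```

The modular inverse uses the three-argument `pow` with exponent −1, which requires Python 3.8 or later. Five symbols are downloaded in total: two for Λ, one for θ, and one each for f_{1} and f_{2}.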

Next, take node 4 as an example to repair the failure of a nonsystematic node. When node 4 fails, the lost symbols are c_{11} and c_{21}. Exchange the roles of node 4 and node 3, so that c_{11} and c_{21} move to the systematic part. Thus, using interference alignment, we have Λ = {β_{11}, β_{21}}, θ = β_{12} + β_{22} and

$$
\begin{cases}
f_1 = \beta_{13} + 3\beta_{23} = c_{11} + c_{21} + 6\beta_{11} + 6\beta_{21} + 6\theta \\
f_2 = 2c_{12} + 3c_{22} = 6c_{11} + 4c_{21} + 3\beta_{11} + 6\beta_{21} + 5\theta
\end{cases} \tag{8}
$$

Performance metrics | [ | (5, 3, 2) code |
---|---|---|
Storage | 15 symbols | 10 symbols |
Code rate | 2/5 | 3/5 |
Repair bandwidth for one node failure | 6 symbols | 5 symbols |
Repair bandwidth for two or more node failures | Nonrepairable | 6 symbols |
Update bandwidth for one symbol updating | 10 symbols | 3 symbols |

With the values of β_{11}, β_{21}, θ, f_{1} and f_{2}, one can restore c_{11} and c_{21} by solving (8). In this example, the repair bandwidth for a single node failure is 5 symbols. Recall that 6 symbols are needed if we use the regular way or the scheme of [

Consider the update procedure of _{11} as an example, only β_{11}, c_{11} and c_{12} need to be updated, so the update bandwidth related to one symbol in this example equals 3 symbols. While, the update bandwidth of [

Last, with this example, a comparison is made between [

In this paper, we propose a network coding based cloud storage scheme. The key points of our scheme are the use of systematic MDS codes and the absence of inter-row references in encoding. Moreover, interference alignment is used to reduce repair bandwidth in the case of single node failures. These techniques bring significant advantages to a cloud storage network with respect to system simplicity, resource consumption, and communication load. Detailed analysis and an example show that our scheme keeps the simplicity of [

This work is supported in part by NSFC under Grant No. 61471045 and by the Natural Science Foundation of Liaoning Province under Grant No. 20170540008.

Liu, Y.T. and Morgan, Y. (2018) A Network Coding Based Cloud Storage Scheme. International Journal of Internet and Distributed Systems, 3, 1-8. https://doi.org/10.4236/ijids.2018.31001