How to Store a Secret Salim El Rouayheb Illinois Institute of Technology
A Brief History of Codes for Storage According to Emina 1982 Reed Solomon paper (1960)
What if some nodes cannot be trusted?
Adversary (passive for now) controls one node.
File A, key K, (n,k)=(4,2):
Disk 1: K
Disk 2: A+K
Disk 3: A+2K
Disk 4: A+3K
[Figure: users and an eavesdropper reading the disks.]
Related: Secret Sharing [Shamir ’79]; Wiretap channel II, Coset Codes [Ozarow & Wyner ’84]
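A minimal sketch of the (4,2) example on this slide, assuming arithmetic over the prime field GF(5) (the slide does not name a field; `P`, `COEFFS`, `encode` and `recover` are illustrative names). It shows that any two disks determine A, while each single disk takes every field value equally often as K varies, so one disk alone says nothing about A.

```python
from itertools import combinations

P = 5  # assumed prime field size; not specified on the slide

# Disk i stores ca*A + ck*K (mod P): K, A+K, A+2K, A+3K
COEFFS = [(0, 1), (1, 1), (1, 2), (1, 3)]

def encode(a, k):
    """Return the contents of Disks 1..4 for file symbol a and key k."""
    return [(ca * a + ck * k) % P for ca, ck in COEFFS]

def recover(i, si, j, sj):
    """Recover A from the contents si, sj of two disks (0-based indices i, j)."""
    (a1, k1), (a2, k2) = COEFFS[i], COEFFS[j]
    det = (a1 * k2 - a2 * k1) % P
    det_inv = pow(det, P - 2, P)  # inverse via Fermat's little theorem
    return ((si * k2 - sj * k1) * det_inv) % P  # Cramer's rule for A

if __name__ == "__main__":
    disks = encode(a=3, k=4)
    for i, j in combinations(range(4), 2):
        assert recover(i, disks[i], j, disks[j]) == 3
    print("any two disks recover A")
```

With a uniformly random K, each disk's content is uniform over the field whatever A is, which is the sense in which a passive adversary controlling one node learns nothing.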
Wiretap Network
Secure network coding: [Cai & Yeung ’02], [El Rouayheb, Soljanin ’07], [El Rouayheb, Sprintson, Soljanin ’10]
[Figure: Secret → Coset Code → Shares → Multicast Network with Wiretapped Edges.]
Main message there: separation is optimal (Coset Code + Network Code).
Coset Codes/Secret Sharing are Not Enough
• Because storage systems are dynamic
• Can we still protect the stored secret?
• Two surprising results
[Figure: Disk 1 (K) fails; the new disk downloads A+K from Disk 2 and A+2K from Disk 3 to rebuild K; Disk 4 stores A+3K. All the data is leaked!]
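The leak can be illustrated with a short sketch, assuming the replacement for Disk 1 downloads A+K and A+2K from Disks 2 and 3 and that symbols live in GF(5) (both are assumptions for illustration; the function names are hypothetical). An eavesdropper who sees the repair traffic of that single new disk recovers both K and A.

```python
P = 5  # assumed prime field size; not specified on the slide

def repair_downloads(a, k):
    """Data the new disk downloads to rebuild K (assumed: from Disks 2 and 3)."""
    return (a + k) % P, (a + 2 * k) % P

def eavesdrop_on_repair(d2, d3):
    """What an eavesdropper on the new disk can compute from the downloads."""
    k = (d3 - d2) % P   # (A+2K) - (A+K) = K
    a = (d2 - k) % P    # (A+K) - K = A
    return a, k

if __name__ == "__main__":
    print(eavesdrop_on_repair(*repair_downloads(a=2, k=3)))
```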
General Problem Formulation
• (n,k) system
• d: repair degree
• α: storage per node
• β: repair bandwidth
• b: number of compromised nodes
• Adversary: passive/active
[Figure: Disk 1 fails; a new disk downloads β from each of d surviving disks (Disk 2, Disk 3, ..., Disk n); a user contacts k disks.]
Pawar, El Rouayheb, Ramchandran ’10
What is the largest secret I can store in this system without losing it or revealing it?
A Divide and Share Scheme
(n,k,d)=(4,2,3), Rashmi, Shah, Kumar & Ramchandran ’09
Node 1: packets 1, 2, 3
Node 2: packets 1, 4, 5
Node 3: packets 2, 4, 6
Node 4: packets 3, 5, 6
• User always sees all the 5 packets
• Eavesdropper always observes 3 packets
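The packet layout can be checked mechanically. The sketch below (names such as `NODES`, `user_view` and `repair_plan` are illustrative) confirms that any two nodes together hold 5 distinct packets, a single node holds 3, and a failed node can be rebuilt by fetching one packet from each of the 3 surviving nodes.

```python
from itertools import combinations

# Packet indices stored on each node, as listed on the slide
NODES = {1: {1, 2, 3}, 2: {1, 4, 5}, 3: {2, 4, 6}, 4: {3, 5, 6}}

def user_view(i, j):
    """Distinct packets seen by a user who contacts nodes i and j."""
    return NODES[i] | NODES[j]

def repair_plan(failed):
    """Map each lost packet to the surviving node holding its other copy."""
    return {
        p: next(n for n in NODES if n != failed and p in NODES[n])
        for p in NODES[failed]
    }

if __name__ == "__main__":
    for i, j in combinations(NODES, 2):
        print(i, j, sorted(user_view(i, j)))
    print(repair_plan(1))
```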
Secure Code
Random keys: K1, K2, K3
Secret: X1 X2 X3
Packets (Coset Code):
1: K1
2: K2
3: K3
4: X1 + 2K1 + K2 + K3
5: X2 + K1 + 2K2 + K3
6: X1 + 2X2 + K1 + K2 + 2K3
Nodes: (1, 2, 3), (1, 4, 5), (2, 4, 6), (3, 5, 6)
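A brute-force check of this construction, as a sketch under stated assumptions: arithmetic is taken over GF(5) (the slide does not fix a field), and since only X1 and X2 appear in the coded packets, those two symbols are treated as the secret. The code verifies that the view of any single node has the same distribution for every secret, and that any two nodes determine (X1, X2). All function names are illustrative.

```python
from collections import Counter
from itertools import combinations, product

P = 5  # assumed prime field size; not specified on the slide
NODES = {1: (1, 2, 3), 2: (1, 4, 5), 3: (2, 4, 6), 4: (3, 5, 6)}

def packets(x1, x2, k1, k2, k3):
    """The six coded packets from the slide, reduced mod P."""
    return {
        1: k1, 2: k2, 3: k3,
        4: (x1 + 2 * k1 + k2 + k3) % P,
        5: (x2 + k1 + 2 * k2 + k3) % P,
        6: (x1 + 2 * x2 + k1 + k2 + 2 * k3) % P,
    }

def eve_view_distribution(node, x1, x2):
    """Histogram of one node's contents over all uniformly random keys."""
    return Counter(
        tuple(packets(x1, x2, *keys)[p] for p in NODES[node])
        for keys in product(range(P), repeat=3)
    )

def is_secure(node):
    """True if the node's view is distributed identically for every secret."""
    ref = eve_view_distribution(node, 0, 0)
    return all(eve_view_distribution(node, x1, x2) == ref
               for x1, x2 in product(range(P), repeat=2))

def is_decodable(i, j):
    """True if the packets on nodes i and j determine (x1, x2) uniquely."""
    pkts = sorted(set(NODES[i]) | set(NODES[j]))
    seen = {}
    for x1, x2, k1, k2, k3 in product(range(P), repeat=5):
        pk = packets(x1, x2, k1, k2, k3)
        view = tuple(pk[p] for p in pkts)
        if seen.setdefault(view, (x1, x2)) != (x1, x2):
            return False
    return True

if __name__ == "__main__":
    print(all(is_secure(n) for n in NODES))
    print(all(is_decodable(i, j) for i, j in combinations(NODES, 2)))
```

The field size matters: over GF(3), for example, some of the underlying linear systems become singular, so the choice of GF(5) here is part of the assumption.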
Secure Code in Bandwidth-Limited Regime and d<n-1 (n,k,d)=(7,3,4) Iwan’s Observation
Upper Bound on Secrecy Capacity
$C(\alpha, \beta) \le \sum_{i=l+1}^{k} \min\{(d-i+1)\beta, \alpha\}$
Pawar, El Rouayheb, Ramchandran ’10
Previous codes achieve this upper bound for the bandwidth-limited regime α ≥ dβ.
[Figure: cut in the information flow graph over new nodes n+1, n+2, ..., n+l, n+l+1, ..., n+k, with incoming repair capacities dβ, (d−1)β, ..., (d−k+1)β.]
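The bound is easy to evaluate numerically. The sketch below assumes that l denotes the number of eavesdropped nodes (the slide does not define it) and uses exact fractions; `secrecy_upper_bound` is an illustrative name. For (k,d)=(2,3), l=1, α=3, β=1 it gives 2, matching the two secret symbols that remain when a user sees 5 packets and an eavesdropper sees 3.

```python
from fractions import Fraction

def secrecy_upper_bound(k, d, l, alpha, beta):
    """Sum over i = l+1..k of min{(d - i + 1) * beta, alpha}."""
    return sum(min((d - i + 1) * beta, alpha) for i in range(l + 1, k + 1))

if __name__ == "__main__":
    # (n,k,d) = (4,2,3) with one eavesdropped node (l = 1, an assumption)
    print(secrecy_upper_bound(k=2, d=3, l=1, alpha=3, beta=1))
    print(secrecy_upper_bound(k=2, d=3, l=1, alpha=1, beta=Fraction(1, 3)))
```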
General Secure Codes
[Figure: File + Keys → Coset Codes → Regenerating Code → Storage System.]
Separation is Optimal for the Bandwidth-Limited Regime
Surprising result #1: Separation is NOT Optimal
(n,k,d)=(4,2,3), α=1
• β=1/2: Secret Size=1/2 MB
• β=1/3: Secret Size=2/3 MB
[Figure: nodes n1 to n4 storing 0.5 MB symbols a1, a2, b1, b2 and combinations such as a1+b1, 2a2+b2, 2a1+b1, a2+b2; a replacement node and a new node are shown.]
It may be better not to use all your budgeted bandwidth or storage!
Falling back to bandwidth-limited regime codes is always optimal for (n,n-1,n-1) systems. Tandon et al. ’10
Finding the Optimal Inner Code is not trivial
[Figure: achievable secure regenerating codes tradeoff versus the non-secure tradeoff for (n,k,d)=(7,6,6); x-axis: normalised storage per node α/M; y-axis: normalised bandwidth β/M; marked points: MDS, Divide & Share.]
Goparaju, El Rouayheb, Calderbank, ISIT ’10
What is the best we can do with a Separation Scheme
The storage code is a Black Box (cannot touch):
• Simpler design if we want different files with different security requirements
• Cloud user: does not have control over the code
Surprising result #2, Theorem: [Goparaju, R., Calderbank, Poor NetCod ’13]
$C_s^* = (k-b)\left(1 - \frac{1}{n-k}\right)^{b} \alpha$
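As a quick numerical illustration, the expression in the theorem can be evaluated exactly. This sketch only computes the formula as written on the slide; the function name and the sample value α=8 are assumptions, and the conditions under which the theorem applies are not restated here.

```python
from fractions import Fraction

def separation_secrecy_capacity(n, k, b, alpha):
    """Evaluate (k - b) * (1 - 1/(n - k))**b * alpha with exact fractions."""
    return (k - b) * (1 - Fraction(1, n - k)) ** b * alpha

if __name__ == "__main__":
    # (n,k) = (5,3), b = 2 compromised nodes, alpha = 8 (illustrative value)
    print(separation_secrecy_capacity(n=5, k=3, b=2, alpha=8))
```

For n−k=2 the factor (1 − 1/(n−k))^b reduces to 1/2^b, so the secret shrinks by half with each additional compromised node.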
Proof based on Geometry of Repair Spaces
(n,k)=(5,3), b=2 compromised nodes
Data observed by Eve = bα (data stored on nodes 1’ and 2’) + dim(S_1 + S_2) (data downloaded from a helper node)
[Figure: nodes 1’, 2’, 3, 4, 5 and a user; nested repair spaces with S_1 contributing α/2, S_1+S_2 a further α/4, and S_1+S_2+S_3 a further α/8.]
Secure (linear) capacity = kα − amount observed by Eve
$C_s^* \le \frac{(k-b)\alpha}{2^b}$
Theorem: [Goparaju, R., Calderbank, Poor NetCod ’13]
$\dim(S_{i_1} + S_{i_2} + \cdots + S_{i_b}) \ge \frac{\alpha}{2} + \frac{\alpha}{2^2} + \cdots + \frac{\alpha}{2^b}$
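For the (n,k)=(5,3), b=2 example, the counting argument can be written out step by step. This is a worked instance under the slide's own quantities, with the dimension bound taken from the theorem:

```latex
\begin{align*}
\dim(S_1 + S_2) &\ge \frac{\alpha}{2} + \frac{\alpha}{4} = \frac{3\alpha}{4}, \\
\text{data observed by Eve} &= b\alpha + \dim(S_1 + S_2) \ge 2\alpha + \frac{3\alpha}{4}, \\
C_s^* &\le k\alpha - \left(2\alpha + \frac{3\alpha}{4}\right)
      = 3\alpha - \frac{11\alpha}{4}
      = \frac{\alpha}{4}
      = \frac{(k-b)\alpha}{2^b}.
\end{align*}
```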
A Taste of the Proof…
File: $(f_1, \ldots, f_k)$, with $f_i = (f_{i1}, \ldots, f_{i\alpha})$
Parities: $p_1 = \sum_{i=1}^{k} A_i f_i$, $p_2 = \sum_{i=1}^{k} B_i f_i$
• Node 1’ downloads:
  $S_2 f_2,\ S_3 f_3,\ \ldots,\ S_k f_k$
  $S_{k+1} p_1 = S_{k+1} A_1 f_1 + S_{k+1} A_2 f_2 + \cdots + S_{k+1} A_k f_k$
  $S_{k+2} p_2 = S_{k+2} B_1 f_1 + S_{k+2} B_2 f_2 + \cdots + S_{k+2} B_k f_k$
• Subspace conditions:
  $S_{k+1} A_1 + S_{k+2} B_1 = \mathbb{F}_q^n$
  $S_2 = S_{k+1} A_2 = S_{k+2} B_2$
  ...
  $S_k = S_{k+1} A_k = S_{k+2} B_k$
• Analogy to interference alignment
• Write these subspace conditions for all failures
• Use them to prove the theorem by induction
Open Problems
1. Storage-limited regime?
2. Storage/repair bandwidth tradeoff to store a secret of a given size
3. Active adversary (omniscient, limited knowledge, …)
4. Linear vs. non-linear?
5. Can shared randomness help?
[Figure: secure regenerating codes tradeoff; x-axis: normalised storage per node α/M; y-axis: normalised bandwidth β/M; one region is annotated "we know what to do here".]
QUESTIONS?