11/24/2009

Dynamic Authenticated Index Structures for Outsourced Databases
Feifei Li, Marios Hadjieleftheriou, George Kollios, Leonid Reyzin
Boston University, AT&T Labs - Research
Presenter: Nima Najafian

Outline
☺ The Model
☺ Motivation
☺ Problem
☺ Solution
☺ Background
☺ Paper's contributions
☺ Experimental validation

Outsourced Database Model
Owner: publishes data
Servers: host the data and provide query services
Clients: query the owner's data through servers
[Figure: owner, servers, clients]

Motivation
• Advantages
– The data owner does not need the hardware / software / personnel to run a DBMS
– The owner achieves economies of scale
– The client enjoys better quality of service
• A main challenge
– The service provider is not trusted, and may return incorrect query results

Problem
☺ Untrusted servers
• Lazy: incentives to perform less
• Curious: incentives to acquire information
• Malicious
– Incorrect results (could be bugs)
– Possibly compromised
Problem 1: Injection
Select * from T where 5 < A < 11
Server returns: 7, 8, 9

Problem 2: Drop
Select * from T where 5 < A < 11
Server returns: 7

Table T (attributes A, B) in both examples:
record    A
r_1       …
…         …
r_i-1     4
r_i       7
r_i+1     9
r_i+2     11

Solution
☺ The Model
☺ Motivation
☺ Problem
☺ Ability to authenticate without trusting the server ★ (Query Authentication)

Query Authentication: (the dimensions)
• Query Correctness
results do exist in the owner's database ~ injection
• Query Completeness
no records have been omitted from the result ~ drop
• Query Freshness
results are based on the most current version of the database (this will bring a third problem into the picture) ~ omission

Background
☺ Cryptographic essentials

General Approach
Authenticated Structures
[Figure: owner, servers, clients; the servers return the query results together with a Verification Object (VO)]
1: Collision-resistant hash functions
• It is computationally hard to find x_1 ≠ x_2 s.t. h(x_1) = h(x_2)
• Computationally hard? Based on well-established assumptions such as discrete logarithms
• SHA1
• Observations:
– Variable input size → 20 bytes
– Computation cost: 2-3 μs (for up to 500 bytes input)
– Storage cost: 20 bytes
– Under Crypto++ [crypto] and OpenSSL [openssl]

2: Public key digital signature schemes
KeyGen → (SK, PK)
Sender: Sign(m, SK) → σ
Insecure channel: carries m and σ
Recipient: Ver(m, PK, σ) → valid?

2: Public Key Digital Signature Schemes
• Formally defined by [GMR88]
– The message has not been changed in any way
– The message is indeed from the sender (corresponding to the public key)
– No one except the secret key owner could produce a signature
• One such scheme: RSA [RSA78]
• Observations
– Computation cost: about 3-4 ms for signing and more than 100 μs for verifying
– Storage cost: 128 bytes

3: Signature Aggregation (Condensed RSA)
– Checking one aggregated signature is almost as fast as an individual signature

4: Merkle Hash Tree [M89] - Amortizing Signature Cost
• Single signature to sign many messages
• Hash function is publicly known
• Collision-resistant hash function → any change in the tree will lead to a different hash value for the root
• Digital signature of the root → no one except the owner could produce the signature
Owner: σ = Sign(h_1..8, SK)
Client: Ver(h_1..8, σ, PK) = valid?
Tree (8 messages):
root:     h_1..8
level 2:  h_1..4, h_5..8
level 1:  h_12, h_34, h_56, h_78, where h_12 = H(h_1 | h_2)
leaves:   h_1, h_2, …, h_8 (hashes of m_1, m_2, …, m_8)

Contributions
☺ Proposed authenticated structures
• Getting to know B+ trees
• The idea of changing
• ASB Tree (based on existing work)
• MB tree (based on existing work)
• EMB tree
• Freshness (third dimension of query authentication)

Correctness and Completeness
• Correctness, Completeness:
– Any change in the tree will lead to a different hash
– Relative position of values is authenticated
• Authentication:
– Signing the root with SK
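The Merkle hash tree idea above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes SHA-1 from the standard `hashlib` module, a message count that is a power of two (as in the 8-leaf figure), and hypothetical names `H` and `merkle_root`. Signing the root is left out; the owner would apply Sign(root, SK) from any signature scheme.

```python
import hashlib


def H(data: bytes) -> bytes:
    """SHA-1 digest (20 bytes), matching the hash size quoted on the slides."""
    return hashlib.sha1(data).digest()


def merkle_root(messages):
    """Root hash of a binary Merkle tree over the given messages.

    Assumption for this sketch: len(messages) is a power of two.
    """
    if not messages or len(messages) & (len(messages) - 1):
        raise ValueError("this sketch needs a power-of-two number of messages")
    level = [H(m) for m in messages]          # h_1 ... h_n
    while len(level) > 1:
        # Parent hash = H(left | right), e.g. h_12 = H(h_1 | h_2).
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


messages = [f"m{i}".encode() for i in range(1, 9)]   # m_1 ... m_8
root = merkle_root(messages)                          # h_1..8

# Changing any single message changes the root hash.
tampered = list(messages)
tampered[4] = b"m5-modified"
root_changed = merkle_root(tampered) != root
```

Because only the root is signed, one signature covers all eight messages, which is the amortization the slide title refers to.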
B+-Tree Structure
• A typical node contains up to n-1 search key values K1, K2, …, Kn-1, and n pointers P1, P2, …, Pn. The search key values are kept in sorted order.
• The pointer Pi can point to either a file record or a bucket of pointers, each of which points to a file record.
Node layout: P1 | K1 | P2 | … | Pn-1 | Kn-1 | Pn

B+-Tree File Organization
In a B+-Tree file organization, the leaf nodes of the tree store the actual records rather than pointers to records.

Range Authentication - A Simple Approach
• The owner produces one signature sig_i per record r_i, computed over h(r_i)
• A B+ tree is built over the records r_1, …, r_16
• For a query Q matching r_3, r_4, r_5, r_6, these records are sent to the client along with sig_3, sig_4, sig_5, sig_6
• This gives correctness but NOT completeness!

Signature-Based Approach: ASB Tree, based on [PJR05]
Signatures: S(r_1|r_2), S(r_2|r_3), …, S(r_n-2|r_n-1), S(r_n-1|r_n)
1. Order database tuples w.r.t. the query attribute
2. Sign consecutive pairs
3. Build a B+ tree on top of it
4. Return tuples [a-1, b+1] together with signatures in [a-1, b] (the query is [a, b]; a and b here are indexes)
5. Verify any two consecutive pairs

Comparing Cryptographic Operations
• One hashing takes 2-3 μs
– Modular multiplication: 100 times slower
– Verifying: 1000 times slower
– Signing: 10000 times slower
t_Hashing < t_mod_M < t_ver < t_Sign

Condensed RSA (NDSS'04)
• Server:
– Selects records matching the posed query
– Multiplies the corresponding RSA signatures
– Returns a single signature to the querier
• Server: given t record signatures {σ_1, σ_2, …, σ_t}, compute the combined signature σ_{1,t} = Π σ_i mod n and send σ_{1,t} to the querier
• Querier: given t messages {m_1, m_2, …, m_t} and σ_{1,t}, verify the combined signature: (σ_{1,t})^e =? Π h(m_i) (mod n)
n is the RSA modulus of the public key from the owner
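The Condensed RSA equations above can be tried out with textbook RSA. The sketch below is illustrative only and makes several assumptions that are not on the slides: a toy key built from the two Mersenne primes 2^31-1 and 2^61-1 (far too small for real use), the exponent e = 65537, no padding, SHA-1 reduced mod n as h, and hypothetical function names. It needs Python 3.8+ for `pow(e, -1, phi)` and `math.prod`. The constant `N` plays the role of the modulus n.

```python
import hashlib
from math import prod

# Toy textbook-RSA key (assumption for illustration; insecure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                                   # RSA modulus n
E = 65537                                   # public exponent e
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent d


def h(message: bytes) -> int:
    """Hash a message into Z_n (SHA-1 digest reduced mod n)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Owner: sigma_i = h(m_i)^d mod n."""
    return pow(h(message), D, N)


def combine(signatures) -> int:
    """Server: sigma_{1,t} = product of sigma_i mod n."""
    return prod(signatures) % N


def verify_combined(messages, sigma: int) -> bool:
    """Querier: check sigma^e == product of h(m_i) (mod n)."""
    return pow(sigma, E, N) == prod(h(m) for m in messages) % N


messages = [b"r1", b"r2", b"r3"]
sigma = combine(sign(m) for m in messages)
ok = verify_combined(messages, sigma)
```

The server-side work is t-1 modular multiplications, and the querier performs one exponentiation with e instead of t separate verifications, which is why the slides treat the modular multiplication cost as the main overhead.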
Signature Chaining Issues
• A heavy burden on the owner to produce the signatures
• Overhead on the client to verify the aggregated signature
• Storage overhead at the server to store the signatures (which potentially leads to higher computational cost to retrieve them)
• High communication overhead on both the server and the owner, in order to exchange the signatures

Reduce S/C Communication Cost
• Aggregation signature: Condensed RSA
Before: m_1, …, m_k with σ_1, …, σ_k
After: m_1, …, m_k with a single σ, where σ = combine(σ_1, …, σ_k)
Overhead: computation cost of modular multiplication with a big modulus, close to 100 μs

Merkle B (MB) Tree: Natural Extension for Range Query
• Use a B+-tree instead of a binary search tree:
index node keys: 410, 720, …
leaf keys: 250, 320, 410, 600, 720 (tuples t_1, …, t_5)
• Extend it with hash information:
node layout: p_0 h_0 | p_1 k_1 h_1 | … | p_f k_f h_f
child node of entry 1: p_10 h_10 | p_11 k_11 h_11 | …
leaf node: … K_i, h_i = H(t_i) … K_j, h_j = H(t_j) …
index node: h_1 = H(h_10 | … | h_1f)
root node: σ = Sign(h_0 | … | h_f)

Extends to Range Query: f=2 (f is the fanout)
Select * from T where 5 < A < 11
Owner: σ = Sign(h_1..8, SK)
Tree:
root:     h_1..8
level 2:  h_1..4, h_5..8
level 1:  h_12, h_34, h_56, h_78
leaves:   h_1, …, h_8 over the values 1, 2, 3, 4, 5, 6, 9, 12
Query results: 6, 9
VO: 5, 12, h_1..4, σ
LB(q) = 5 and RB(q) = 12 are the boundary values on either side of the query range q; together with the results they form the query subtree.

Client Side Verification
Select * from T where 5 < A < 11
Client receives: query results 6, 9 and VO: 5, 12, h_1..4, σ
• Reconstruct the query subtree: h_5, h_6, h_7, h_8 from the values 5, 6, 9, 12, then h_56, h_78, then h_5..8
• The subtree under h_1..4 is unknown to the client; only its hash is used
• Combine h_1..4 and h_5..8 into h_1..8 and check Ver(h_1..8, PK, σ): valid?
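The client-side check for the f=2 example can be sketched as follows. This is a simplified illustration under stated assumptions: SHA-1 over the decimal string of each value stands in for H(t_i), a trusted copy of the root hash stands in for verifying σ with PK, and the VO layout (one left-sibling hash plus the two boundary values) is hard-wired to this particular 8-leaf example. All function and field names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def leaf(value: int) -> bytes:
    """Leaf hash of a value (stands in for h_i = H(t_i))."""
    return H(str(value).encode())


def root_of(hashes):
    """Fold a list of hashes pairwise up to a single hash."""
    hashes = list(hashes)
    while len(hashes) > 1:
        nxt = [H(hashes[i] + hashes[i + 1]) for i in range(0, len(hashes) - 1, 2)]
        if len(hashes) % 2:            # odd count: carry the last hash upward
            nxt.append(hashes[-1])
        hashes = nxt
    return hashes[0]


# Owner: builds the tree; the root hash stands in for the signed h_1..8.
values = [1, 2, 3, 4, 5, 6, 9, 12]
trusted_root = root_of(leaf(v) for v in values)

# Server: answers 5 < A < 11 with results plus VO (boundaries and h_1..4).
result = [6, 9]
vo = {"lb": 5, "rb": 12, "left_hash": root_of(leaf(v) for v in values[:4])}


def client_verify(low, high, result, vo, trusted_root) -> bool:
    run = [vo["lb"], *result, vo["rb"]]
    in_order = all(a < b for a, b in zip(run, run[1:]))
    # Boundaries must lie outside the open range (low, high), results inside.
    boundaries_ok = vo["lb"] <= low and vo["rb"] >= high
    inside = all(low < v < high for v in result)
    # Rebuild h_5..8 from the returned values, then h_1..8 = H(h_1..4 | h_5..8).
    rebuilt = H(vo["left_hash"] + root_of(leaf(v) for v in run))
    return in_order and boundaries_ok and inside and rebuilt == trusted_root


accepted = client_verify(5, 11, result, vo, trusted_root)
```

With the same VO, a dropped answer such as `[6]` or an injected one such as `[6, 8, 9]` rebuilds a different root hash, so `client_verify` rejects it.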
Embedded Merkle B (EMB) tree: A fractal structure
Node layout: p_0 h_0 | p_1 k_1 h_1 | … | p_f k_f h_f
Child node: p_10 h_10 | p_11 k_11 h_11 | … | p_1f k_1f h_1f
An MB tree with fanout f_e is built on this node

Query Example: f=5
Index node keys: 10, 20, 29, 42
Leaves: 1 3 5 6 9 | 10 12 14 16 | 20 22 23 25 | …
LB(q) = 5, q = 6, 9, RB(q) = 10
VO: tuples 5, 10; hashes of 1, 3, 12, 14, 16; hashes of entries 20, 29, 42
8 hashes

Query Example: f=5
Index node keys: 10, 20, 29, 42
Leaves: 1 3 5 6 9 | 10 12 14 16 | 20 22 23 25 | …
LB(q) = 5, q = 6, 9, RB(q) = 10
VO: tuples 5, 10; hash of red circle node; hash of red circle nodes (2); hash of red circle nodes (2)
5 hashes

EMB tree Analysis
• We can show that:
– Query cost is as an MB tree with fanout f_k
– Authentication cost (c/s communication cost and client verification cost) is as an MB tree with fanout f_e
• Intuition:
– f_k is smaller than that of a normal MB tree given a page size P

EMB tree's variants
• Don't store the embedded tree, build it on the fly
– emm, it's correct! ☺
– Fanout f_k is as a normal MB tree: better query performance, better storage performance
• Use a multi-way search tree instead of a B+ tree as the embedded tree
– EMB* tree
• The hash path in the embedded tree can stop at the index level; it does not need to go to the leaf level, which reduces the VO size

Freshness?
Owner → Server (EMB-tree): update, new signature(s) σ_v
Client → Server: query
Server → Client: q + VO
Return VO constructed based on previous version: signature(s) σ_v-1