Load it up!
[Diagram: a Root node and a single Host; keys shown: 0100100111011001, 0001000101100011, 1001110100110111, 1110001010010110, 0011000000000100.]
Thursday, November 11, 2010
Split
[Diagram: Root and Host, now with nine keys shown: 0100100111011001, 0001000101100011, 1001110100110111, 1110001010010110, 0011000000000100, 0110000111101100, 0100000001101011, 0010111000000001, 0011000101111000.]
[The same diagram repeats on three slides captioned '01...', '0...', and '1...'.]
Split
[Diagram: Root and Host after the split; the same nine keys appear regrouped.]
Find: 0100100111011001...
[Diagram: Root and Host trees with branches labeled 000, 001, 010, 011; the lookup is shown in three steps.]
Find: 0100110111011...
[Diagram: the same trees, again shown in three steps for this key.]
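The "Find" slides can be read as a walk down a tree, one chunk of the key at a time. The sketch below is only an illustration under that reading: the three-bit chunk size is inferred from the 000/001/010/011 branch labels, and the `Node` class, the `find` function, and the host names are hypothetical, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

CHUNK = 3  # assumption: each tree level consumes three bits of the key


@dataclass
class Node:
    """A root, intermediate node, or leaf host (hypothetical structure)."""
    name: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    data: Dict[str, str] = field(default_factory=dict)


def find(root: Node, key: str) -> Optional[str]:
    """Follow the key's chunks from the root until a leaf host is reached."""
    node, pos = root, 0
    while node.children:
        child = node.children.get(key[pos:pos + CHUNK])
        if child is None:
            return None  # no branch for this chunk, so the key is not stored
        node, pos = child, pos + CHUNK
    return node.data.get(key)


# Tiny tree: root -> "010" -> "010" -> leaf host holding one key.
leaf = Node("host-010-010", data={"0100100111011001": "value A"})
middle = Node("node-010", children={"010": leaf})
root = Node("root", children={"010": middle})

found = find(root, "0100100111011001")   # chunks 010, 010 lead to the leaf
missing = find(root, "0100110111011")    # second chunk is 011: no such branch
print(found, missing)
```

Each hop only needs the next chunk of the key, so no single node has to know where every key lives.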
Each host stores:
- All the data that “leaf” there.
- The list of parent nodes talking to it.
- The list of children it knows about.
Dynamically Adjusting:
- Data hashes in “clumps”, making some hosts under-full and some hosts over-full.
- Host running out of storage? Split in two. Give half the data to another node.
- Host running out of bandwidth? Clone data and load-balance.
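A minimal sketch of the "split in two" step, assuming a host divides its keys by the next bit after the prefix it is responsible for. The `CAPACITY` threshold and the function names are hypothetical, and splitting on a bit does not guarantee an exact half, which is the "clumps" problem in miniature.

```python
from typing import Dict, Tuple

CAPACITY = 4  # hypothetical limit on keys per host


def is_overfull(data: Dict[str, str]) -> bool:
    """True when the host holds more keys than it is allowed to."""
    return len(data) > CAPACITY


def split_host(data: Dict[str, str], prefix: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Divide a host's keys by the bit that follows its current prefix."""
    zero_side = {k: v for k, v in data.items() if k.startswith(prefix + "0")}
    one_side = {k: v for k, v in data.items() if k.startswith(prefix + "1")}
    return zero_side, one_side


# The five keys from the "Load it up!" slide, all on one host (prefix "").
host = {key: "blob" for key in [
    "0100100111011001",
    "0001000101100011",
    "1001110100110111",
    "1110001010010110",
    "0011000000000100",
]}

zero_side, one_side = split_host(host, "") if is_overfull(host) else (host, {})
print(len(zero_side), len(one_side))
```

Here three keys start with 0 and two start with 1, so the split is uneven; a host that is still over-full after one split would split again on the next bit.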
[Diagram: tree with four Root and four Host nodes over branches labeled 000, 001, 010, 011; the 000 and 001 labels repeat several times.]
Real DHTs in action
- Peer-to-peer file-sharing networks.
- Content Delivery Networks (CDNs like Akamai).
- Cooperative caches.
Distributed Hash Tables (DHTs)
Key/Value Stores
Some common Key/Value Stores (“NoSQL”)
- CouchDB
- MongoDB
- Apache Cassandra
- Terrastore
- Google Bigtable
Name            Email             Address
Tom Limoncelli  tlim@google.com   1515 Main Street
Mary Smith      mary@example.com  111 One Street
Joe Bond        joe@007.com       7 Seventh St

Name            Email             Address
Tom Limoncelli  tlim@google.com   1515 Main Street
Mary Smith      mary@example.com  111 One Street
Joe Bond        joe@007.com       7 Seventh St

User            Transaction  Amount
Tom Limoncelli  Deposit      100
Mary Smith      Deposit      200
Tom Limoncelli  Withdraw     50

Id  Name            Email             Address
1   Tom Limoncelli  tlim@google.com   1515 Main Street
2   Mary Smith      mary@example.com  111 One Street
3   Joe Bond        joe@007.com       7 Seventh St

User Id  Transaction  Amount
1        Deposit      100
2        Deposit      200
1        Withdraw     50

Id  Name            Email             Address
1   Tom Limoncelli  tlim@google.com   1515 Main Street
2   Mary Bond       mary@example.com  111 One Street
3   Joe Bond        joe@007.com       7 Seventh St

User Id  Transaction  Amount
1        Deposit      100
2        Deposit      200
3        Withdraw     50
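The tables above appear to build toward the usual argument for keying transactions by Id: a name changes in one row and every transaction still resolves. Here is a small sketch of that idea using Python's built-in `sqlite3` module. The table and column names (`users`, `transactions`, `kind`) are made up for illustration, and the transaction rows are the ones from the first Id-keyed table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, address TEXT);
    CREATE TABLE transactions (user_id INTEGER REFERENCES users(id), kind TEXT, amount INTEGER);
""")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "Tom Limoncelli", "tlim@google.com", "1515 Main Street"),
    (2, "Mary Smith", "mary@example.com", "111 One Street"),
    (3, "Joe Bond", "joe@007.com", "7 Seventh St"),
])
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", [
    (1, "Deposit", 100),
    (2, "Deposit", 200),
    (1, "Withdraw", 50),
])

# The name lives in exactly one row, so a rename is a single UPDATE.
conn.execute("UPDATE users SET name = 'Mary Bond' WHERE id = 2")

# Transactions pick up the new name through the join, with no rewrite needed.
rows = conn.execute("""
    SELECT u.name, t.kind, t.amount
    FROM transactions AS t JOIN users AS u ON u.id = t.user_id
    ORDER BY t.rowid
""").fetchall()
print(rows)
```

With the earlier name-keyed layout, the same rename would have to touch every matching transaction row as well.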
Relational Databases
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form
- ACID: Atomicity, Consistency, Isolation, Durability
Key/Value Stores
- Keys
- Values
- BASE: Basically Available, Soft-state, Eventually consistent
Eventually? Who cares! This is the web, not payroll!
- Change the address listed in your profile.
- Might not propagate to Europe for 15 minutes.
- Can you fly to Europe in less than 15 minutes?
- And if you could, would you care?
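To make "eventually" concrete, here is a toy model of two replicas where a write lands locally first and reaches the other side only when propagation runs. Everything in it (the `Replica` and `Cluster` classes, the replica names, the new address) is hypothetical; real systems use far more involved replication protocols.

```python
from collections import deque


class Replica:
    """One copy of the data, for example in one region."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def read(self, key):
        return self.store.get(key)


class Cluster:
    """Applies writes locally and queues them for the other replicas."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.pending = deque()

    def write(self, replica, key, value):
        replica.store[key] = value
        for other in self.replicas:
            if other is not replica:
                self.pending.append((other, key, value))

    def propagate(self):
        """Deliver queued updates; in real life this happens some time later."""
        while self.pending:
            other, key, value = self.pending.popleft()
            other.store[key] = value


us, eu = Replica("us"), Replica("eu")
cluster = Cluster([us, eu])
key = "tlim@google.com/address"

cluster.write(us, key, "1515 Main Street")
cluster.propagate()

cluster.write(us, key, "42 New Street")  # hypothetical new address
stale = eu.read(key)                     # Europe still sees the old address
cluster.propagate()
fresh = eu.read(key)                     # now both replicas agree
print(stale, "->", fresh)
```

The window between the write and `propagate()` is the "15 minutes" from the slide: reads in Europe are stale, not wrong forever.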
Key/Value example:
Key               Value
tlim@google.com   BLOB OF DATA
mary@example.com  BLOB OF DATA
joe@007.com       BLOB OF DATA
Key/Value example:
Key               Value
tlim@google.com   { 'name': 'Tom Limoncelli', 'address': '1515 Main Street' }
mary@example.com  { 'name': 'Mary Smith', 'address': '111 One Street' }
joe@007.com       { 'name': 'Joe Bond', 'address': '7 Seventh St' }
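A rough sketch of the same idea in code: the store only ever sees a key and an opaque blob, and it is the application that decides the blob is JSON. The in-memory `store` dict and the `put`/`get` helpers are stand-ins for a real key/value service, not any particular product's API.

```python
import json

store = {}  # stand-in for a real key/value service: key -> bytes


def put(key, record):
    """Serialize the record; the store itself never looks inside the blob."""
    store[key] = json.dumps(record).encode("utf-8")


def get(key):
    """Fetch the blob and decode it back into a dict."""
    return json.loads(store[key].decode("utf-8"))


put("tlim@google.com", {"name": "Tom Limoncelli", "address": "1515 Main Street"})
put("mary@example.com", {"name": "Mary Smith", "address": "111 One Street"})
put("joe@007.com", {"name": "Joe Bond", "address": "7 Seventh St"})

print(type(store["joe@007.com"]).__name__)  # the store holds raw bytes
print(get("joe@007.com")["address"])
```

Because the value is just bytes, records can differ in shape from key to key without any schema change on the store side.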
Google Protobuf: http://code.google.com/p/protobuf/
Key               Value
tlim@google.com   message Person {
                    required string name = 1;
                    optional string address = 2;
                    repeated string phone = 3;
                  }
mary@example.com  { 'name': 'Mary Smith', 'address': '111 One Street', 'phone': ['201-555-3456', '908-444-1111'] }
joe@007.com       { 'name': 'Joe Bond', 'phone': ['862-555-9876'] }
Key/Value Stores
Bigtable
Bigtable
- Google’s very, very large database.
- OSDI'06: http://labs.google.com/papers/bigtable.html
- Petabytes of data across thousands of commodity servers.
- Web indexing, Google Earth, and Google Finance.
Bigtable Keys
- Can be very huge.
- Don’t have to have a value! (i.e., the value is “null”)
- Query by:
  - Key
  - Key start/stop range (lexicographic order)
Long keys are cool.
Key               Value
Main St/123/Apt1  Jones
Main St/123/Apt2  Smith
Main St/200       Olson
Query range: Start: “Main St/123”, End: infinity
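Range queries work because keys are kept in sorted order, so a start/stop pair maps to a contiguous slice. The sketch below imitates that with a sorted list and the standard `bisect` module. The `scan` helper is hypothetical, and the second query's bounds are an added illustration that relies on "/" sorting just before "0" in ASCII.

```python
import bisect

# Rows kept sorted by key, as a range-scannable store would keep them.
rows = sorted([
    ("Main St/123/Apt1", "Jones"),
    ("Main St/123/Apt2", "Smith"),
    ("Main St/200", "Olson"),
])
keys = [key for key, _ in rows]


def scan(start, end=None):
    """Return rows with start <= key < end; end=None means 'infinity'."""
    lo = bisect.bisect_left(keys, start)
    hi = len(keys) if end is None else bisect.bisect_left(keys, end)
    return rows[lo:hi]


# The slide's query: start at "Main St/123" and run to the end of the table.
to_infinity = scan("Main St/123")

# Only the apartments at number 123: bound the range just past the "/" prefix.
only_123 = scan("Main St/123/", "Main St/1230")

print([value for _, value in to_infinity])
print([value for _, value in only_123])
```

Note that an open-ended scan from "Main St/123" also returns "Main St/200", since it sorts after the start key; a stop key is what narrows the result to one building.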
Bigtable Values
- Values can be huge. Gigabytes.
- Multiple values per key, grouped in “families”: “key:family:family:family:...”
Families
- Within a family: sub-keys that link to data.
- Sub-keys are dynamic: no need to pre-define.
- Sub-keys can be repeated.
Example: Crawl the web
For every URL:
- Store the HTML at that location.
- Store a list of which URLs link to that URL.
- Store the “anchor text” those sites used: <a href="URL">ANCHOR TEXT</a>
http://www.cnn.com
  <html>.........</html>
http://tomontime.com
  <html> <p>As you may have read on <a href="http://www.cnn.com">my favorite news site</a> there is...
Key            contents: (a family)  anchor: (another family)
com.cnn.www    <html>...             anchor:tomontime.com = my favorite news site; anchor:cnnsi.com = CNN
com.tomontime  <html>...             anchor:everythingsysadmin.com = videos
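The table above can be modeled, very loosely, as a dict of rows where each row maps "family:sub-key" column names to values. This is only a sketch of the data layout, not Bigtable's API: `row_key`, `put`, and `family` are hypothetical helpers, and the reversed-hostname key simply reproduces the `com.cnn.www` form shown in the table.

```python
from urllib.parse import urlsplit

table = {}  # row key -> {"family:sub-key": value}


def row_key(url):
    """Reverse the hostname so pages from one domain sort next to each other."""
    host = urlsplit(url).hostname
    return ".".join(reversed(host.split(".")))


def put(row, column, value):
    """Sub-keys are created on first use; nothing is pre-defined."""
    table.setdefault(row, {})[column] = value


def family(row, name):
    """Return every column in one family of a row."""
    prefix = name + ":"
    return {col: val for col, val in table[row].items() if col.startswith(prefix)}


cnn = row_key("http://www.cnn.com")
tom = row_key("http://tomontime.com")

put(cnn, "contents:", "<html>...")
put(cnn, "anchor:tomontime.com", "my favorite news site")
put(cnn, "anchor:cnnsi.com", "CNN")
put(tom, "contents:", "<html>...")
put(tom, "anchor:everythingsysadmin.com", "videos")

print(cnn, tom)
print(family(cnn, "anchor"))
```

Each linking site becomes its own sub-key under `anchor:`, so a page with a million inbound links just has a million columns in that family.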