
A P2P Dropbox @mafintosh 8 person team Based in 5 countries


  1. A P2P Dropbox

  2. @mafintosh

  3. 8 person team

  4. Based in 5 countries

  5. >1500 npm modules

  6. >1500 npm modules (~0.5% of npm)

  7. We make tools that help scientists share data

  8. We make tools that help scientists share data (and other people as well)

  9. Data === Files

  10. Existing great file sharing tools

  11. • Extremely easy to use
      • Centralised / High cost
      • Who owns the data?
      • Sustainable?

  12. • Decentralised / P2P
      • Massively adopted / Simple protocol
      • Only works for static files
      • Scales worse on really big data sets
      • No diffs

  13. We can do better

  14. • Easy to use, but not centralised like Dropbox
      • Decentralised / P2P, but not for piracy like BitTorrent
      • Built for modern use cases

  15. • Easy to use, but not centralised like Dropbox
      • Decentralised / P2P, but not for piracy like BitTorrent
      • Built for modern (scientific) use cases

  16. A next generation file sharing tool

  17. Real time / Live data (get only the data you need and get updates when it changes)

  18. Decentralised (no servers / data centers needed, actually serverless)

  19. Diffable (sharing two similar data sets should only share the diff)

  20. npm install -g dat

  21. Append only logs

  22. Append only logs (a list of data you only ever append to, get it? )

  23. Append only logs lists (a list of data you only ever append to, get it? )

  24. (Append item to list) Data item #0

  25. Data item #0 (Append item to list) Data item #1

  26. Data item #0 Data item #1 (Append item to list) Data item #2

  27. Why “Append Only Logs”?

  28. • A simple data structure
      • Immutable
      • Logical ordering
      • Easy to digest / index
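
The properties listed above can be illustrated with a tiny in-memory log. This is only a sketch: the `AppendOnlyLog` class and its method names are hypothetical and are not the dat or hypercore API.

```javascript
// Hypothetical minimal append-only log; not the dat or hypercore API.
class AppendOnlyLog {
  constructor () {
    this.entries = []
  }

  // Appending is the only write operation; it returns the new entry's index.
  append (data) {
    const index = this.entries.length
    // Entries are frozen so they cannot be modified after being appended.
    this.entries.push(Object.freeze({ index, data }))
    return index
  }

  // Reads never mutate the log.
  get (index) {
    return this.entries[index]
  }

  get length () {
    return this.entries.length
  }
}

const log = new AppendOnlyLog()
log.append('Data item #0')
log.append('Data item #1')
log.append('Data item #2')
console.log(log.length, log.get(1).data)
```

Because entries are never changed or removed, an index always refers to the same data, which is what gives the log its logical ordering.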

  29. How can we share append only logs?

  30. How can we share append only logs? (over a p2p network where we don’t trust other people)

  31. Merkle Trees

  32. Merkle Trees (a tree structure that verifies data)

  33. Merkle Trees (a tree structure that verifies data) (unrelated to Angela Merkel)

  35. Data #0

  36. Root hash #0 Hash #0 Data #0

  37. Root hash #1 Hash #1 Hash #0 Hash #2 Data #0 Data #1

  38. Root hash #2 Hash #1 Hash #0 Hash #2 Hash #4 Data #0 Data #1 Data #2

  39. Root hash #3 Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 Data #1 Data #2 Data #3

  40. Root hash #3 verifies all the data
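
As a rough illustration of why one hash can stand for all the data, the sketch below builds a four-leaf tree with SHA-256. The hash function, the `leaf:` / `parent:` prefixes, and the helper names are assumptions made for this example, and the top hash is simply treated as the root; the slides do not specify how hypercore computes its root hash.

```javascript
const crypto = require('crypto')

// Assumption: SHA-256 with simple string prefixes, chosen only for illustration.
const sha256 = (...parts) => {
  const h = crypto.createHash('sha256')
  for (const part of parts) h.update(part)
  return h.digest('hex')
}

// Leaves and parents are hashed differently so one cannot pose as the other.
const leafHash = data => sha256('leaf:', data)
const parentHash = (left, right) => sha256('parent:', left, right)

// Hashes pairs level by level until one hash is left.
// This sketch assumes the number of items is a power of two.
function merkleRoot (items) {
  let level = items.map(leafHash)
  while (level.length > 1) {
    const next = []
    for (let i = 0; i < level.length; i += 2) {
      next.push(parentHash(level[i], level[i + 1]))
    }
    level = next
  }
  return level[0]
}

const root = merkleRoot(['Data #0', 'Data #1', 'Data #2', 'Data #3'])
const tampered = merkleRoot(['Data #0', 'Data #1', 'Data #2 (edited)', 'Data #3'])
console.log(root !== tampered) // true
```

Changing any single item changes its leaf hash, which changes every hash on the path up to the top.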

  41. 👪 wants to share Data #2 with a receiver

  42. Root hash #3 (the receiver trusts this hash) Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 Data #1 Data #2 (👪 wants to share this) Data #3

  43. Root hash #3 (the receiver trusts this hash) Hash #1 Hash #6 (👪 needs to share these) Data #2

  44. Root hash #3 Hash #1 Hash #4 Hash #6 Data #2

  45. Root hash #3 Hash #1 Hash #5 Hash #4 Hash #6 Data #2

  46. Root hash #3 Hash #3 Hash #1 Hash #5 Hash #4 Hash #6 Data #2

  47. The receiver checks that Hash #3 matches Root hash #3
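
The walk-through on slides 41 to 47 could be sketched as follows, using the same hash numbering as the slides. SHA-256, the prefixes, and the `verify` helper are assumptions for illustration, and the top hash (Hash #3) stands in for the trusted root hash.

```javascript
const crypto = require('crypto')

// Assumption: SHA-256 with simple string prefixes, chosen only for illustration.
const sha256 = (...parts) => {
  const h = crypto.createHash('sha256')
  for (const part of parts) h.update(part)
  return h.digest('hex')
}
const leafHash = data => sha256('leaf:', data)
const parentHash = (left, right) => sha256('parent:', left, right)

// Sender side: the full four-leaf tree, numbered as on the slides.
const data = ['Data #0', 'Data #1', 'Data #2', 'Data #3']
const hash0 = leafHash(data[0])
const hash2 = leafHash(data[1])
const hash4 = leafHash(data[2])
const hash6 = leafHash(data[3])
const hash1 = parentHash(hash0, hash2)
const hash5 = parentHash(hash4, hash6)
const hash3 = parentHash(hash1, hash5)

// The sender ships Data #2 plus the two hashes the receiver cannot compute.
const proof = { data: data[2], hash6, hash1 }

// Receiver side: rebuild the path to the top and compare with the trusted hash.
function verify (proof, trustedHash) {
  const h4 = leafHash(proof.data)
  const h5 = parentHash(h4, proof.hash6)
  const h3 = parentHash(proof.hash1, h5)
  return h3 === trustedHash
}

console.log(verify(proof, hash3)) // true
console.log(verify({ ...proof, data: 'forged' }, hash3)) // false
```

The receiver never sees Data #0, Data #1, or Data #3, yet can still tell whether Data #2 belongs to the tree it trusts.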

  48. 👪 only needs to send O(log(n)) hashes to the receiver

  50. 👪 only needs to send O(log(n)) hashes to the receiver (can easily be optimised to never send the same hash twice)

  51. 👪 only needs to send O(log(n)) hashes to the receiver (can easily be optimised to never send the same hash twice) (come ask me later, I'm fun at parties)
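
To make the O(log(n)) claim concrete, the hypothetical helper below counts one hash per tree level, assuming a balanced binary tree. It is a back-of-the-envelope estimate, not a description of what hypercore actually sends.

```javascript
// Hypothetical helper: number of hashes needed to prove one leaf
// in a balanced binary tree with the given number of leaves.
function proofSize (leaves) {
  let hashes = 0
  // Each halving of the width is one level of the tree, and one hash per level.
  for (let width = leaves; width > 1; width = Math.ceil(width / 2)) {
    hashes++
  }
  return hashes
}

console.log(proofSize(4)) // 2
console.log(proofSize(1000000)) // 20
```

Four leaves need two hashes, matching Hash #1 and Hash #6 in the walk-through; a million leaves need only twenty.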

  52. Real time

  53. Every time we append data, the root hash changes

  54. Crypto to the rescue

  55. Generate a key pair Secret Key + Public Key

  56. The receiver trusts ……. Public Key

  57. 👪 signs the root with the Secret Key: Root hash #2 Hash #1 Hash #0 Hash #2 Hash #4 Data #0 Data #1 Data #2

  58. 👪 signs the new root with the Secret Key: Root hash #3 Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 Data #1 Data #2 Data #3

  59. The receiver uses the Public Key to verify signatures on the Root hash
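
A minimal sketch of the signing idea, using Node's built-in `crypto` module. The choice of Ed25519 and the simplified `rootHash` stand-in (a plain hash of the whole log rather than a Merkle root) are assumptions for this example; the slides do not say which signature scheme is used.

```javascript
const crypto = require('crypto')

// Assumption: an Ed25519 key pair; the slides only say "Secret Key + Public Key".
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519')

// Stand-in for a real Merkle root: a plain hash over the whole log.
const rootHash = items => crypto.createHash('sha256').update(items.join('\n')).digest()

// The owner signs with the secret key; anyone can verify with the public key.
const sign = root => crypto.sign(null, root, privateKey)
const verify = (root, signature) => crypto.verify(null, root, publicKey, signature)

const log = ['Data #0', 'Data #1', 'Data #2']
const root = rootHash(log)
const signature = sign(root)
console.log(verify(root, signature)) // true

// Appending changes the root, so the old signature no longer matches.
log.push('Data #3')
const newRoot = rootHash(log)
console.log(verify(newRoot, signature)) // false

// The owner signs the new root and the receiver can follow along.
const newSignature = sign(newRoot)
console.log(verify(newRoot, newSignature)) // true
```

The receiver only has to trust one fixed public key, even though the root hash changes on every append.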

  60. npm install hypercore

  61. (demo)

  62. How do we turn append only logs into a file sharing tool?

  63. Take a file ~/cool.data

  64. Cut it into pieces ~/cool.data

  65. Insert each piece into the log Data #0 Data #1 Data #2 ~/cool.data Data #3 Data #4
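
The steps on slides 63 to 65 might look roughly like this with fixed-size pieces. The piece size, the in-memory file contents, and the plain array used as the log are all assumptions for illustration.

```javascript
// Cuts a buffer into pieces of at most pieceSize bytes.
function cutIntoPieces (buffer, pieceSize) {
  const pieces = []
  for (let offset = 0; offset < buffer.length; offset += pieceSize) {
    pieces.push(buffer.subarray(offset, offset + pieceSize))
  }
  return pieces
}

// Stand-in for reading ~/cool.data from disk.
const file = Buffer.from('pretend this is the content of ~/cool.data')

// A plain array stands in for the append-only log.
const log = []
for (const piece of cutIntoPieces(file, 10)) {
  log.push(piece)
}

console.log(log.length) // 5
// Reading the log back in order reproduces the file.
console.log(Buffer.concat(log).equals(file)) // true
```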

  66. Diffable

  67. Divide a file into chunks that are unlikely to change when the file is updated

  68. Example: git

  69. function hello () {
        var world = 'world'
        console.log('hello', world)
      }

  70. function hello () {
        var world = 'world'
        console.log('hello', world)
      }
      (One line per chunk)

  71. function hello () {
        var world = 'universe'
        console.log('hello', world)
      }
      (Edit one line)

  72. function hello () {
        var world = 'universe'
        console.log('hello', world)
      }
      (3/4 chunks unchanged)
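
The line-per-chunk example can be checked with a few lines of code. This sketch only mirrors the slide's counting; `chunkByLine` and the use of SHA-256 to compare chunks are assumptions, and it makes no claim about how git works internally.

```javascript
const crypto = require('crypto')

const hashOf = text => crypto.createHash('sha256').update(text).digest('hex')

// One chunk per line, as on the slide.
const chunkByLine = source => source.split('\n')

const before = [
  'function hello () {',
  "  var world = 'world'",
  "  console.log('hello', world)",
  '}'
].join('\n')

// Edit one line.
const after = before.replace("'world'", "'universe'")

// Count the chunks of the edited file that were already known.
const known = new Set(chunkByLine(before).map(hashOf))
const chunks = chunkByLine(after)
const unchanged = chunks.filter(chunk => known.has(hashOf(chunk))).length

console.log(`${unchanged}/${chunks.length} chunks unchanged`) // 3/4 chunks unchanged
```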

  73. Only works for text files

  74. Rabin fingerprinting (Content defined chunking)

  75. Scans through the file and creates chunks based on the actual file content

  76. (A new part is inserted in the middle of the file)

  77. (Only the neighbouring chunks are changed)
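
Content-defined chunking can be sketched with a simple polynomial rolling hash over a sliding window. This is a stand-in for real Rabin fingerprinting and is not the API of the `rabin` npm module; the constants `WINDOW`, `BASE`, `MOD`, and `AVERAGE`, and the pseudo-random test data, are arbitrary choices for this example.

```javascript
const WINDOW = 16 // bytes the rolling hash looks at
const BASE = 257
const MOD = 1000000007
const AVERAGE = 64 // a cut happens on average about every 64 bytes

// BASE^WINDOW mod MOD, used to drop the byte that leaves the window.
let POW = 1
for (let i = 0; i < WINDOW; i++) POW = (POW * BASE) % MOD

// Cuts wherever the hash of the last WINDOW bytes hits a fixed pattern,
// so cut points depend on nearby content rather than on absolute offsets.
function chunk (buf) {
  const chunks = []
  let hash = 0
  let start = 0
  for (let i = 0; i < buf.length; i++) {
    const out = i >= WINDOW ? buf[i - WINDOW] : 0
    hash = (((hash * BASE + buf[i] - out * POW) % MOD) + MOD) % MOD
    if (hash % AVERAGE === 0) {
      chunks.push(buf.subarray(start, i + 1))
      start = i + 1
    }
  }
  if (start < buf.length) chunks.push(buf.subarray(start))
  return chunks
}

// Deterministic pseudo-random bytes so the example is repeatable.
function pseudoRandomBytes (n, seed) {
  const out = Buffer.alloc(n)
  let state = seed
  for (let i = 0; i < n; i++) {
    state = (state * 48271) % 2147483647
    out[i] = state & 0xff
  }
  return out
}

const original = pseudoRandomBytes(4096, 1)
const insertAt = 2048
const edited = Buffer.concat([
  original.subarray(0, insertAt),
  Buffer.from('a new part'),
  original.subarray(insertAt)
])

const before = chunk(original)
const after = new Set(chunk(edited).map(c => c.toString('hex')))
const reused = before.filter(c => after.has(c.toString('hex'))).length
console.log(`${reused} of ${before.length} chunks unchanged`)
```

With fixed-size pieces, the same insertion would shift every later piece; here only chunks near the insertion point can differ, because each cut decision looks at just the last `WINDOW` bytes.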

  78. npm install rabin

  79. Each Rabin chunk is an entry in our append only log

  80. Data #0 Data #1 Data #2 …

  81. Merkle trees + Rabin = ❤

  82. Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 Data #1 Data #2 Data #3

  83. Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 * Data #1 Data #2 Data #3 Change some data

  84. Hash #3 Hash #1 Hash #5 Hash #0 Hash #2 Hash #4 Hash #6 Data #0 * Data #1 Data #2 Data #3 Change some data Rabin makes sure these entries do not change

  85. Only a few hashes change * Hash #3 * Hash #1 Hash #5 Hash #0 * Hash #2 Hash #4 Hash #6 Data #0 * Data #1 Data #2 Data #3 Change some data

  86. Keep an index Hash Data Data … Data

  87. See the same hash twice, just copy the data Hash Data

  88. See the same hash twice, just copy the data Hash Data (no need to re-download it)

  89. See the same hash twice, just copy the data Hash Data (no need to re-download it) (can be … easily … optimised for space)
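
The index idea could be sketched as a map from hash to data. The `resolve` function and the `download` callback are hypothetical names for this example and are not the hyperdrive API.

```javascript
const crypto = require('crypto')

const hashOf = chunk => crypto.createHash('sha256').update(chunk).digest('hex')

// Local index: hash -> data we already have.
const index = new Map()

// Resolves a list of wanted hashes, downloading only what is not indexed yet.
// `download` is a hypothetical callback that fetches a chunk by its hash.
function resolve (wantedHashes, download) {
  let downloads = 0
  const data = wantedHashes.map(wanted => {
    if (!index.has(wanted)) {
      const chunk = download(wanted)
      // Never trust the network: check the chunk against the hash we asked for.
      if (hashOf(chunk) !== wanted) throw new Error('chunk does not match its hash')
      index.set(wanted, chunk)
      downloads++
    }
    // Seen before: just copy the local data.
    return index.get(wanted)
  })
  return { data, downloads }
}

// A pretend remote where the chunk 'aaa' appears twice.
const remoteChunks = ['aaa', 'bbb', 'aaa', 'ccc']
const remote = new Map(remoteChunks.map(c => [hashOf(c), c]))

const result = resolve(remoteChunks.map(hashOf), hash => remote.get(hash))
console.log(result.downloads) // 3
```

Four chunks are requested but only three are fetched, because the second `'aaa'` is served from the index.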

  90. npm install hyperdrive

  91. (demo)

  92. Dat is a CLI tool and desktop app that manages hyperdrives

  93. (demo)

  94. Great apps built on Dat

  95. Beaker browser https://github.com/beakerbrowser/beaker

  96. Science Fair https://github.com/codeforscience/sciencefair
