Comparing P2P Systems
Anthony D. Joseph, John Kubiatowicz
CS294-4
Why so many systems?
• Many different types of target users
• Many different types of environments
• Many design choices
• Many hazards
• Many data types
• Many ….
Networks
• Chord
• CAN
• Tapestry
• Pastry
• Kademlia
• Viceroy
• Bamboo
• …
• Similar interfaces
  – DHT, DOLR
• Different design goals
  – Locality, Topology
  – Fault-tolerance
Systems We’ve Read About
• Freenet
• Publius
• SFS
• Bayou
• FARSITE
• Logistical Networking
• Pangaea
• Pastiche
• Gia
• OceanStore
• PAST
• Squirrel
• CFS
• Ivy
• PeerDB
• PIER
Systems 1
• Freenet
  – Anon, cens. resistant storage
  – Objects ref’d by SHA-1 hash over content (GUID-CHK)
  – Objs named by GUID-Signed Subspace Key pointing to CHKs
  – Steepest Hill Climbing query routing with TTL
  – Space allocated by popularity
  – Power-law node degrees
  – Tolerates up to 30% failure
• Publius
  – FT, anon, censorship resistant storage
  – Tamper evident, src anon, updatable, deniable
  – Persistent, extensible
  – Splits enc key into k shares
  – Retrieve k shares for content
  – Static mapping of share locations to servers
  – Indirection-based (file) update mechanism vulnerable to server compromise
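The Freenet bullet about referencing objects by a SHA-1 hash over their content can be illustrated with a small sketch. It is not Freenet's actual key format or API: the names `content_hash_key`, `verify`, `insert`, and the in-memory `store` dictionary are hypothetical, and the sketch only shows why a key derived from content makes tampering detectable.

```python
import hashlib

# Hypothetical in-memory stand-in for the storage network (assumption).
store = {}

def content_hash_key(data: bytes) -> str:
    """Return the hex SHA-1 digest of the content, used here as its lookup key."""
    return hashlib.sha1(data).hexdigest()

def insert(data: bytes) -> str:
    """Store the content under the key derived from its own bytes."""
    key = content_hash_key(data)
    store[key] = data
    return key

def verify(key: str, data: bytes) -> bool:
    """Accept retrieved content only if it hashes back to the requested key."""
    return content_hash_key(data) == key

key = insert(b"some document")
print(verify(key, store[key]))          # stored bytes still match the key
print(verify(key, b"some d0cument"))    # altered bytes do not
```

Because the key is a function of the bytes alone, any node can check a retrieved object without trusting the node that served it.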
Systems 2
• SFS
  – Auth, secure, encrypted client-server storage and access control
  – ACL-based auth of individuals, groups, and groups of groups
  – Caching for speed and availability
• Bayou
  – Replicated P2P DB
    • Atomic operations
    • Whole DB replication
  – Operation-based updates
  – Tentative local commits enforced by primary global commit
    • Apps ctl data view
  – Gossip-based info propagation
  – Merge procedures for per-write conflict resolution
Systems 3
• FARSITE
  – P2P storage
  – Max size ~10^5
  – Large-scale read-only sharing, small-scale read/write-sharing
    • Complex lease mechanism
  – Assumes user auth infra
  – Byzantine ring formed for each namespace
  – Reliability and availability through whole file replication
• Logistical Networking
  – Network storage layer
  – IBP: unreliable, transient byte-arrays on depots
  – Aggregation into exNodes
    • Can implement arbitrary reliability mechanisms
    • Analog to Unix inodes
Systems 4
• Pangaea
  – Server-based replication
  – Assumes trusted servers
  – Two-levels of servers:
    • Gold
      – Fully connected clique
      – Strong maintenance
    • Bronze
      – Limited connectivity
  – Last writer wins conflict resolution
• Pastiche
  – P2P data replication for whole machine backup
  – Built on Pastry
  – Enc storage of immutable chunked data
  – Network distance or coverage based buddy choices
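"Last writer wins" conflict resolution, listed for Pangaea, can be sketched in a few lines. This is a generic illustration rather than Pangaea's mechanism: the `Version` record, the use of a numeric timestamp, and the replica-id tie-break are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Version:
    timestamp: float   # assumed write time; clock source is out of scope here
    replica_id: str    # assumed tie-breaker when timestamps are equal
    value: str

def lww_merge(a: Version, b: Version) -> Version:
    """Keep the version with the later timestamp; break ties by replica id."""
    return max(a, b, key=lambda v: (v.timestamp, v.replica_id))

old = Version(timestamp=1.0, replica_id="r1", value="draft")
new = Version(timestamp=2.0, replica_id="r2", value="final")
print(lww_merge(old, new).value)
```

The merge is order-independent, so replicas that see the same two writes converge on the same value, at the cost of silently discarding the earlier write.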
Systems 5
• Gia
  – Modified Gnutella protocol
  – Argues against DHTs for this search type
    • Transient P2P clients
    • Keyword-based searches
    • Searching for hay instead of needles
  – Capacity-based topology adaptation
  – Flow-ctrl for queries
• OceanStore
  – Wide-area CS/P2P replicated, robust, secure, auth data storage
  – Built on Tapestry, Bamboo
  – Byzantine update commit
  – Per-write conflict resolution
  – Erasure coding based replication (robustness) with block caching (performance)
Systems 6
• PAST
  – P2P archival storage model
    • No updates
    • Whole-file storage
  – Tries to balance per-node storage load (assumes = 100x diff)
  – Replica and file diversion to maintain k copies
• Squirrel
  – Decentralized P2P web caching
  – Homestore model: stores content at home and client nodes
  – Directory model: use recent clients
Systems 7
• CFS
  – P2P file storage
    • Lease-based
    • Read-only for clients
    • Publishers can update
    • No explicit delete
  – Built on Chord
  – Storage load-balancing
  – Provably efficient and robust
  – Built on DHASH xface
    • File split into blocks
    • k replication
• Ivy
  – R/W P2P file storage
  – Log-based, built on DHASH
  – Snapshot and view-based approach
  – User control over consistency/serialization
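The CFS bullets "File split into blocks" and "k replication" can be made concrete with a rough sketch. It assumes each block is keyed by the SHA-1 hash of its content and that replicas go to the k nodes that follow the key on a hash ring. The function names, the 8192-byte block size, and the node names are illustrative choices, not details taken from the slides.

```python
import hashlib
from bisect import bisect_left

def ring_id(data: bytes) -> int:
    """Map bytes to a point on the identifier ring via SHA-1."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def split_blocks(data: bytes, block_size: int) -> list:
    """Cut the file into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def successors(ring: list, key: int, k: int) -> list:
    """Return up to k node ids clockwise from key; ring must be sorted and non-empty."""
    start = bisect_left(ring, key) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(k, len(ring)))]

def place_file(data: bytes, node_names: list, block_size: int = 8192, k: int = 3) -> dict:
    """Map each block key to the names of the k nodes assumed to hold a replica."""
    names_by_id = {ring_id(name.encode()): name for name in node_names}
    ring = sorted(names_by_id)
    placement = {}
    for block in split_blocks(data, block_size):
        key = ring_id(block)
        placement[key] = [names_by_id[n] for n in successors(ring, key, k)]
    return placement

placement = place_file(b"example file contents" * 1000, ["n1", "n2", "n3", "n4"])
for key, holders in placement.items():
    print(hex(key)[:10], holders)
```

Since block keys are spread over the ring by the hash, the blocks of one large file land on different nodes, which is one way to read the "Storage load-balancing" bullet.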
Systems 8
• PeerDB
  – P2P DB
  – No global schema
  – Incomplete replication
  – Dynamic reconfiguration
  – Requires small subset of persistent servers
• PIER
  – On path to a P2P DB
    • Built on CAN and others
  – Relaxed consistency
  – Scalable with namespace model
  – Std schemas
  – Several join schemas
Evaluation Metrics
• Commit model (e.g., primary, group, all)
• Information propagation model (e.g., flood, epidemic, multicast)
• Topology
• Search model (e.g., targeted, flood, epidemic, multicast)
  – Expressiveness
  – Information placement/autonomy
• Scalability
• Target user?
• Reliability / robustness (i.e., data that is eventually available)
• Availability (i.e., data that is always available)
• Quality of service
• Anonymity/privacy
• Censorship-resistance
• Publisher/Server deniability
• File integrity
• File authenticity
Metrics from class
• Maintainability / Manageability
• Topology
  – Roles: client, supernode, server
  – Defense against selfish/malicious behaviors
• Node heterogeneity
  – Function, capabilities, ownership, dynamic election/configuration
• Indirection between obj lookup and routing
• Application semantics used in routing
• Data type / structured data
• Trust model (physical vs virtual)
  – Authentication
  – Authorization
  – Admission control
• Integrity
• Denial of svc resilience
• Scope of knowledge
• Needle vs Hay
  – False negatives
• Static resilience vs MTTR
• Performance under churn
• Emergent behaviors
• Non-data services
  – GRID computing