
Storage management and caching in PAST

Antony Rowstron and Peter Druschel. Presented to cs294-4 by Owen Cooper.


  1. Outline
     • PAST goals
     • PAST api
     • File storage overview
     • File and replica diversion
     • Replica management
     • Caching
     • Performance
     • Discussion

     Storage management and caching in PAST
     Antony Rowstron and Peter Druschel
     Presented to cs294-4 by Owen Cooper

     PAST (non)goals
     • P2P global storage network
       – Use properties of existing p2p systems (Pastry)
       – Support for strong persistence
         • Via a core set of replicas
       – High availability
         • Via local caching
       – Scalable
         • Obtain high storage utilization via local cooperation
       – Secure
     • Design goals do not include
       – Replacing the file system
       – Updatable files
       – Directory or lookup service

     Security Model
     • Pastry node ids are a hash of a public key
     • Smartcard based security
       – Provides keys
       – Quota management
     • Nodeid and fileid generation controlled
       – Try to stop nodes from getting consecutive ids
       – Or clients from overloading parts of the network
     • But node id and real world identity may not be linked
     • Data not encrypted

  2. PAST APIs
     • In PAST, files are immutable
     • Fileid=Insert(filename, credentials, k, file)
       – Insert k copies of the file into the network, or fail.
       – Fileid is a signed (filename, credentials, salt)
       – Successful if ack with receipts from k nodes
     • File=lookup(fileid)
       – Return a copy of the file if it exists
     • Reclaim(fileid, credentials)
       – Reclaim accepted if requested by the owner
       – Allows, but does not require, storage reclamation

     File insertion
     • Insert(name, c, k, file)
       – Computes a storage certificate
         • Contains fileid, hash of content, k, salt
       – Deducts k*filesize from quota
       – Routes file and storage certificate via Pastry, using the fileid as the key.
       – Node verifies the integrity of the file, stores it, and asks the k-1 closest nodes to store the file.
         • K-1 nodes in leaf set (k-1 <= l)
       – Node returns ack with k signed storage receipts, or a nak.

     Lookup and Reclamation
     • Pastry ensures replica is found
       – Since a lookup is routed to the closest nodeid
     • Reclamation
       – Client generates a reclaim certificate
       – Sends it to the fileid via Pastry
       – Recipients verify the certificate & issue receipt
       – Client reclaims quota

     Diversion
     • A file or replica can be relocated
     • For a replica, to another close node
       – If one of the K closest is overloaded
     • For a file, to another set of nodes in the idspace
       – If the nodes around a fileid are (possibly locally) congested
     • Why is this necessary?
       – Differing storage capacity at nodes
       – Differing file size for inserted files
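The client-side bookkeeping described on the insertion and reclamation slides can be sketched as follows. This is a minimal illustration, not PAST's implementation: the slides do not say how the fileid is derived beyond its inputs (filename, credentials, salt), so the sketch assumes a SHA-1 hash over those three fields, with the credentials stood in for by a byte string `owner_key`. The names `StorageCertificate`, `make_fileid`, `Client`, `prepare_insert` and `reclaim` are all hypothetical, and signing, routing and receipts are left out.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCertificate:
    """Fields listed on the slide: fileid, hash of content, k, salt."""
    fileid: bytes
    content_hash: bytes
    k: int
    salt: bytes


def make_fileid(filename: str, owner_key: bytes, salt: bytes) -> bytes:
    # Assumption: fileid is a SHA-1 hash of (filename, credentials, salt).
    # Each field is length-prefixed so field boundaries stay unambiguous.
    h = hashlib.sha1()
    for part in (filename.encode("utf-8"), owner_key, salt):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


class Client:
    """Hypothetical client that tracks its own storage quota."""

    def __init__(self, owner_key: bytes, quota: int):
        self.owner_key = owner_key
        self.quota = quota

    def prepare_insert(self, filename, k, data, salt=None):
        """Build the certificate and deduct k * filesize from the quota."""
        if salt is None:
            salt = os.urandom(8)
        cost = k * len(data)
        if cost > self.quota:
            raise ValueError("quota exceeded")
        fileid = make_fileid(filename, self.owner_key, salt)
        cert = StorageCertificate(fileid, hashlib.sha1(data).digest(), k, salt)
        self.quota -= cost
        return cert

    def reclaim(self, cert, file_size):
        """Credit the quota back once reclaim receipts have been collected."""
        self.quota += cert.k * file_size
```

Because the salt is an input to the fileid, choosing a new salt maps the same file to a different region of the id space, which is what file diversion relies on.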

  3. Replica Diversion
     • Node responsible for fileid asks k-1 neighbors to store the file
     • Neighbor (N) may divert a copy to a node in its leaf set
       – Pointer to copy inserted at N
       – N issues storage certificate
       – N also inserts a pointer on the k+1th closest node
         • No orphan if N fails
     • N remains responsible for pointer maintenance

     File Diversion
     • Replica diversion is local
       – Allows storage choice between nodes around fileid
     • File Diversion
       – Triggered when an insert with a fileid fails
       – Insert is tried a total of three times
       – New fileid generated by changing the salt

     Storage Policy
     • How does a node choose to accept or reject a replica?
       – Computes sizeof(file)/sizeof(free_space)
       – Compares to T_pri or T_div, depending on the node's role
       – T_pri > T_div
     • How is a node chosen for replica diversion?
       – Search leaf set for the node that
         • Has maximal free space
         • Doesn't already hold a diverted or primary replica
     • File diversion
       – K copies cannot be located (via primary or diversion)

     Replica maintenance
     • Node join/leave causes responsibility shift
       – Pastry node failure detection will cause leaf set updates
         • PAST detects responsibility shifts this way
     • Newly responsible node must copy files
       – Make a copy immediately, OR
       – Keep a pointer to the old owner & copy lazily
     • Diverted replicas
       – Target of diversion may move out of leaf set
         • Node to store replica can be any one in leaf set
       – Must exchange keepalive messages themselves
       – Should be relocated
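The storage policy slide can be read as a short decision procedure, sketched below. Several things here are assumptions rather than statements from the slides: a replica is taken to be rejected when sizeof(file)/sizeof(free_space) exceeds the threshold, the values of `T_PRI` and `T_DIV` are illustrative only (the slides give just the ordering T_pri > T_div), and `Node`, `accepts` and `choose_diversion_target` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative thresholds only; the slides state just that T_pri > T_div.
T_PRI = 0.1
T_DIV = 0.05


@dataclass
class Node:
    name: str
    free_space: int
    # fileids this node already holds as a primary or diverted replica
    held: set = field(default_factory=set)

    def accepts(self, file_size: int, diverted: bool) -> bool:
        """Compare file_size / free_space with the threshold for this role."""
        if self.free_space <= 0:
            return False
        threshold = T_DIV if diverted else T_PRI
        # Assumption: reject when the ratio exceeds the threshold.
        return file_size / self.free_space <= threshold


def choose_diversion_target(leaf_set: List[Node], fileid: str,
                            file_size: int) -> Optional[Node]:
    """Pick the leaf-set node with maximal free space that holds no replica
    of this file yet; return None if that node would reject the replica."""
    candidates = [n for n in leaf_set if fileid not in n.held]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: n.free_space)
    return best if best.accepts(file_size, diverted=True) else None
```

With the stricter threshold applied to diverted replicas, a node keeps more of its remaining space for files whose primary replica it is responsible for. A `None` result corresponds to the case where replica diversion fails and the insert may fall back to file diversion.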

  4. Replica maintenance (2)
     • Node failure may cause storage shortage
       – No node in leaf set can take over ownership
     • Search space is widened
       – Ask most extreme nodes to locate storage
         • Increases search space to 2l nodes
       – If no storage space found, fail.

     Caching
     • Pastry's locality-based routing will tend to direct requests to nearby copies
     • PAST also stores cached copies
       – Along routing path between client and fileid
       – For insert and lookup operations
       – Cache maintained using GD-size algorithm
         • Weight per file: 1/size(file)
         • Eviction:
           – Pick file with minimum weight
           – Subtract weight of evicted file from all others

     Experiments: without diversion
     • Experiments use
       – Large trace from web server
       – Files from local web server
     • The case for diversion with web trace
       – Without diversion:
         • 51.1% of insertions failed
         • 60.8% storage utilization

     Experiments (2): with diversion
     • With diversion
       – Bigger leaf set size a plus
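The eviction rule on the caching slide can be sketched directly. This is a literal, unoptimized reading of the two bullets (weight 1/size, evict the minimum, subtract its weight from the rest), not PAST's actual cache code. Restoring a file's weight to 1/size on a hit is an assumption taken from the usual GreedyDual-Size formulation, and `GDSizeCache` and `access` are hypothetical names.

```python
class GDSizeCache:
    """Toy GD-size cache keyed by fileid; capacity and sizes are in bytes."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.sizes = {}    # fileid -> size
        self.weights = {}  # fileid -> current weight

    def access(self, fileid: str, size: int) -> bool:
        """Return True on a cache hit, False on a miss."""
        if size <= 0:
            raise ValueError("size must be positive")
        if fileid in self.weights:
            # Assumption: a hit restores the weight to 1/size.
            self.weights[fileid] = 1.0 / self.sizes[fileid]
            return True
        if size > self.capacity:
            return False  # too large to cache at all
        while self.used + size > self.capacity:
            self._evict()
        self.sizes[fileid] = size
        self.weights[fileid] = 1.0 / size
        self.used += size
        return False

    def _evict(self) -> None:
        # Pick the file with minimum weight ...
        victim = min(self.weights, key=self.weights.get)
        w = self.weights.pop(victim)
        self.used -= self.sizes.pop(victim)
        # ... and subtract its weight from all remaining files.
        for f in self.weights:
            self.weights[f] -= w
```

Since the weight is 1/size, large files are evicted first, and the subtraction step ages files that have not been accessed recently. An efficient implementation would typically keep a running offset and a priority queue instead of touching every entry on each eviction.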

  5. Experiments (3): Varying T_pri
     • Effects of varying T_pri

     Experiments (4): Varying T_div
     • Varying T_div
     • T_pri is constant

     File and Replica Diversion
     • # files stored vs. size of file

     Caching
     • 8 traces combined
     • Requests from clients in each trace are mapped to close PAST nodes
