
File System Design for an NFS File Server Appliance



  1. File System Design for an NFS File Server Appliance
     Dave Hitz, James Lau, and Michael Malcolm
     Technical Report TR3002, NetApp, 2002
     http://www.netapp.com/tech_library/3002.html
     (At WPI: http://www.wpi.edu/Academics/CCC/Help/Unix/snapshots.html)

     Introduction
     • An appliance is a device designed to perform a specific function
     • Networking trend has been to use appliances instead of general-purpose computers. Examples:
       – routers from Cisco and Avici
       – network terminals
       – network printers
     • New type of network appliance is an NFS file server

     Introduction: NFS
     • NFS File Server Appliance file systems have different requirements than those for a general-purpose file system
       – NFS access patterns different than local file access patterns
     • Network Appliance Corporation uses Write Anywhere File Layout (WAFL)

     Introduction: WAFL
     • WAFL has 4 requirements
       – Fast NFS service
       – Support large file systems (10s of GB) that can grow (can add disks)
       – Provide high performance writes and support RAID
       – Restart quickly, even after unclean shutdown
     • NFS and RAID both strain write performance:
       – NFS server must respond that data is written
       – RAID must write parity bits also

     Outline
     • Introduction (done)
     • Snapshots: User Level (next)
     • WAFL Implementation
     • Snapshots: System Level
     • Performance
     • Conclusions

     Introduction to Snapshots
     • WAFL's claim to fame
     • WAFL creates and deletes snapshots automatically at preset times
       – Up to 255 at once
     • Copy-on-write to avoid duplicating blocks in the active file system
     • Uses:
       – Users can recover files
       – Sys admins can create backups from running system
       – Restart quickly after unclean shutdown
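The copy-on-write idea from the Introduction to Snapshots slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not WAFL's on-disk format: `ToyVolume` and its methods are hypothetical names, and a flat name-to-block map stands in for the tree of blocks under the root inode.

```python
class ToyVolume:
    """Toy copy-on-write volume: a snapshot copies only the root map."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes, never overwritten in place
        self.active = {}     # file name -> block number (stands in for the root)
        self.snapshots = {}  # snapshot name -> frozen copy of the root
        self._next_block = 0

    def write(self, name, data):
        # Always write to a fresh block, so a block that a snapshot still
        # references is left untouched.
        self.blocks[self._next_block] = data
        self.active[name] = self._next_block
        self._next_block += 1

    def snapshot(self, snap_name):
        # Copy the root only; the data blocks themselves are shared.
        self.snapshots[snap_name] = dict(self.active)

    def read(self, name, snap=None):
        root = self.active if snap is None else self.snapshots[snap]
        return self.blocks[root[name]]


vol = ToyVolume()
vol.write("todo", b"v1")
vol.snapshot("hourly.0")
vol.write("todo", b"v2")  # the active file system moves on; the snapshot does not
```

After the second write, reading `todo` from the active file system and from `hourly.0` gives different contents, while the volume holds only two data blocks in total.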

  2. Snapshot Administration
     • The WAFL server allows commands for sys admins to create and delete snapshots, but typically done automatically
     • WPI, snapshots of /home:
       – 7:00 AM, 10:00, 1:00, 4:00, 7:00, 10:00, 1:00 AM
       – Nightly snapshot at midnight every day
       – Weekly snapshot is made on Sunday at midnight every week
     • Thus, always have: 7 hourly, 7 daily snapshots, 2 weekly snapshots

         claypool 32 ccc3=>>pwd
         /home/claypool/.snapshot
         claypool 33 ccc3=>>ls
         hourly.0/ hourly.3/ hourly.6/ nightly.2/ nightly.5/ weekly.1/
         hourly.1/ hourly.4/ nightly.0/ nightly.3/ nightly.6/
         hourly.2/ hourly.5/ nightly.1/ nightly.4/ weekly.0/

     User Access to Snapshots
     • Suppose accidentally removed file named "todo":

         spike% ls -lut .snapshot/*/todo
         -rw-r--r-- 1 hitz 52880 Oct 15 00:00 .snapshot/nightly.0/todo
         -rw-r--r-- 1 hitz 52880 Oct 14 19:00 .snapshot/hourly.0/todo
         -rw-r--r-- 1 hitz 52829 Oct 14 15:00 .snapshot/hourly.1/todo
         -rw-r--r-- 1 hitz 55059 Oct 10 00:00 .snapshot/nightly.4/todo
         -rw-r--r-- 1 hitz 55059 Oct 9 00:00 .snapshot/nightly.5/todo

     • Can then recover most recent version:

         spike% cp .snapshot/hourly.0/todo todo

     • Note, snapshot directories (.snapshot) are hidden in that they don't show up with ls

     Outline
     • Introduction (done)
     • Snapshots: User Level (done)
     • WAFL Implementation (next)
     • Snapshots: System Level
     • Performance
     • Conclusions

     WAFL File Descriptors
     • Inode based system with 4 KB blocks
     • Inode has 16 pointers
     • For files smaller than 64 KB:
       – Each pointer points to data block
     • For files larger than 64 KB:
       – Each pointer points to indirect block
     • For really large files:
       – Each pointer points to doubly-indirect block
     • For very small files, data kept in inode instead of pointers

     WAFL Meta-Data
     • WAFL stores meta-data in files
       – Inode file – stores inodes
       – Block-map file – stores free blocks
       – Inode-map file – identifies free inodes

     Zoom of WAFL Meta-Data (Tree of Blocks)
     • Root inode must be in fixed location
     • Other blocks can be written anywhere
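The size thresholds on the WAFL File Descriptors slide follow from simple arithmetic: 16 pointers times 4 KB blocks gives 64 KB of directly addressable data. A small sketch of that calculation is below. The 4-byte pointer size (`POINTER_SIZE`) is an assumption that the slides do not state, and the function names are hypothetical; the case of very small files stored inside the inode is ignored.

```python
BLOCK_SIZE = 4 * 1024   # 4 KB blocks (from the slides)
INODE_POINTERS = 16     # pointers per inode (from the slides)
POINTER_SIZE = 4        # bytes per block pointer: an ASSUMPTION, not in the slides
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE


def max_file_size(levels):
    """Largest file addressable when every inode pointer uses `levels`
    levels of indirection (0 = pointers go straight to data blocks)."""
    return INODE_POINTERS * PTRS_PER_BLOCK ** levels * BLOCK_SIZE


def indirection_levels(file_size):
    """Smallest number of indirection levels that can address file_size bytes."""
    levels = 0
    while file_size > max_file_size(levels):
        levels += 1
    return levels


print(max_file_size(0))                   # bytes reachable with direct pointers
print(indirection_levels(64 * 1024 + 1))  # one byte past the direct limit
```

Under the 4-byte pointer assumption, one level of indirection covers files up to 16 × 1024 × 4 KB; a different pointer size would shift that threshold but not the 64 KB direct limit.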

  3. Snapshots (1 of 2)
     • Copy root inode only
     • Over time, snapshot references more and more data blocks that are not used
     • Rate of file change determines how many snapshots you want to store

     Snapshots (2 of 2)
     • When disk block modified, must modify indirect pointers as well
     • Batch, to improve I/O performance

     Consistency Points (1 of 2)
     • In order to avoid consistency checks after unclean shutdown, WAFL creates special snapshot called a consistency point every few seconds
       – Not accessible via NFS
     • Batched operations are written each consistency point
     • In between consistency points, data only written to RAM

     Consistency Points (2 of 2)
     • WAFL use of NVRAM
       – NFS requests are logged to NVRAM
         • NVRAM has batteries to avoid losing during poweroff
       – Upon unclean shutdown, re-apply NFS requests to last consistency point
       – Upon clean shutdown, create consistency point and turn off NVRAM
     • Note, typical FS uses NVRAM for write cache
       – Uses more NVRAM space (WAFL logs are smaller)
         • Ex: "rename" needs 32 KB, WAFL needs 150 bytes
         • Ex: write 8 KB needs 3 blocks (data, inode, indirect pointer), WAFL needs 1 block (data) plus 120 bytes for log
       – Slower response time than WAFL

     Write Allocation
     • Write times dominate NFS performance
       – Read caches at client are large
       – 5x as many write operations as read at server
     • WAFL batches write requests
     • WAFL allows write anywhere, enabling inode next to data
       – Typical FS has inode information and free blocks at fixed location
     • WAFL allows writes in any order since uses consistency points
       – Typical FS writes in fixed order to allow fsck to work

     Outline
     • Introduction (done)
     • Snapshots: User Level (done)
     • WAFL Implementation (done)
     • Snapshots: System Level (next)
     • Performance
     • Conclusions
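The interplay between consistency points and the NVRAM request log on the Consistency Points slides can be illustrated with a toy model. This is a sketch under assumptions, not NetApp's implementation: `ToyServer`, its method names, and the two request types (`write`, `remove`) are hypothetical, and Python dictionaries stand in for disk, RAM, and NVRAM.

```python
class ToyServer:
    """Toy model: disk holds the last consistency point, NVRAM holds the
    requests received since then, RAM holds the current state."""

    def __init__(self):
        self.on_disk = {}    # state as of the last consistency point
        self.in_ram = {}     # current state; lost on an unclean shutdown
        self.nvram_log = []  # requests since the last consistency point

    @staticmethod
    def _apply(state, op, name, data):
        if op == "write":
            state[name] = data
        elif op == "remove":
            state.pop(name, None)

    def handle(self, op, name, data=None):
        # Log the request itself (small), then change state in RAM only.
        self.nvram_log.append((op, name, data))
        self._apply(self.in_ram, op, name, data)

    def consistency_point(self):
        # Batched changes reach disk together; the log can then be emptied.
        self.on_disk = dict(self.in_ram)
        self.nvram_log.clear()

    def crash_and_recover(self):
        # RAM is lost; start from the last consistency point and re-apply
        # the logged requests in order.
        self.in_ram = dict(self.on_disk)
        for op, name, data in self.nvram_log:
            self._apply(self.in_ram, op, name, data)


srv = ToyServer()
srv.handle("write", "a", b"1")
srv.consistency_point()
srv.handle("write", "b", b"2")
srv.handle("remove", "a")
srv.crash_and_recover()
```

Logging requests rather than modified blocks is what keeps the log small, which matches the slide's comparison of 150 bytes against 32 KB for a rename.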

  4. The Block-Map File
     • Typical FS uses bit for each free block, 1 is allocated and 0 is free
       – Ineffective for WAFL since may be other snapshots that point to block
     • WAFL uses 32 bits for each block

     Creating Snapshots
     • Could suspend NFS, create snapshot, resume NFS
       – But can take up to 1 second
     • Challenge: avoid locking out NFS requests
     • WAFL marks all dirty cache data as IN_SNAPSHOT
       – NFS requests can read system data, modify data not IN_SNAPSHOT
       – Data not IN_SNAPSHOT not flushed to disk
     • Must flush IN_SNAPSHOT data as quickly as possible

     Flushing IN_SNAPSHOT Data
     • Flush inode data first
       – Keeps two caches for inode data, so can copy system one to inode data file, unblocking most NFS requests (requires no I/O since inode file flushed later)
     • Update block-map file
       – Copy active bit to snapshot bit
     • Write all IN_SNAPSHOT data
       – Restart any blocked requests
     • Duplicate root inode and turn off IN_SNAPSHOT bit
     • All done in less than 1 second, first step in 100s of ms

     Outline
     • Introduction (done)
     • Snapshots: User Level (done)
     • WAFL Implementation (done)
     • Snapshots: System Level (done)
     • Performance (next)
     • Conclusions

     Performance (1 of 2)
     • Compare against NFS systems
     • Best is SPEC NFS
       – LADDIS: Legato, Auspex, Digital, Data General, Interphase and Sun
     • Measure response times versus throughput
       (Typically, care for knee in curve)
     • (Me: System Specifications?!)

     Performance (2 of 2)
     [Figure]
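The 32-bit block-map entries from The Block-Map File slide, together with the "copy active bit to snapshot bit" step, can be sketched as a bitmask per block. The bit layout here is an assumption made for illustration (bit 0 for the active file system, bits 1 to 31 for snapshots); the slides only say that each block gets 32 bits. `BlockMap` and its methods are hypothetical names.

```python
ACTIVE_BIT = 0  # ASSUMED layout: bit 0 = active file system, bits 1..31 = snapshots


class BlockMap:
    """One 32-bit entry per block; a block is free only when no bit is set."""

    def __init__(self, nblocks):
        self.entries = [0] * nblocks

    def allocate(self, blk):
        self.entries[blk] |= 1 << ACTIVE_BIT

    def release_active(self, blk):
        # The active file system no longer uses the block; snapshots may.
        self.entries[blk] &= ~(1 << ACTIVE_BIT)

    def take_snapshot(self, snap_bit):
        # "Copy active bit to snapshot bit" for every block.
        if not 1 <= snap_bit <= 31:
            raise ValueError("snapshot bit must be in 1..31")
        for blk, entry in enumerate(self.entries):
            if entry & (1 << ACTIVE_BIT):
                self.entries[blk] = entry | (1 << snap_bit)
            else:
                self.entries[blk] = entry & ~(1 << snap_bit)

    def delete_snapshot(self, snap_bit):
        mask = ~(1 << snap_bit)
        self.entries = [entry & mask for entry in self.entries]

    def is_free(self, blk):
        return self.entries[blk] == 0


bm = BlockMap(4)
bm.allocate(2)
bm.take_snapshot(1)
bm.release_active(2)          # file deleted in the active file system
still_held = not bm.is_free(2)  # the snapshot still references block 2
bm.delete_snapshot(1)
```

This shows why a single allocated/free bit is not enough: block 2 can only be reused once both the active file system and every snapshot have dropped it.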

  5. NFS vs. New File Systems
     [Figure: Response Time (Msec/Op) versus Generated Load (Ops/Sec); series: 10 MPFS Clients, 5 MPFS Clients & 5 NFS Clients, 10 NFS Clients]
     • Remove NFS server as bottleneck
     • Clients write directly to device

     Conclusion
     • NetApp works and is stable
     • Consistency points simple, reducing bugs in code
     • Easier to develop stable code for network appliance than for general system
       – Few NFS client implementations and limited set of operations so can test thoroughly
