Adaptive policies for balancing performance and lifetime of mixed SSD arrays through workload sampling



  1. Adaptive policies for balancing performance and lifetime of mixed SSD arrays through workload sampling
     Sangwhan Moon, A. L. Narasimha Reddy, Texas A&M University

  2. Outline
     • Introduction
       – Mixed SSD arrays
       – Workload distribution of a mixed SSD array
     • Problem statement
     • Selective caching policies
     • Our approach
       – Online sampling
       – Adaptive workload distribution
     • Evaluation
     • Conclusion

  3. Different classes of SSDs
     [Figure: cost ($/GB) versus Device Writes Per Day (DWPD, higher is better), both axes spanning roughly 0.1 to 100, showing where low-end and high-end SSDs fall.]

  4. Mixed SSD array
     • High-end SSDs as cache
       – Faster: PCIe interface
       – Reliable: SLC / eMLC (write endurance = 100K)
       – Expensive per gigabyte
     • Low-end SSDs as main storage
       – Slower: Serial ATA interface
       – Less reliable: MLC / TLC (write endurance < 30K)
       – Cheap per gigabyte

  5. Workload distribution of a mixed SSD array: LRU caching policy
     • Notation
       – r, w: read/write workload
       – m_r, m_w: cache read/write miss rates
       – N_C, N_S: number of cache/storage SSDs
       – C_C, C_S: capacity of each cache/storage SSD
       – l_C, l_S: write endurance of cache/storage SSDs
       – d: fraction of evicted cache entries that are dirty
     • Data flow between the high-end SSDs (cache) and the low-end SSDs (storage)
       1. r: reads arrive at the cache
       2. w: writes arrive at the cache
       3. m_r * r: read misses are served from storage
       4. m_r * r: read-miss data is filled into the cache
       5. (m_r * r + m_w * w) * d: dirty entry eviction from cache to storage
     • Writes per flash cell and lifetime
       – w_C = (m_r * r + w) / (N_C * C_C)
       – w_S = (m_w * w) / (N_S * C_S)
       – Lifetime = min(l_C / w_C, l_S / w_S)
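To make the model concrete, here is a minimal Python sketch of the lifetime estimate as reconstructed above; the function names, the MB/GB conversion, and the seconds-per-year constant are my own additions, not from the slides.

```python
MB_PER_GB = 1024
SECONDS_PER_YEAR = 365 * 24 * 3600

def writes_per_cell(write_rate_mb_s, num_ssds, capacity_gb):
    """Full-device writes per second: aggregate write rate divided by total flash capacity."""
    return write_rate_mb_s / (num_ssds * capacity_gb * MB_PER_GB)

def array_lifetime_years(r, w, m_r, m_w, N_C, C_C, l_C, N_S, C_S, l_S):
    """Lifetime of a mixed SSD array under the LRU caching model above.

    r, w     -- read/write workload (MB/s)
    m_r, m_w -- cache read/write miss rates
    N_C, C_C -- number and per-device capacity (GB) of cache SSDs
    N_S, C_S -- number and per-device capacity (GB) of storage SSDs
    l_C, l_S -- write endurance (P/E cycles) of cache and storage SSDs
    """
    w_C = writes_per_cell(m_r * r + w, N_C, C_C)   # host writes plus read-miss fills
    w_S = writes_per_cell(m_w * w, N_S, C_S)       # write traffic reaching the storage tier
    return min(l_C / w_C, l_S / w_S) / SECONDS_PER_YEAR
```

Whichever tier has the higher writes-per-cell rate relative to its endurance determines the array lifetime, which is what the next slide's numbers illustrate.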

  6. Workload distribution of a mixed SSD array
     • Example: 1 high-end SSD as cache for 3 low-end SSDs
     • Configuration
       – High-end SSD (SLC): capacity 100 GB, write endurance 100K
       – Low-end SSD (MLC): capacity 200 GB, write endurance 10K
       – Workload: read/write rate 100 / 250 MB/s, read/write cache hit rate 50% / 15%, read/write request length 4 KB / 64 KB
     • Resulting writes per flash cell
       – High-end cache: w_C = (0.5 * 100 MB/s + 250 MB/s) / (1 * 100 GB)
       – Low-end storage: w_S = (0.85 * 250 MB/s) / (3 * 200 GB)
     • Lifetime = min(1.47 years, 6.34 years)

  7. Problem statement
     • The high-end SSD cache can wear out faster than the low-end SSD main storage
       – Caching less results in poor performance
       – Caching more results in poor reliability
     • Static workload classifiers can be less efficient
     • The characteristics of the workload can change over time
     • Objective: balance the performance and lifetime of cache and storage at the same time
       – Metric: latency over lifetime (less is better)

  8. Selective caching policies
     • Request-size-based caching policy
       – I/O requests whose sizes are 4 KB dominate the workload
     • Hotness-based caching policy
       – 90% of the workload is referenced once and never accessed again
     • Static workload classifiers cannot distribute the workload across cache and storage precisely
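For illustration, a minimal Python sketch of the two static classifiers is shown below; the 4 KB size cutoff and the two-reference hotness cutoff are hypothetical placeholders chosen for the example, not thresholds given in the talk.

```python
from collections import defaultdict

class SizeBasedPolicy:
    """Admit a request into the cache only if it is small enough (hypothetical 4 KB cutoff)."""
    def __init__(self, max_bytes=4096):
        self.max_bytes = max_bytes

    def admit(self, request_bytes):
        return request_bytes <= self.max_bytes

class HotnessBasedPolicy:
    """Admit a block only after it has been referenced enough times (hypothetical cutoff of 2)."""
    def __init__(self, min_refs=2):
        self.min_refs = min_refs
        self.refs = defaultdict(int)

    def admit(self, block_id):
        self.refs[block_id] += 1
        return self.refs[block_id] >= self.min_refs
```

Both classifiers make a fixed, workload-independent decision, which is exactly why they struggle when 4 KB requests dominate the trace or when most blocks are touched only once.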

  9. Selective caching policies
     • Probabilistic caching policy: control the trade-off between performance and lifetime
       – p (threshold): the probability of caching data
       – Larger p: the cache wears out faster, performance improves
       – Smaller p: the cache wears more slowly, performance degrades
     • Data flow (frontend cache over backend storage)
       1. h_r * r: read hits are served by the cache
       2. m_r * r: read misses are served from storage
       3. m_r * p * r: a fraction p of read-miss data is filled into the cache
       4. p * w: a fraction p of writes goes to the cache
       5. (1 - p) * w: the remaining writes bypass the cache to storage
       6. m_w * w * p: dirty entry eviction from the cache to storage
       7. m_r * (1 - p) * r: the remaining read-miss data bypasses the cache
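A minimal sketch of the probabilistic policy, assuming the caching decision is an independent coin flip with probability p for each write and each read-miss fill; the class, helper names, and the `cache`/`storage` interfaces are my own assumptions for the example.

```python
import random

class ProbabilisticPolicy:
    """Cache incoming writes and read-miss fills with probability p; bypass otherwise."""
    def __init__(self, p, seed=None):
        self.p = p                      # larger p: faster cache wear, better performance
        self.rng = random.Random(seed)

    def admit(self):
        return self.rng.random() < self.p

# `cache` and `storage` are assumed tier objects exposing read(block_id) and
# write(block_id, data); they stand in for the high-end and low-end SSDs.

def handle_write(policy, cache, storage, block_id, data):
    # Flows 4 and 5 on the slide: p*w goes to the cache, (1 - p)*w bypasses it.
    if policy.admit():
        cache.write(block_id, data)
    else:
        storage.write(block_id, data)

def handle_read_miss(policy, cache, storage, block_id):
    # Flow 2 fetches from storage; flows 3 and 7 decide whether to fill the cache.
    data = storage.read(block_id)
    if policy.admit():
        cache.write(block_id, data)
    return data
```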

  10. Online sampling
      • Sampling rate: 10% of the workload, split into ten 1% sampling caches; the remaining 90% goes to the selective cache
      • Each sampling cache runs LRU in front of the main storage with a fixed caching probability p (0.1, 0.2, ..., 1.0)
      • Estimate latency over lifetime for each sampling cache
      • Employ the best value of p, the probability of caching, in the 90% selective cache
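One way to carve out the sampling caches is to hash block addresses into buckets: a minimal sketch, assuming ten 1% shards pinned to p = 0.1 ... 1.0 and a 90% main partition that adopts whichever p currently shows the lowest latency-over-lifetime estimate. The hashing scheme and helper names are my own assumptions, not details from the talk.

```python
import zlib

P_VALUES = [round(0.1 * k, 1) for k in range(1, 11)]   # fixed p for the ten sampling caches
NUM_BUCKETS = 1000                                      # 100 buckets (10%) for sampling, 900 for the main cache

def shard_for(block_id):
    """Map a block to a 1% sampling cache (returning its fixed p) or to the main cache (None)."""
    bucket = zlib.crc32(str(block_id).encode()) % NUM_BUCKETS
    if bucket < 100:
        return P_VALUES[bucket // 10]   # ten shards of 10 buckets (1%) each
    return None                         # the 90% selective cache

def best_p(metric_by_p):
    """Pick the p whose sampling cache currently shows the lowest latency over lifetime."""
    return min(metric_by_p, key=metric_by_p.get)
```

Because each shard is defined by a hash of the block address, every sampling cache sees a statistically similar 1% slice of the workload, which is what makes comparing the per-p estimates meaningful.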

  11. Simulation environment
      • Trace-driven simulator
      • Microsoft Research Cambridge block I/O traces
        – Traces of 13 enterprise applications over one week
      • Cache provisioning = 5%
        – Cache size / storage size
      • Unique data size of workload / storage size = 0.5
      • Caching policies compared
        – LRU, size-based (+ sampling), hotness-based (+ sampling), probabilistic (+ sampling)

  12. Adaptive threshold
      [Figure: latency-over-lifetime metric (log scale, roughly 0.1 to 10) versus caching threshold (0.9 down to 0.1) for two traces, hardware monitoring and web server, with "cache less" and "cache more" regions marked; the sampling-based analysis is shown alongside the static-threshold-based analysis.]

  13. Different workload traces
      • Overall, reduced latency over lifetime by 60%
        – Very effective on some traces (mds, stg, web, prn, usr, proj, src1, src2)
        – Less effective on very skewed workloads (wdev, rsrch, ts, hm, prxy)

  14. Different sampling rates
      • A higher sampling rate results in more accurate estimation (beneficial) but leaves less space for the adaptive cache (harmful)

  15. Conclusion
      • We showed that a high-end SSD cache can wear out faster than low-end SSD main storage.
      • We proposed sampling-based selective caching to balance the performance and lifetime of cache and storage.
      • Trace-based simulation showed that the proposed caching policy is effective.

  16. Q & A
