FIOS: A Fair, Efficient Flash I/O Scheduler
Stan Park, Kai Shen
University of Rochester
Background
- Flash is widely available as mass storage, e.g. SSDs
- $/GB still dropping: affordable high-performance I/O
- Deployed in data centers as well as low-power platforms
- Adoption continues to grow, yet there is very little work on robust I/O scheduling for Flash
- Synchronous writes are still a major factor in I/O bottlenecks
Flash: Characteristics & Challenges
- No seek latency, low latency variance
- Small I/O granularity: the Flash page, 2–8KB
- Large erase granularity: the Flash block, 64–256 pages
- Architectural parallelism
- Erase-before-write limitation
- Read/write I/O asymmetry
- Wide variation in performance across vendors!
Motivation
- Disk is slow → scheduling has largely been performance-oriented
- Flash scheduling for high performance ALONE is easy (just use noop)
- Now fairness can be a first-class concern
- Fairness must account for unique Flash characteristics
Motivation: Prior Schedulers
- Fairness-oriented schedulers: Linux CFQ, SFQ(D), Argon
  - Lack Flash awareness and appropriate anticipation support
  - Linux CFQ, SFQ(D): fail to recognize the need for anticipation
  - Argon: overly aggressive anticipation
- Flash I/O scheduling: write bundling, write-block preference, and page-aligned request merging/splitting
  - Limited applicability to modern SSDs; performance-oriented
Motivation: Read-Write Interference
[Figure: probability densities of read response time on the Intel SSD, Vertex SSD, and CompactFlash. Reads running alone all respond quickly; with a concurrent write, read response times inflate markedly (on 0–2ms, 0–0.6ms, and 0–300ms scales, respectively).]
Fast read response is disrupted by interfering writes.
Motivation: Parallelism
[Figure: speedup over serial I/O vs. number of concurrent I/O operations (1–64) for reads and writes on the Intel and Vertex SSDs.]
SSDs can support varying levels of read and write parallelism.
Motivation: I/O Anticipation Support
- Reduces potential seek cost for mechanical disks...
- ...but has a largely negative performance effect on Flash
- Flash has no seek latency: no need for anticipation?
- Yet no anticipation can result in unfairness: premature service switching, read-write interference
Motivation: I/O Anticipation Support
[Figure: read and write slowdown ratios under Linux CFQ (no anticipation), SFQ(D) (no anticipation), and full-quantum anticipation.]
Lack of anticipation can lead to unfairness; aggressive anticipation makes fairness costly.
FIOS: Policy
- Fair timeslice management: the basis of fairness
- Read-write interference management: account for Flash I/O asymmetry
- I/O parallelism: recognize and exploit SSD internal parallelism while maintaining fairness
- I/O anticipation: prevent disruption to the fairness mechanisms
FIOS: Timeslice Management
- Equal timeslices: an amount of time to access the device
- Timeslice usage is non-contiguous; multiple tasks can be serviced simultaneously
- A collection of timeslices forms an epoch; an epoch ends when:
  - no task with a remaining timeslice issues a request, or
  - no task has a remaining timeslice
(A sketch of this bookkeeping follows.)
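A minimal userspace C sketch of the epoch/timeslice bookkeeping described above. The names (struct fios_task, epoch_expired, and so on) are illustrative assumptions, not taken from the FIOS implementation.

```c
#include <stdbool.h>
#include <stddef.h>

struct fios_task {
    long slice_remaining_us;   /* device time left in this epoch */
    bool has_pending_io;       /* task is currently issuing requests */
};

/* An epoch ends when no task that still holds a timeslice is issuing
 * requests, or when every timeslice is exhausted. */
static bool epoch_expired(const struct fios_task *tasks, size_t n)
{
    bool any_slice = false, any_active_slice = false;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].slice_remaining_us > 0) {
            any_slice = true;
            if (tasks[i].has_pending_io)
                any_active_slice = true;
        }
    }
    return !any_slice || !any_active_slice;
}

/* Starting a new epoch grants every task an equal timeslice. */
static void epoch_start(struct fios_task *tasks, size_t n, long slice_us)
{
    for (size_t i = 0; i < n; i++)
        tasks[i].slice_remaining_us = slice_us;
}
```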
FIOS: Interference Management
[Figure: probability densities of read response time with a concurrent write on the Intel SSD, Vertex SSD, and CompactFlash.]
- Reads are faster than writes → interference penalizes reads more
- Preference for servicing reads
- Delay writes until reads complete (a dispatch-check sketch follows)
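A small sketch of the write-delay decision, assuming a scheduler that tracks queued and in-flight reads; the struct and function names are hypothetical.

```c
#include <stdbool.h>

struct dev_state {
    int reads_queued;     /* reads waiting to dispatch */
    int reads_inflight;   /* reads issued but not yet completed */
};

/* Writes are held back while any read is pending or in flight, since
 * an interfering write inflates read latency far more than a read
 * delays a write. */
static bool may_dispatch_write(const struct dev_state *s)
{
    return s->reads_queued == 0 && s->reads_inflight == 0;
}
```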
FIOS: I/O Parallelism
- SSDs utilize multiple independent channels
- Exploit internal parallelism when possible, while minding timeslice and interference management
- Parallel cost accounting: a new problem in Flash scheduling
  - Linear cost model, using the time to service a given request size
  - Probabilistic fair sharing: share perceived device time usage among concurrent users/tasks (see the sketch below):
      Cost = T_elapsed / P_issuance
    where T_elapsed is the request's elapsed time from issuance to completion, and P_issuance is the number of outstanding requests (including the new request) at issuance time
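The cost formula translates directly into code. Below is an illustrative C sketch of the per-request accounting; the struct and function names are assumptions, not FIOS's actual interfaces.

```c
struct request_acct {
    double issue_time_s;         /* timestamp at issuance */
    int    outstanding_at_issue; /* P_issuance: in-flight requests,
                                    including this one, at issue time */
};

/* Charge the issuing task its probabilistic share of perceived device
 * time: the request's elapsed wall time divided by the concurrency it
 * ran under (Cost = T_elapsed / P_issuance). */
static double request_cost(const struct request_acct *r,
                           double complete_time_s)
{
    double elapsed = complete_time_s - r->issue_time_s; /* T_elapsed */
    return elapsed / (double)r->outstanding_at_issue;
}
```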
FIOS: I/O Anticipation - When to Anticipate?
- Anticipatory I/O was originally used to improve disk performance by countering deceptive idleness: wait for a desirable request
- On Flash, anticipatory I/O is instead used to preserve fairness
- Deceptive idleness may break:
  - timeslice management
  - interference management
FIOS: I/O Anticipation - How Long to Anticipate?
- Must be much shorter than the typical idle period used for disks
- The relative anticipation cost is bounded by α: the idle period is
      T_service * α / (1 − α)
  where T_service is a per-task exponentially weighted moving average of per-request service time (default α = 0.5)
- e.g. I/O → anticipation → I/O → anticipation → I/O → ···
(A sketch of this bound follows.)
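A C sketch of this bound. The EWMA smoothing weight (EWMA_WEIGHT below) is an assumption, since the slide only specifies the cost bound α; with α = 0.5 the idle period equals one average service time, so anticipation consumes at most half of the task's perceived device time.

```c
#define FIOS_ALPHA  0.5  /* relative anticipation cost bound (slide default) */
#define EWMA_WEIGHT 0.5  /* smoothing weight; an assumption, not from the slide */

/* Exponentially weighted moving average of per-request service time. */
static double update_service_ewma(double ewma_s, double sample_s)
{
    return EWMA_WEIGHT * sample_s + (1.0 - EWMA_WEIGHT) * ewma_s;
}

/* Maximum idle (anticipation) period: T_service * alpha / (1 - alpha),
 * which keeps idle / (idle + service) <= alpha. */
static double max_anticipation_s(double service_ewma_s)
{
    return service_ewma_s * FIOS_ALPHA / (1.0 - FIOS_ALPHA);
}
```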
Implementation Issues
- Linux's coarse tick timer → high-resolution timer for I/O anticipation (a sketch follows)
- Inconsistent synchronous-write handling across the file system and I/O layers
- ext4 nanosecond timestamps lead to excessive metadata updates for write-intensive applications
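For the first item, a hedged sketch of how an anticipation timeout could be armed with the standard Linux hrtimer interface; the callback and function names are illustrative, not FIOS's actual code.

```c
#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <linux/types.h>

static struct hrtimer antic_timer;

/* The anticipation window elapsed without the anticipated request
 * arriving: the scheduler would resume normal dispatching here. */
static enum hrtimer_restart antic_expire(struct hrtimer *t)
{
    return HRTIMER_NORESTART;
}

static void antic_timer_setup(void)
{
    hrtimer_init(&antic_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
    antic_timer.function = antic_expire;
}

/* Arm a nanosecond-resolution anticipation timeout, rather than
 * rounding up to the coarse scheduler tick. */
static void antic_timer_arm(u64 idle_ns)
{
    hrtimer_start(&antic_timer, ns_to_ktime(idle_ns), HRTIMER_MODE_REL);
}
```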
Results: Read-Write Fairness
[Figure: average read and write latency slowdown ratios on the Intel SSD for a 4-reader/4-writer workload, with and without thinktime, under raw device I/O, Linux CFQ, SFQ(D), Quanta, and FIOS; the proportional-slowdown level is marked.]
Only FIOS provides fairness with good efficiency under differing I/O load conditions.
Results: Beyond Read-Write Fairness
[Figure: mean read and write latency slowdown ratios for a 4-reader/4-writer workload, and mean latency slowdown ratios for a 4KB-reader and 128KB-reader workload, on the Vertex SSD, under raw device I/O, Linux CFQ, SFQ(D), Quanta, and FIOS; the proportional-slowdown level is marked.]
FIOS achieves fairness not only under read-write asymmetry but also with requests of varying cost.
Results: SPECweb Co-run with TPC-C
[Figure: response-time slowdown ratios for SPECweb and TPC-C co-running on the Intel SSD, under raw device I/O, Linux CFQ, SFQ(D), Quanta, and FIOS.]
FIOS exhibits the best fairness compared to the alternatives.
Results: FAWNDS (CMU, SOSP'09) on CompactFlash
[Figure: task slowdown ratios for FAWNDS hash gets and hash puts, under raw device I/O, Linux CFQ, SFQ(D), Quanta, and FIOS; the proportional-slowdown level is marked.]
FIOS also applies to low-power Flash and provides efficient fairness.
Conclusion
- Fairness and efficiency in Flash I/O scheduling
  - Fairness is a primary concern
  - New challenge: fairness AND high efficiency (parallelism)
  - I/O anticipation is ALSO important, for fairness
- I/O scheduler support must be robust in the face of varied performance and evolving hardware
- Read/write fairness and BEYOND: may support other resource principals (e.g. VMs in the cloud)