Become an SSD expert in minutes!
Ryan Smith
ryan.smith@ssi.samsung.com
408-205-8889
What is an SSD?
• SSD = Solid State Drive
• RAM-based versions introduced in the 1970s
• Flash-based versions introduced in the 1990s
• Today, it typically uses NAND Flash
• 2012 is a big year for SSDs
• Don’t complicate it: it’s just a really fast drive!
[Chart: # of SSDs sold (millions), 2010 to 2012, by segment: PC, Server, Storage. Source: Samsung]
2 / ? YYYY.MM.DD / 홍길동 책임 / xxxxxx 팀
Why an SSD?
• Three things dictate the speed of your PC/Server: CPU, DRAM, and HDD
• Everything is speeding up, except the HDD
• Processor: multi-core, higher bandwidth
• Memory: larger footprint, higher bandwidth
• Storage: minor throughput improvements, currently solved with spindles
• Closing the gap with Solid State Storage
[Chart: Performance over Time for Processor, Memory, and Storage]
Why an SSD?
• Lower response times (latency)
• Higher IOPS and throughput
• Lower power
• No RVI issues, more reliable

Random Performance (IOPS), SM825 vs. 15K RPM HDD
Workload    SM825    vs. HDD
Read        43K      x100
70:30       23K      x60
Write       11K      x30
Test environment: Intel SR2600UR Server / IOMeter2008

Power Consumption (Watt), SM825 vs. 15K RPM HDD
State     15K RPM HDD    SM825    Change
Idle      8.5            1.1      -87%
Active    12.6           3.2      -75%
Test environment: Intel SR2600UR Server / IOMeter2008 / 4KB RND R70:W30
Source: Samsung
So what’s there to know about an SSD?
SSD Key Characteristics
• SSD Components
• NAND Characteristics
• P/E Cycles
• WAF
• TBW
• SMART
• Host Interface
• Sustained vs. Peak Performance
• Benchmarking
SSD Influencers
• TRIM
• Over-provisioning
• Changing Workload
SSD Key Characteristics
SSD Components
• Host/NAND Controller
• Firmware
• NAND Flash
• DRAM
• Capacitors (optional)
All components work closely together
[Image: SSD board with labels for Host Interface, Controller, Firmware, DRAM, and NAND Flash. Source: Anandtech]
NAND Characteristics
Types of NAND (ordered from PC to Enterprise use)
• TLC: 500-1K P/E cycles, 1 year retention
• MLC: 3-5K P/E cycles, 1 year retention
• E-MLC: 10-30K P/E cycles, 3 month retention
• SLC: 90-100K P/E cycles, 3 months to 1 year retention
Geometry / Lithography
• 4xnm, 3xnm, 2xnm
• Smaller = Less Cost
NAND Hierarchy
• Pages: Smallest unit that can be read/written (e.g., 8KB)
• Erase block: Group of pages (e.g., 64 pages @ 8KB = 512KB)
P/E Cycles
Program / Erase Cycles: the number of times a given NAND cell can be programmed and erased
• As geometries shrink, error correction must get better
• It’s like a car warranty!
  • 3 years or 50,000 miles
  • 3 years or 3,000 P/E cycles
• Not a useful characteristic by itself
[Chart: ECC Requirements by geometry: 3xnm, 2xnm, 2ynm]
Write Amplification Factor (WAF)
Bytes written to NAND versus bytes written from the PC/Server

WAF = (Bytes written to NAND) / (Bytes written from Host)

• WAF 1 means 1MB from the host writes 1MB to NAND
• WAF 5 means 1MB from the host writes 5MB to NAND
• Factors that can affect WAF:
  • Flash Translation Layer (FTL)
  • Wear Leveling
  • Controller
  • Over-provisioning
  • Garbage Collection
  • Write Profile (Random vs. Sequential)
  • Host Application
  • Free user space / TRIM
Write Amplification (WAF) Example
• The example below illustrates a WAF of 6
• The host wants to update LBA 0 with new data Z (4KB from the host)
• The erase block in flash holds pages A B C D E F (A is LBA 0) and has no more free pages, so the entire block needs to be erased
Steps over time:
1. Read existing data (A B C D E F) to cache
2. Replace A with Z in cache (Z B C D E F)
3. Erase the entire block
4. Write the modified page and the old pages back to flash (24KB to NAND)
Result: 24KB to NAND / 4KB from host = WAF of 6
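The read-modify-write sequence in this example can be sketched in a few lines of Python. This is only an illustration of the slide's scenario, not how any particular controller works: the function name `rewrite_full_block` and the 4KB page size are assumptions chosen to match the 4KB host write and 24KB NAND write shown above.

```python
PAGE_KB = 4  # assumed page size, chosen to match the 4KB host write in the example


def rewrite_full_block(block, index, new_data):
    """Update one page of a full erase block via read-modify-write.

    Returns the new block contents and the KB physically written to NAND.
    (Hypothetical helper for illustration only.)
    """
    cache = list(block)              # 1. read existing pages into cache
    cache[index] = new_data          # 2. modify the target page in cache
    flash = [None] * len(block)      # 3. erase the entire block
    for i, page in enumerate(cache):
        flash[i] = page              # 4. write every page back to flash
    nand_kb = len(cache) * PAGE_KB   # all pages were rewritten
    return flash, nand_kb


block = ["A", "B", "C", "D", "E", "F"]       # full erase block, A is LBA 0
new_block, nand_kb = rewrite_full_block(block, 0, "Z")
host_kb = PAGE_KB                            # the host only sent one page
waf = nand_kb / host_kb
print(new_block, nand_kb, waf)
```

Real drives usually avoid this worst case by writing the new page elsewhere and cleaning up later with garbage collection, which is why the factors listed for WAF matter.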
TBW
TeraBytes Written: the number of terabytes you can write to the drive over its useful life

TBW = ((Capacity GB / 1000) x P/E Cycles) / WAF

Examples:
((128GB / 1000) * 3000) / 5 = 76.8 TBW
((128GB / 1000) * 3000) / 2.5 = 153.6 TBW
((256GB / 1000) * 3000) / 5 = 153.6 TBW
((128GB / 1000) * 30000) / 5 = 768 TBW
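As a quick check of the formula, here is a minimal Python sketch that reproduces the four examples. The function name `tbw` and its parameter names are made up for this illustration.

```python
def tbw(capacity_gb, pe_cycles, waf):
    """Terabytes written over the drive's life: (GB / 1000) x P/E cycles / WAF."""
    return capacity_gb * pe_cycles / 1000 / waf


# (capacity in GB, P/E cycles, WAF) for the four examples on this slide
examples = [(128, 3000, 5), (128, 3000, 2.5), (256, 3000, 5), (128, 30000, 5)]
results = [tbw(cap, pe, w) for cap, pe, w in examples]

for (cap, pe, w), value in zip(examples, results):
    print(f"{cap}GB, {pe} P/E cycles, WAF {w}: {value:.1f} TBW")
```

Note how halving the WAF and doubling the capacity have the same effect on TBW, while moving to NAND with 10x the P/E cycles gives 10x the TBW.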
SMART
• Look at health and various statistics
• Allows for predictable maintenance windows
• Calculate WAF, TBW

Host GB written = [ID241] / 2 / 1024 / 1024 (assuming 512-byte LBAs)
NAND GB written = [ID177] * Capacity GB
WAF = NAND GB / Host GB
Expected Life (yrs) = Warranty PE * ([ID9] / 24 / 365) / [ID177]

ID     Attribute Name
5      Reallocated Sector Count
9      Power-on Hours
12     Power-on Count
177    Wear Leveling Count
179    Used Reserved Block Count
180    Unused Reserved Block Count
181    Program Fail Count
182    Erase Fail Count
187    Uncorrectable Error Count
195    ECC Error Count
199    CRC Error Count
241    Total LBA Written
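The SMART calculations can be sketched as a small Python function. This is a sketch under stated assumptions: it treats ID241 as a count of 512-byte LBAs (so dividing by 2, 1024, and 1024 gives GB) and ID177 as the average number of P/E cycles used so far. The function name `smart_estimates` and the sample attribute values are invented for illustration; they are not readings from a real drive.

```python
def smart_estimates(attrs, capacity_gb, warranty_pe):
    """Estimate WAF and expected life from raw SMART values.

    attrs maps SMART attribute ID -> raw value (hypothetical input format).
    """
    host_gb = attrs[241] / 2 / 1024 / 1024   # assumes 512-byte LBAs
    nand_gb = attrs[177] * capacity_gb       # P/E cycles used x capacity
    waf = nand_gb / host_gb
    years_in_service = attrs[9] / 24 / 365   # power-on hours -> years
    expected_life_years = warranty_pe * years_in_service / attrs[177]
    return host_gb, nand_gb, waf, expected_life_years


# Made-up sample: 10,000 GB written by the host, 300 P/E cycles used,
# one year of power-on time, on a 128GB drive rated for 3,000 P/E cycles.
sample = {241: 20_971_520_000, 177: 300, 9: 8760}
host_gb, nand_gb, waf, life = smart_estimates(sample, capacity_gb=128, warranty_pe=3000)
print(f"Host GB: {host_gb:.0f}, NAND GB: {nand_gb:.0f}, WAF: {waf:.2f}, life: {life:.1f} yrs")
```

Attribute IDs and their units vary by vendor, so check the drive's documentation before relying on these conversions.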
Host Interface
• This is how you communicate with the SSD
• So many choices:
  • SATA
  • SAS
  • PCIe (NVMe, SCSIe, SATAe, Proprietary)
Which is right for you?
• PC: SATA, PCIe
• Server: SATA, SAS, PCIe
• External Storage: SATA + SAS bridge, SAS
Sustained vs. Peak Performance
• There can be significant differences between sustained and peak performance
• Run an enterprise benchmark (e.g., SNIA RTP 2.0)
• Or even better, run your own workload (or a simulated one)
• There is a BIG difference between “Value” and “Mainstream/Enterprise” SSDs when you have any degree of writes in your workload
[Chart: Samsung PM830 vs Vendor “X”, random performance @ 4KB (IOPS): 11x sustained random writes. Workloads: 4KB Ran. R/W 100/0, 65/35, 0/100 (NCQ=16)]
[Chart: Samsung PM830 vs Vendor “X”, sequential performance @ 1MB (MB/s): 2x sustained sequential writes. Workloads: 1MB Seq. R/W 100/0, 65/35, 0/100 (NCQ=16)]
Drives shown: Vendor “X” 160GB (Value SSD), PM830 128GB (Mainstream SSD), SM825 200GB
Chart callouts: results 94% to 99% below peak; “Over 10,000 IOPS!”
Source: Samsung / SNIA RTP2.0 Benchmark
Benchmarking
Run a synthetic or actual workload and take measurements

Benchmark          URL
SNIA RTP 2.0       http://www.snia.org/tech_activities/standards/curr_standards/pts
Iometer            http://sourceforge.net/projects/iometer/
ATTO Disk          http://www.attotech.com/products/product.php?sku=Disk_Benchmark
CrystalDiskMark    http://crystalmark.info/software/CrystalDiskMark/index-e.html
HD Tune Pro        http://www.hdtune.com/
AS SSD (SSD)       http://alex-is.de/PHP/fusion/downloads.php?download_id=9
Anvil (SSD)        http://thessdreview.com/latest-buzz/anvil-storage-utilities-releases-new-storage-and-ssd-benchmark/
Scripts            Run multiple “dd” instances with a best-guess workload, capturing timing/speeds
Real Workload      Capture a trace during the real workload and play it back (ioapps, blktrace/btreplay)
SSD Reviewers
• Good SSD review sites are available
SSD Influencers
TRIM
• Helps the SSD know which blocks aren’t used
• Widely supported standard: Windows, Mac OS X, Linux, hdparm
• Better sustained performance and extends TBW
• Without TRIM, the SSD only knows a block isn’t used once the same LBA is written to again
[Diagram: two panels over Time. “No TRIM needed”: the host rewrites LBA 0 (“Hi”, then “Bye”). “TRIM makes SSD aware”: the host issues TRIM for LBA 0]
Over-Provisioning
• Helps a few things:
  • Improves write performance
  • Reduces WAF, increases TBW
[Diagram: 128GB SSD split into a 100GB User Area and 28GB O/P Reserved = 28% O/P]
Base-2 to Base-10 conversion: 137,438,953,472 to 128,000,000,000 (6.9%)

Sample 128GB SSD               120GB    100GB
Over-Provisioning              7%       28%
Random Read (8K) IOPS          80K      80K
Random Write (8K) IOPS         1,800    6,300    (3.5x)
Sequential Read (64K) MB/s     500      500
Sequential Write (64K) MB/s    400      400
4KB Random WAF                 5        1.35     (-73%)
4KB Random TBW                 15       45       (3x)

These performance numbers are fictitious but do represent the actual benefits seen during tests
Change Write Workload
• Write sequentially instead of randomly to reduce WAF
• If you have control of the I/O to the disk, this will pay off

                   Random    Sequential
MLC 512GB SSD      60 TBW    1250 TBW      (20x)

• Align your writes with the page boundaries (e.g., 8KB)
  • An 8K write starting at LBA 8 spans two 8K pages: 2 pages needed
  • After changing the block alignment, an 8K write starting at LBA 16 fits one 8K page: only 1 page needed
• If alignment is too hard to implement, just increase your IO size
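The alignment effect can be checked with a small Python sketch that counts how many NAND pages a write touches. It assumes 512-byte LBAs and 8KB pages, as in the example; the function name `pages_touched` is hypothetical.

```python
LBA_BYTES = 512          # assumed LBA size
PAGE_BYTES = 8 * 1024    # 8KB NAND page, as in the example


def pages_touched(start_lba, io_bytes, page_bytes=PAGE_BYTES):
    """Count the NAND pages covered by a write of io_bytes starting at start_lba."""
    first_byte = start_lba * LBA_BYTES
    last_byte = first_byte + io_bytes - 1
    return last_byte // page_bytes - first_byte // page_bytes + 1


misaligned = pages_touched(8, 8 * 1024)      # 8K write at LBA 8 (byte offset 4096)
aligned = pages_touched(16, 8 * 1024)        # 8K write at LBA 16 (byte offset 8192)
large_misaligned = pages_touched(8, 64 * 1024)  # 64K write, still misaligned

print(misaligned, aligned, large_misaligned)
```

The last case shows why a larger IO size helps when alignment is impractical: a misaligned 64K write touches 9 pages for 8 pages of data, a much smaller overhead than 2 pages for 1 page of data.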
Applications of SSDs
HDD Replacement
• Replace the boot drive or main storage
• Fastest and easiest way to experience SSDs
[Diagram: HDDs replaced by SSDs in Server and Storage]
Caching Appliance
• Read and/or write cache
• Sits between servers and storage, typically in a SAN
• Used to speed up legacy or slower storage
[Diagram: Servers, then SSD Cache, then HDD Storage]