SLIDE 1

HPIS3: Towards a High-Performance Simulator for Hybrid Parallel I/O and Storage Systems

Bo Feng, Ning Liu, Shuibing He, Xian-He Sun Department of Computer Science Illinois Institute of Technology, Chicago, IL Email: {bfeng5, nliu8}@hawk.iit.edu, {she11, sun}@iit.edu

SLIDE 2

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

Speaker: Bo Feng 2

SLIDE 3

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

SLIDE 4

To Meet the High I/O Demands

  • 1. Parallel file systems (PFS)
  • 2. Solid-state drives (SSD)

[Figure: bandwidth (MB/s) vs. request size (KB) for SSD-seq, SSD-ran, HDD-seq, and HDD-ran]

SLIDE 5

HPIS3: Hybrid Parallel I/O and Storage System Simulator

  • Parallel discrete event simulator
  • A variety of hardware and software configurations
  • Hybrid settings
  • Buffered-SSD
  • Tiered-SSD
  • HDD and SSD latency and bandwidth under parallel file systems
  • Efficient and high-performance

[Figure: a timeline of discrete events (Event 1 through Event 5) processed by the simulator]
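The discrete-event core named on this slide can be illustrated with a toy sequential engine: events sit in a priority queue ordered by timestamp and are dispatched one at a time. This is an illustration only, not HPIS3 or ROSS code, and the class and handler names are invented:

```python
import heapq

class Simulator:
    """Toy discrete-event engine: events are (timestamp, seq, handler)."""
    def __init__(self):
        self._queue = []
        self._seq = 0          # tie-breaker for events with equal timestamps
        self.now = 0.0

    def schedule(self, delay, handler):
        self._seq += 1
        heapq.heappush(self._queue, (self.now + delay, self._seq, handler))

    def run(self):
        """Pop and process events in timestamp order; return the count."""
        processed = 0
        while self._queue:
            self.now, _, handler = heapq.heappop(self._queue)
            handler(self)
            processed += 1
        return processed

# Five chained events, mirroring the Event 1..Event 5 timeline on the slide.
def make_event(n):
    def handler(sim):
        if n < 5:
            sim.schedule(1.0, make_event(n + 1))
    return handler

sim = Simulator()
sim.schedule(1.0, make_event(1))
total = sim.run()
```

A real parallel discrete-event simulator distributes this queue across processes; ROSS does so optimistically, rolling back LPs that process events out of order.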

SLIDE 6

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

SLIDE 7

Related Work

  • S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems [1]
  • A Cost-Aware Region-Level Data Placement Scheme for Hybrid Parallel I/O Systems [2]
  • On the Role of Burst Buffers in Leadership-Class Storage Systems [3]
  • iBridge: Improving Unaligned Parallel File Access with Solid-State Drives [4]
  • More…

[1] S. He, X.-H. Sun, and B. Feng, “S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems,” in Proceedings of the International Conference on Distributed Computing Systems (ICDCS), 2014.
[2] S. He, X.-H. Sun, B. Feng, X. Huang, and K. Feng, “A Cost-Aware Region-Level Data Placement Scheme for Hybrid Parallel I/O Systems,” in Proceedings of the 2013 IEEE International Conference on Cluster Computing (CLUSTER), 2013.
[3] N. Liu, J. Cope, P. Carns, C. Carothers, R. Ross, G. Grider, A. Crume, and C. Maltzahn, “On the Role of Burst Buffers in Leadership-Class Storage Systems,” in Proceedings of the 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), 2012.
[4] X. Zhang, K. Liu, K. Davis, and S. Jiang, “iBridge: Improving Unaligned Parallel File Access with Solid-State Drives,” in Proceedings of the 2013 IEEE 27th International Parallel and Distributed Processing Symposium (IPDPS), 2013.

Co-design tool for hybrid parallel I/O and storage systems

SLIDE 8

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

SLIDE 9

Design Overview

  • Platform: ROSS
  • Target: PVFS
  • Architecture Overview
  • Client LPs
  • Server LPs
  • Drive LPs
  • Note: LP is short for logical process. LPs act like real processes in the system and are synchronized by the Time Warp protocol.
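The three LP roles above can be sketched as follows. This is a hypothetical, sequential stand-in for ROSS's optimistic Time Warp scheduler, with invented handler names; it only shows how client, server, and drive LPs exchange timestamped messages:

```python
import heapq

class LP:
    """One logical process: a name, private state, and an event handler."""
    def __init__(self, name, handler):
        self.name, self.handler, self.state = name, handler, {}

def run(lps, initial_events):
    """Deliver (time, dst, payload) messages in timestamp order.
    (A sequential substitute for Time Warp, which would run LPs
    optimistically in parallel and roll back on causality violations.)"""
    queue, seq = [], 0
    for t, dst, payload in initial_events:
        seq += 1
        heapq.heappush(queue, (t, seq, dst, payload))
    while queue:
        now, _, dst, payload = heapq.heappop(queue)
        for t2, d2, p2 in lps[dst].handler(lps[dst], now, payload):
            seq += 1
            heapq.heappush(queue, (t2, seq, d2, p2))

def client(lp, now, msg):   # Client LP: issue the request to a server
    return [(now + 0.1, "server", msg)]

def server(lp, now, msg):   # Server LP: dispatch the request to its drive
    return [(now + 0.2, "drive", msg)]

def drive(lp, now, msg):    # Drive LP: account the completed I/O
    lp.state["done"] = lp.state.get("done", 0) + 1
    return []

lps = {"client": LP("client", client),
       "server": LP("server", server),
       "drive": LP("drive", drive)}
run(lps, [(0.0, "client", "write 4KB")])
```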

[Figure: architecture overview. Application I/O workloads run on client nodes (PVFS clients), which issue requests to PVFS servers, which in turn access the storage devices (HDDs and SSDs).]

SLIDE 10

File, Queue and PVFS Client Modeling

  • File request and request queue modeling
  • A request is a tuple <file_id, length, file_offset>
  • State variables define the queues
  • PVFS client modeling
  • Striping mechanism

[Figure: (a)–(c) a file request striped across multiple file servers]
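The striping mechanism can be sketched as below: one `<offset, length>` request is cut into per-server pieces, round-robin by stripe. The stripe size and server count are illustrative parameters, not PVFS defaults:

```python
def stripe_request(offset, length, stripe_size, num_servers):
    """Split one file request into (server, offset, length) pieces,
    distributing stripes round-robin across the servers."""
    pieces = []
    pos, end = offset, offset + length
    while pos < end:
        stripe_idx = pos // stripe_size             # which stripe holds pos
        server = stripe_idx % num_servers           # round-robin placement
        stripe_end = (stripe_idx + 1) * stripe_size # stripe boundary
        n = min(end, stripe_end) - pos              # bytes within this stripe
        pieces.append((server, pos, n))
        pos += n
    return pieces

# A 256 KB request with a 64 KB stripe over 4 servers: one piece per server.
pieces = stripe_request(offset=0, length=256 * 1024,
                        stripe_size=64 * 1024, num_servers=4)
```

Requests that are not stripe-aligned simply produce a short first or last piece, which is why unaligned access patterns touch extra servers.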

SLIDE 11

PVFS Server Modeling

  • Servers are connected with both clients and drives

[Figure: a PVFS server node linking client LPs to its storage drive (e.g., an SSD)]

SLIDE 12

Event flow in HPIS3: a write example

  • Write event flow for HDD
  • Single-queue effect
  • Write event flow for SSD
  • Multi-queue effect

[Figure: write event flow across client, server, and drive LPs. Events include: write, FWRITE_init, prelude, FWRITE_positive_ack, FWRITE_io_getattr, FWRITE_inspect_attr, FWRITE_io_datafile_setup, FWRITE_datafile_post_msgpairs, FWRITE_start_flow, HW_START, HW_READY, WRITE_ACK, WRITE_END, FWRITE_completion_ack, FWRITE_datafile_complete_operations, FWRITE_cleanup, FWRITE_terminate, end, release]
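A heavily simplified subset of this flow can be expressed as a transition table. The event names follow the slide, but the transitions shown are an illustrative reduction, not the exact HPIS3 state machine:

```python
# Each event maps to (handling LP, next event); None ends the flow.
WRITE_FLOW = {
    "write":                 ("client", "FWRITE_init"),
    "FWRITE_init":           ("server", "FWRITE_start_flow"),
    "FWRITE_start_flow":     ("server", "HW_START"),
    "HW_START":              ("drive",  "WRITE_END"),
    "WRITE_END":             ("drive",  "FWRITE_completion_ack"),
    "FWRITE_completion_ack": ("server", "release"),
    "release":               ("client", None),
}

def trace_write():
    """Walk the flow from the initial client request to completion,
    recording which LP handles each event."""
    event, path = "write", []
    while event is not None:
        lp, nxt = WRITE_FLOW[event]
        path.append((lp, event))
        event = nxt
    return path
```

Walking the table shows the request ping-ponging between LP roles, which is exactly what the simulator's event queue interleaves at scale.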

SLIDE 13

Storage Device Modeling: HDD vs. SSD (1)

[Figure: HDD vs. SSD write event flows. HDD (single queue): FWRITE_start_flow → HW_START → HW_READY → WRITE_ACK → WRITE_END → FWRITE_completion_ack. SSD (multiple queues): after FWRITE_start_flow and HW_START, several HW_READY → WRITE_ACK → WRITE_END sequences proceed in parallel before FWRITE_completion_ack.]

SLIDE 14

Storage Device Modeling: HDD vs. SSD (2)

Request types (both devices): sequential read (SR), sequential write (SW), random read (RR), random write (RW)

  • HDD service time components: start-up time, seek time, data transfer time
  • SSD service time components: start-up time, FTL mapping time, GC (garbage collection) time
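These cost components can be sketched as simple service-time functions. All coefficients below are invented placeholders for illustration, not measured HPIS3 parameters:

```python
def hdd_service_time(size_kb, sequential,
                     startup=0.1e-3, seek=8.0e-3, xfer_mb_s=120.0):
    """HDD model: start-up + (seek cost on random access) + transfer."""
    t = startup + (0.0 if sequential else seek)
    return t + (size_kb / 1024.0) / xfer_mb_s   # transfer time in seconds

def ssd_service_time(size_kb, sequential,
                     startup=0.05e-3, ftl=0.02e-3, gc=0.5e-3,
                     gc_prob=0.1, xfer_mb_s=250.0):
    """SSD model: start-up + FTL mapping + GC amortized by its probability
    + transfer. `sequential` is kept for a uniform signature; the access
    pattern matters far less for SSDs than for HDDs."""
    t = startup + ftl + gc_prob * gc
    return t + (size_kb / 1024.0) / xfer_mb_s
```

With these placeholder numbers, a random 64 KB write costs milliseconds on the HDD (dominated by the seek) but well under a millisecond on the SSD, matching the qualitative gap on slide 4's bandwidth chart.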
SLIDE 15

Hybrid PVFS I/O and Storage Modeling

[Figure: two hybrid settings. Buffered-SSD: the SSD servers act as a buffer in front of the HDD storage. Tiered-SSD: the SSD servers and HDD servers form separate storage tiers.]

  • S is short for SSD-server, a server node with an SSD.
  • H is short for HDD-server, a server node with an HDD.
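The two settings suggest different placement policies, which can be sketched as follows. The threshold and routing rules are assumptions for illustration, not HPIS3's actual policies:

```python
def tiered_target(request_size_kb, random_access, ssd_servers, hdd_servers):
    """Tiered-SSD (hypothetical rule): route small or random requests to
    the SSD tier, large sequential ones to the HDD tier, since HDDs only
    compete on large sequential transfers."""
    if random_access or request_size_kb <= 64:
        return ssd_servers
    return hdd_servers

def buffered_target(ssd_servers):
    """Buffered-SSD: every write lands on the SSD buffer first and is
    flushed to the HDD storage asynchronously (flushing not modeled here)."""
    return ssd_servers
```

In the tiered setting the policy choice itself becomes a tunable, which is what the case study on slide 21 exercises.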

SLIDE 16

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

SLIDE 17

Experimental setup

  • 32 nodes are used throughout the experiments in this study

65-node SUN Fire Linux cluster:
  • CPU: 2 × Quad-Core AMD Opteron 2376
  • Memory: 4 × 2 GB DDR2, 333 MHz
  • Network: 1 Gbps Ethernet
  • Storage: HDD: Seagate SATA II, 250 GB, 7200 RPM; SSD: OCZ PCI-E X4, 100 GB
  • OS: Linux kernel 2.6.28.10
  • File system: OrangeFS 2.8.6

SLIDE 18

Benchmark and Trace tool

  • IOR
  • Sequential read/write
  • Random read/write
  • IOSIG
  • Collected traces are replayed to trigger simulation events

SLIDE 19

Simulation Validity

  • 8 clients
  • 4 HDD-servers
  • 4 SSD-servers
  • Lowest error rate is 2%
  • Average error rate is 11.98%

[Figure: throughput (MB/s) vs. transfer size (4 KB to 16384 KB) for random workloads: measured HDD (H-ran) and SSD (S-ran) against simulated runs (H-sim-ran, S-sim-ran)]

SLIDE 20

Simulation Performance Study

  • 32 physical nodes
  • 2048 clients
  • 1024 servers
  • # of processes from 2 to 256

[Figure: simulator running time (sec) and event rate (events/sec) as the number of processes grows from 2 to 256]

SLIDE 21

Case study: Tiered-SSD Performance Tuning

  • 16 clients
  • 64K random requests
  • 4 HDD-servers + 4 SSD-servers
  • The original hybrid setting improves throughput by about 15% over the 4-HDD baseline
  • The tuned setting improves throughput by about 140% over the original setting

[Figure: throughput (MB/s) by configuration: 4HDD = 86.05, 4SSD = 275.67, Original = 99.92, Tuned = 240.96]

SLIDE 22

Outline

  • Introduction
  • Related Work
  • Design and Implementation
  • Experiments
  • Conclusions and Future Work

SLIDE 23

Conclusions and Future Work

  • HPIS3: a hybrid parallel I/O and storage simulation system
  • Models of PVFS clients, servers, HDDs, and SSDs
  • Validated against benchmarks
  • Minimum error rate is 2% and the average is about 11.98% in IOR tests
  • Scalable: # of processes from 2 to 256
  • Showcase of tiered-SSD settings under PVFS
  • Useful for finding optimal settings
  • Useful for self-tuning at runtime
  • Future work
  • More evaluation of tiered-SSD vs. buffered-SSD
  • Improve accuracy with more detailed models
  • Client-side settings and more

SLIDE 24

Thank you! Questions?

Bo Feng bfeng5@hawk.iit.edu
