vPFS+: Managing I/O Performance for Diverse HPC Applications
Ming Zhao, Arizona State University
Yiqi Xu, VMware
http://visa.lab.asu.edu
Virtualized Infrastructures, Systems, & Applications
Background: HPC I/O Management
Increasingly diverse HPC applications run on shared storage
o Different I/O rates, sizes, and data/metadata intensities
Lack of I/O QoS differentiation
o Parallel file systems treat all I/Os equally
[Figure: applications APP1..APPn on compute nodes issue I/Os through a generic parallel file system to storage nodes; "Mismatch!" callout]
Background: vPFS
Proxy-based interposition of application data requests
o Transparent to applications; supports different setups
Proportional I/O bandwidth scheduling using SFQ(D)
o Work conserving, strong fairness
Limitations?
[Figure: HPC applications 1..n send I/Os through a proxy running SFQ(D) to the PFS]
[Figure: throughput (MB/s) for Write vs. Read and Write vs. Random R/W at target ratios of 2:1 to 32:1; achieved ratios (e.g., 1.99:1, 3.95:1, 8.10:1, 32.73:1) closely track the targets]
Limitations: Lack of Isolation between Large and Small Workloads
SFQ(D): start-time fair queueing with I/O depth D
o Start times capture each flow's service usage
• Dispatch requests in increasing order of their start times
o D captures the available I/O parallelism
• Allow up to D outstanding requests
[Figure: flows f1..fn feed I/Os into SFQ(D) with D=10; in theory 10 slots are used, but in practice 14 are consumed, causing interference]
(A minimal sketch of the SFQ(D) dispatch logic follows.)
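To make the dispatch rule concrete, here is a minimal Python sketch of SFQ(D)-style start-time fair queueing. It is an illustration under stated assumptions, not the vPFS code: request cost is taken to equal request size, per-flow weights are fixed at construction, and the class and method names (`SFQD`, `submit`, `dispatch`, `complete`) are ours.

```python
import heapq

class SFQD:
    """Minimal sketch of SFQ(D): start-time fair queueing with depth D.
    Assumes request cost equals request size and fixed per-flow weights."""

    def __init__(self, depth, weights):
        self.depth = depth                    # D: max outstanding requests
        self.outstanding = 0                  # dispatched but not yet completed
        self.vtime = 0.0                      # virtual time = start tag of last dispatch
        self.weights = weights                # flow id -> weight (relative share)
        self.last_finish = {f: 0.0 for f in weights}
        self.queue = []                       # min-heap keyed by start tag

    def submit(self, flow, size):
        """Tag an arriving request and enqueue it in start-tag order."""
        start = max(self.vtime, self.last_finish[flow])
        self.last_finish[flow] = start + size / self.weights[flow]
        heapq.heappush(self.queue, (start, flow, size))

    def dispatch(self):
        """Issue requests in increasing start-tag order while fewer than D
        are outstanding. Note the flat cost: a 4MB request uses the same
        single slot as a 320B request -- the lack of isolation shown above."""
        sent = []
        while self.queue and self.outstanding < self.depth:
            start, flow, size = heapq.heappop(self.queue)
            self.vtime = start                # advance virtual time
            self.outstanding += 1             # one slot per request, any size
            sent.append((flow, size))
        return sent

    def complete(self):
        """Backend finished a request; free its slot."""
        self.outstanding -= 1
```

With this flat per-request cost, D large requests can load the device far more heavily than D small requests would, which is the interference the following slides address.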
Limitations: Lack of Metadata I/O Scheduling
Many HPC applications are metadata intensive
o Metadata I/O performance is important
Solution: vPFS+
SFQ(D)+
o A new scheduler to support diverse I/O sizes
Metadata I/O management
o An extension to support distributed scheduling of metadata requests
PVFS2-based real prototype
Comprehensive experimental evaluation
SFQ(D)+: Variable-Cost I/O Depth Allocation
Allocate the limited I/O depth D to outstanding requests based on their sizes
o Consider D as the number of available I/O slots
• Each slot represents the cost of the smallest I/Os
o Each outstanding request occupies one or more slots based on its size
• Stop dispatching when D is used up
Effectively protects small I/O workloads
o Low-rate I/Os wait less for large outstanding I/Os to complete
o Small I/Os are less affected by large I/Os after being dispatched
[Figure: flows f1..fn feed I/Os into the scheduler with D=10; exactly 10 slots are used, never more]
(A sketch of the slot accounting follows.)
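Below is a minimal sketch of the variable-cost slot accounting, building on the `SFQD` sketch above. The `unit` parameter (the size of the smallest I/O, worth one slot) is our assumption about how per-request cost would be normalized; it is not a detail given in the talk.

```python
import math
import heapq  # extends the SFQD sketch shown earlier

class SFQDPlus(SFQD):
    """Sketch of SFQ(D)+ slot accounting: D bounds the total cost of
    outstanding I/Os rather than their count."""

    def __init__(self, depth, weights, unit):
        super().__init__(depth, weights)
        self.unit = unit                      # size of the smallest I/O = one slot
        self.used = 0                         # slots held by outstanding requests

    def slots(self, size):
        return math.ceil(size / self.unit)    # request cost in slot units

    def dispatch(self):
        """Dispatch in start-tag order, charging each request its cost;
        stop as soon as the head request would overflow the D slots."""
        sent = []
        while self.queue:
            start, flow, size = self.queue[0]  # peek at the head request
            cost = self.slots(size)
            if self.used + cost > self.depth:
                break                          # head does not fit; stop
            heapq.heappop(self.queue)
            self.vtime = start
            self.used += cost                  # large I/Os occupy more slots
            sent.append((flow, size))
        return sent

    def complete(self, size):
        """Free every slot the finished request held."""
        self.used -= self.slots(size)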
SFQ(D)+: I/O Backfilling
Large I/Os at the head of the queue have to wait until there are enough slots
o This wastes the currently available slots
Backfilling promotes small I/Os to utilize the available slots
o Similar to the backfilling of small jobs in batch scheduling
[Figure: with D=10 and start tags t0 < t1, a large head-of-queue f1 request leaves a slot wasted (only 9 used); backfilling a small f2 request fills all 10]
(A sketch of the backfill step follows.)
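The following sketch adds the backfilling step to the `SFQDPlus` dispatch above: when the head request is too large for the remaining slots, smaller later requests that do fit are promoted, much like backfilling small jobs in batch scheduling. The scan-and-requeue structure is our simplification, not the real scheduler's bookkeeping.

```python
import heapq  # extends the SFQDPlus sketch shown earlier

def dispatch_with_backfill(self):
    """Promote small I/Os past a blocked large head request so the
    available slots are not wasted. Skipped requests keep their start
    tags, so the fair ordering is preserved for the next round."""
    sent, skipped = [], []
    while self.queue:
        start, flow, size = heapq.heappop(self.queue)
        cost = self.slots(size)
        if self.used + cost <= self.depth:
            self.vtime = max(self.vtime, start)  # never move virtual time backward
            self.used += cost
            sent.append((flow, size))
        else:
            skipped.append((start, flow, size))  # too large for now; retry later
    for req in skipped:
        heapq.heappush(self.queue, req)          # requeue in start-tag order
    return sent

SFQDPlus.dispatch = dispatch_with_backfill       # swap in the backfilling dispatch
```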
Metadata I/O Scheduling
Extends the scheduling to both data and metadata requests
o Apply SFQ(D)+ to schedule metadata I/Os on each server
o Treat metadata I/Os as small I/Os
Achieve total-metadata-service fair sharing for distributed metadata servers
o Coordinate scheduling across distributed metadata servers
o Each scheduler adjusts its scheduling of local metadata requests based on the global metadata service distribution
(A sketch of one possible coordination step follows.)
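The talk states only that each local scheduler adjusts its metadata scheduling based on the global metadata service distribution; the rule below is a hypothetical sketch of one way such an adjustment could work, and all names (`adjust_local_weights`, `global_service`, `target_shares`) are ours. Flows that lag their global share get a boosted local weight, flows that are ahead get dampened, so total metadata service across servers converges toward the targets.

```python
def adjust_local_weights(scheduler, global_service, target_shares):
    """Hypothetical coordination step, not the authors' exact algorithm.
    `global_service`: flow -> total metadata service received across all
    metadata servers (e.g., gathered periodically from the schedulers).
    `target_shares`: flow -> intended fraction of the total service."""
    total = sum(global_service.values()) or 1.0
    for flow, target in target_shares.items():
        achieved = global_service.get(flow, 0.0) / total
        ratio = target / max(achieved, 1e-9)   # > 1: flow lags its global share
        # Scale the flow's local weight by the (capped) lag ratio.
        scheduler.weights[flow] = target * min(ratio, 10.0)
```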
Evaluation
Testbed
o vPFS+ implemented for PVFS2
o 8 clients & 8 servers, 1 Gigabit Ethernet switch
Workloads
o IOR: intensive checkpointing I/Os
o multi-md-test: intensive metadata I/Os
o BTIO: scientific application benchmark
o WRF: real-world scientific application
BTIO vs. IOR
BTIO: Class C (4MB-16MB I/Os), Class A (320B I/Os)
vPFS+ substantially reduces BTIO slowdown
[Figure: BTIO and IOR performance; annotations note cases where IOR slows down by 56% and by 99%]
WRF vs. IOR
WRF: a large number of small I/Os and intensive metadata requests
vPFS+ achieves 80% and 281% better performance for WRF than Native and vPFS, respectively
Metadata I/O Scheduling
multi-md-test: mktestdir, create, write, readdir, read, close, rm, rmtestdir
vPFS+ achieves nearly perfect fairness despite the dynamic metadata demands of two metadata-intensive apps
Conclusions
I/O diversity is becoming a top concern
o Different types of requests (POSIX vs. MPI-IO, data vs. metadata)
o Different I/O rates and sizes
vPFS+ manages I/O performance for diverse apps
o SFQ(D)+ recognizes the variable cost of different I/Os and brings it under control
o Distributed metadata scheduling supports metadata-intensive applications
Future Work
Implement SFQ(D)+ directly in the data/metadata servers
o Proxy-based scheduling may incur extra latency
o But its impact on throughput is small (< 1%)
Evaluate vPFS+ in larger and more diverse environments
o Performance isolation is even more important on larger systems with more diverse workloads
o Faster storage does not eliminate the need for performance isolation: the gap between processor and I/O performance is still increasing
Acknowledgements
National Science Foundation
o CNS-1629888, CNS-1619653, CNS-1562837, CMMI-1610282, IIS-1633381
VISA Lab @ ASU
Thank you!
Ming Zhao