Chenyang Lu

Outline
  - Control-theoretic framework
  - Service delay control on Web servers
  - On-line data migration in storage servers
  - ControlWare: adaptive QoS control middleware

Online Data Migration in Storage Servers
  - Enterprise storage servers need to move data
      - System expansion
      - Application changes
      - New device
  - Always-on: e-business, global data centers
  - Online data migration
  [Figure labels: E-mail server; DB …; I/Os; Storage system; data migration; New device]

State of Practice
  - Need to bound impact on applications!
  [Figure labels: E-mail server; DB…; Slow I/Os!!!; SAN; Script; migration plan; Submover; data migration; storage devices; New device; Storage system; HP-UX LVM]

The Problem
  - Execute a given migration plan on-line
  - Challenges
      - Keep data consistent
      - Bound impact on application performance
      - Complete migration quickly

Aqueduct
  [Figure labels: E-mail server; DB…; I/Os; SAN; Aqueduct migration executor; Monitor, application latencies $\{L_i(k)\}$; Latency Contract $\{LC_i\}$; Controller; submove rate $R_m(k)$; Actuator; Migration plan; Submover; data migration; storage devices; Storage system; HP-UX LVM]

Adaptive solution
  - Feedback control loop: adapts migration speed based on application I/O latency
      - Enforce latency contract: bounded average I/O latency
      - Complete migration in the shortest time allowed by the contract
  - Standard control-theoretic design
      - Systematic methodology
      - Robust, analytically proven performance
      - Handle different workloads and devices

Quality of Service in Unpredictable Computing Environments
Monitor
  [Block diagram: Monitor, Controller, Actuator]
  - Measure applications' average I/O latency of each store in the last sampling window
  - Current implementation: trace replayer directly monitors I/O latencies
  - Can interface with performance monitoring tools (HP Openview)

Actuator
  [Block diagram: Monitor, Controller, Actuator]
  - Problem: fine-grained control of migration speed using HP-UX LVM
  - Divide store into small (32 MB) substores (LVs)
  - Submover moves a substore using LVM silvering (Mirror, Silvering, Split)
  - Actuator enforces a submove rate by sleeping
  [Figure: timelines of alternating submv and sleep intervals within each Sampling Window, at 1 submv/sw and at 2 submv/sw]

Controller
  [Block diagram: Monitor, Controller, Actuator]
  - Compute error for each store i: $E_i(k) = P \cdot LC_i - L_i(k)$
      - $0 < P < 1$: safety margin, related to burstiness
      - $k$: represents the k-th sampling window
  - Compute worst error: $E_{min}(k) = \min_i \{E_i(k)\}$
  - Integral controller computes new submove rate: $R_m(k) = R_m(k-1) + K \cdot E_{min}(k)$
      - Control gain $K$: aggressiveness of rate change

Tuning controller parameters
  - Victim latency $VL(k)$: highest average latency among all stores in the k-th sampling window
  - Approximate linear model: $VL(k+1) - VL(k) = G \, (R_m(k) - R_m(k-1))$
  - Process gain $G$: impact of submove rate changes on victim latency
  - Tuning flow
      - System profiling: estimate $G$
      - Construct transfer function
      - Control analysis: stability; tracking ($VL(k) = P \cdot LC$ in steady state); settling time
      - Compute $K$ to satisfy the analysis

Experimental setup
  [Figure labels: Aqueduct; HP-UX 11 & LVM; HP 9000-N4000 Server, 8 x 440 MHz processors; Openmail I/O Trace; Fibre Channel; FC-60 disk array (1.05 TB, 5 RAID5 Logical Units); LU 0 … LU new, holding emails and metadata; Enterprise-scale storage server]

Experiments
  - Baselines: no sleeping between (sub)moves
      - Whole-store: move one store at a time
      - Sub-store: move one substore at a time
  - Constant: steady Poisson streams
      - Replace Logical Unit; migrate three 640 MB stores
  - Openmail: trace of an enterprise e-mail server running HP Openmail
      - Add Logical Unit; migrate a 1854 MB store and a 96 MB store
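The control law on the Controller slide and the sleep-based pacing on the Actuator slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aqueduct's implementation: the function names, the dictionary-based interface, and the even spacing of submoves within a window are all hypothetical. The floor of one submove per sampling window follows the deck's remark that the submove rate must be 1 or higher.

```python
from typing import Mapping

# Submoves per sampling window; the deck states the rate must be 1 or higher.
MIN_RATE = 1.0


def worst_error(latency: Mapping[str, float],
                contract: Mapping[str, float],
                p: float) -> float:
    """Return E_min(k) = min over stores i of (P * LC_i - L_i(k))."""
    if not 0.0 < p < 1.0:
        raise ValueError("safety margin P must satisfy 0 < P < 1")
    return min(p * contract[store] - latency[store] for store in latency)


def next_rate(prev_rate: float, e_min: float, k_gain: float) -> float:
    """Integral law R_m(k) = R_m(k-1) + K * E_min(k), floored at MIN_RATE."""
    return max(MIN_RATE, prev_rate + k_gain * e_min)


def sleep_per_submove(rate: float, window_s: float, submove_s: float) -> float:
    """Seconds to sleep after each submove so `rate` submoves fit in one window.

    Assumes submoves are spaced evenly and each takes `submove_s` seconds
    (an illustrative assumption, not taken from the slides).
    """
    return max(0.0, window_s / rate - submove_s)
```

For example, with a contract of 10 ms on both stores, P = 0.8, and measured latencies of 9 ms and 6 ms, the worst error is -1 ms, so a gain of 0.36 lowers a rate of 5 submoves per window to about 4.64. A negative worst error always reduces the rate, which is how the loop backs off when any store approaches its contract.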
Openmail: victim latency
  [Plot: Average Victim Latency (ms); series Aqueduct, Sub-store, Whole-store; reference lines LC and 0.8*LC]

Measure G, Tune K
  - Process gain G: the slope of the curves
  - Control gain K
      - Constant: K = 1.09
      - Openmail: K = 0.36
  [Plot panels: Constant, Openmail]

Openmail: latency
  - Aqueduct uniformly better than baselines, but …
  [Plot: reference line LC]

Openmail: latency & submove rate
  - Load highest on new LU towards end of migration
  - By design, submove rate must be 1 or higher
  - Controller is working correctly

Openmail: average latency
Openmail: latency CDF
  [Plots: series Aqueduct, Sub-store, Whole-store; reference line LC; annotations 91% and 76%]
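The tuning steps (measure G as the slope of the profiling curves, then choose K) can be illustrated with a small Python sketch. It rests on several assumptions that are not in the slides: G is estimated by an ordinary least-squares fit, a single store is controlled so that the error is P*LC - VL(k), and the rate floor is ignored. Under those assumptions, combining the deck's linear model with its integral law gives VL(k+1) = VL(k) + G*K*(P*LC - VL(k)), whose closed-loop pole is 1 - G*K. The helper names and the pole-placement rule are hypothetical and are not claimed to be the procedure that produced K = 1.09 or K = 0.36.

```python
from typing import List, Sequence


def fit_slope(rates: Sequence[float], latencies: Sequence[float]) -> float:
    """Least-squares slope of victim latency versus submove rate (estimate of G)."""
    n = len(rates)
    mean_r = sum(rates) / n
    mean_l = sum(latencies) / n
    num = sum((r - mean_r) * (l - mean_l) for r, l in zip(rates, latencies))
    den = sum((r - mean_r) ** 2 for r in rates)
    return num / den


def gain_for_pole(g: float, pole: float) -> float:
    """Choose K so the closed-loop pole 1 - G*K equals `pole`.

    The simplified loop is stable when |pole| < 1, i.e. 0 < G*K < 2.
    """
    return (1.0 - pole) / g


def simulate(g: float, k_gain: float, target: float,
             vl0: float, steps: int) -> List[float]:
    """Iterate VL(k+1) = VL(k) + G*K*(target - VL(k)), where target is P*LC."""
    vl = vl0
    trace = [vl]
    for _ in range(steps):
        vl = vl + g * k_gain * (target - vl)
        trace.append(vl)
    return trace
```

A pole closer to 0 settles faster but reacts more sharply to measurement noise, while a pole closer to 1 changes the rate more cautiously. In this simplified loop the victim latency converges to P*LC for any stable choice, which matches the tracking requirement stated in the deck.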
Related work
  - Simpler versions of the problem
      - Take (parts of) system offline
      - Migrate data in "quiet periods"
  - Silvering in Logical Volume Manager [HP-UX LVM, VxVM]: maintain data consistency, no QoS guarantees
  - Proportional I/O scheduling: hard to handle unpredictability
  - MS Manners: no guarantees to important tasks
  - Control-theory-based systems: distributed visual tracking, Web servers, e-mail server, database, real-time processor scheduling ...

Summary
  - Migration must be executed adaptively
  - Aqueduct is neither overly aggressive
      - Average I/O latency reduced by 76%
      - Contract violation ratio reduced by 78%
  - Nor overly conservative
      - Average victim latency 15% lower than latency contract
  - Future
      - More detailed sensitivity analysis
      - Self-tuning controller
      - Multi-dimensional QoS contracts

References
  - C. Lu, G. A. Alvarez, and J. Wilkes, Aqueduct: Online Data Migration with Performance Guarantees, USENIX Conference on File and Storage Technologies (FAST), 2002.
  - G. A. Alvarez, C. Lu, and J. Wilkes, Method and System for Online Data Migration on Storage Systems with Performance Guarantees, U.S. Patent 7,167,965, January 2007.