CloudNet: Dynamic Pooling of Cloud Resources by Live WAN Migration of Virtual Machines
Timothy Wood, Prashant Shenoy (University of Massachusetts Amherst)
K.K. Ramakrishnan and Jacobus Van der Merwe (AT&T Labs - Research)
VEE 2011
Thursday, March 10, 2011
Cloud Isolation
• Cloud data centers are isolated from one another and the enterprise
• Complicates
  • Deployment
  • Security
  • Resource management
Need a way to flexibly manage IT resources across data centers
[Figure: enterprise view]
Vision: Virtual Cloud Pools
• Flexible cross data center resource pools
  • A secure collection of server, storage, and network resources
  • Seamlessly connected across cloud and enterprise data centers
  • Supports dynamic application placement across sites
[Figure: cloud sites and enterprise sites]
Cloud Pool Use Cases
• Enterprise Consolidation
  • Simplify deployment into the cloud
  • Minimize downtime and reconfiguration
• Cloud Bursting
  • Many applications cannot be easily replicated
  • WAN Migration enables dynamic placement
  • Minimize performance impact
• Follow the Sun
  • Application moves to be closer to clients or data
  • Minimize planning time and migration bandwidth cost
Dynamic Cloud Pools
• Goals:
  • Seamlessly and securely connect enterprise and cloud data centers
  • Enable efficient migration of resources between data centers
• Challenges
  • Networking: security, transparency, flexibility
  • WAN Migration: efficiency, application performance impact
[Figure: downtime / pause time (sec) versus bandwidth (Mbps, 0 to 1000) for SPECjbb, Kernel Compile, and TPC-W]
Outline
• Introduction
• Seamless Connections with VPNs
• Optimizing WAN Migration of VMs
• Implementation & Evaluation
• Conclusions
Connectivity Challenges
• Current approaches lack...
  • Security
    • Firewalls are too fine-grained and difficult to manage dynamically
  • Transparency
    • Cloud resources have their own public IP range, separate from the enterprise
  • Flexibility
    • Complex reconfiguration required to add resources or move them between sites
Seamless Data Center Connections
• CloudNet: Use Virtual Private Networks (VPNs)
  • Creates secure end-to-end network paths
  • Simpler configuration than firewalls
  • Managed by network provider with no end host configuration
• Layer 2 Virtual Private LAN Service (VPLS)
  • Bridges the local networks at multiple sites
  • Makes cloud resources look as if directly attached to enterprise LAN
  • Allows existing VM migration techniques to work over the WAN!
Dynamic VPN Endpoints
• Manipulating VPN endpoints can be slow
  • Manual process, can take days... must reduce this to seconds
• CloudNet automates VPN endpoint reconfiguration
• Centralized VPN Controller
  • Acts as route reflector between sites
  • Can adjust ruleset to modify VPN topology
  • Route updates propagated via BGP
[Figure: VPN Controller]
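The idea of a controller whose ruleset determines VPN topology can be illustrated with a toy model. This is only a sketch under assumptions: the class `ToyVpnController`, the pool and site names, and the "ruleset as a map from pool to member sites" representation are all hypothetical, and no BGP or router configuration is modeled.

```python
class ToyVpnController:
    """Toy model (hypothetical): the ruleset maps a pool name to the sites bridged into it."""

    def __init__(self):
        self.ruleset = {}

    def add_site(self, pool, site):
        # One ruleset update attaches a site to a pool.
        self.ruleset.setdefault(pool, set()).add(site)

    def remove_site(self, pool, site):
        # Removing a site is a no-op if the pool or site is unknown.
        self.ruleset.get(pool, set()).discard(site)

    def connected(self, a, b):
        # Two sites can reach each other if some pool contains both.
        return any(a in sites and b in sites for sites in self.ruleset.values())


controller = ToyVpnController()
controller.add_site("pool-1", "enterprise")
controller.add_site("pool-1", "cloud-tx")
print(controller.connected("enterprise", "cloud-tx"))  # sites share pool-1
print(controller.connected("enterprise", "cloud-il"))  # cloud-il not yet added
```

In this model, adding a cloud site to a pool is a single update at the controller rather than a per-site manual change, which is the property the slide emphasizes.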
WAN Migration Challenges
• Existing approaches not well optimized for WAN
  • Require high bandwidth, low latency links (e.g., 622 Mbps / 5 msec) [VMware/Cisco 09, Travostino 06]
  • Focus only on storage or ignore it completely [Bradford 07, Ruth 06]
• Need to support moving full VM state
  • Disk storage
  • Memory data
  • Processor state
  • All with minimal impact on application performance
VM Migration Procedure
[Figure: timeline of a VM migration from East Coast to West Coast, in three layers. Net: VPN Setup, ARP. Mem: Live Mem Transfer, Pause VM. Disk: Asynchronous Copy, Synchronous Copy.]
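The procedure can be written down as an ordered list of phases. This is a minimal sketch: the step names and the `run_migration` helper are hypothetical, and the exact ordering is an assumption read off the timeline figure (network setup first, bulk disk copy before memory transfer, ARP update last).

```python
# (layer, step) pairs in the assumed order of the timeline figure.
PHASES = [
    ("net", "vpn_setup"),           # connect source and destination sites
    ("disk", "asynchronous_copy"),  # bulk disk transfer while the VM runs
    ("disk", "synchronous_copy"),   # keep both disk copies consistent
    ("mem", "live_mem_transfer"),   # iterative memory copy while the VM runs
    ("mem", "pause_vm"),            # brief pause to send the final state
    ("net", "arp"),                 # redirect traffic to the new location
]


def run_migration(actions):
    """Run each phase in order; `actions` maps a step name to a callable."""
    log = []
    for layer, step in PHASES:
        actions.get(step, lambda: None)()  # steps without an action are skipped
        log.append((layer, step))
    return log


for layer, step in run_migration({}):
    print(f"{layer:>4}: {step}")
```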
Optimizing WAN Migration
• Redundancy Elimination: detect identical regions in memory or disk and only send once
[Figure: VM memory at the source and destination, each with a cache]
[Figure: redundancy (% of RAM, 0 to 70) split into zeroes and non-zero duplicates for Kernel Compile, TPC-W, and SPECjbb]
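Redundancy elimination can be sketched with a content-addressed cache on each side. This is an illustration under assumptions, not the CloudNet implementation: the function names, the choice of SHA-1 as fingerprint, the 4096-byte page size, and the message format are all hypothetical.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for this sketch


def encode_pages(pages):
    """Sender: emit full data the first time a page's content is seen, else a reference."""
    seen = set()
    messages = []
    for page in pages:
        digest = hashlib.sha1(page).digest()
        if digest in seen:
            messages.append(("ref", digest, None))   # duplicate: send fingerprint only
        else:
            seen.add(digest)
            messages.append(("data", digest, page))  # first occurrence: send contents
    return messages


def decode_pages(messages):
    """Receiver: rebuild the page sequence from data and reference messages."""
    cache = {}
    pages = []
    for kind, digest, payload in messages:
        if kind == "data":
            cache[digest] = payload
        pages.append(cache[digest])
    return pages


zero = bytes(PAGE_SIZE)
a = b"\x01" * PAGE_SIZE
b = b"\x02" * PAGE_SIZE
pages = [zero, a, zero, a, b]
messages = encode_pages(pages)
sent = sum(1 for kind, _, _ in messages if kind == "data")
print(f"{sent} of {len(pages)} pages sent in full")
```

Zero pages fall out as a special case here: every all-zero page after the first is sent as a reference.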
Deltas
• Page Deltas: only send delta for partially changed data blocks during the migration
[Figure: cached pages at the destination compared with the pages to send]
[Figure: page diff size histograms for Kernel Compile and TPC-W; frequency (0 to 40K) versus delta size (0 to 4000 B)]
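A page delta can be sketched as the list of byte runs that changed relative to the cached copy of the page. This is one simple encoding chosen for illustration; the function names and the (offset, bytes) format are assumptions and not necessarily the encoding CloudNet uses.

```python
def make_delta(old, new):
    """Return [(offset, changed_bytes), ...] for each run where `new` differs from `old`."""
    if len(old) != len(new):
        raise ValueError("pages must have the same size")
    delta = []
    start = None  # start offset of the current run of differing bytes
    for i in range(len(new)):
        if old[i] != new[i]:
            if start is None:
                start = i
        elif start is not None:
            delta.append((start, new[start:i]))
            start = None
    if start is not None:
        delta.append((start, new[start:]))
    return delta


def apply_delta(old, delta):
    """Receiver: patch the cached page with the changed runs."""
    page = bytearray(old)
    for offset, chunk in delta:
        page[offset:offset + len(chunk)] = chunk
    return bytes(page)


old = bytes(4096)
changed = bytearray(old)
changed[10:14] = b"abcd"
new = bytes(changed)
delta = make_delta(old, new)
payload = sum(len(chunk) for _, chunk in delta)
print(f"delta carries {payload} bytes instead of {len(new)}")
```

The sender only benefits when it still holds the previous version of the page, which is why the delta scheme relies on a cache.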
Smart Stop
• When to stop iterating?
  • Xen: when very few pages left or after 30 iterations
• Goals:
  • Minimize total migration time
  • Minimize pause time
• Iterate until Sent < Dirtied
  • Reduce total time
• Then, find local minimum for Dirtied
  • Reduce pause time
[Figure: number of pages sent and dirtied per iteration (iterations 1 to 8); off-scale values labeled 242,987 and 21,791]
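The two-phase stopping rule can be sketched as follows. The slide does not define "local minimum" precisely, so this sketch assumes one: after the Sent < Dirtied point, pause at the first iteration whose dirtied count is no larger than any of the previous `window` iterations. The function name, the `window` parameter, and the sample numbers are hypothetical; the 30-iteration cap mirrors the Xen default mentioned above.

```python
def smart_stop(history, window=3, max_iters=30):
    """history: list of (sent, dirtied) page counts per iteration, oldest first.

    Returns the 1-based iteration at which to pause the VM, or None if the
    history ends before a stopping point is found.
    """
    converged_at = None  # index of the first iteration with sent < dirtied
    for i, (sent, dirtied) in enumerate(history):
        iteration = i + 1
        if iteration >= max_iters:
            return iteration  # hard cap on iterations
        if converged_at is None:
            if sent < dirtied:
                converged_at = i  # further iterations no longer make progress
            continue
        if i - converged_at < window:
            continue  # not enough samples after convergence yet
        recent = [d for _, d in history[i - window:i]]
        if dirtied <= min(recent):
            return iteration  # assumed local minimum: pause here
    return None


# Illustrative numbers only, not measurements from the talk.
history = [(1000, 400), (400, 150), (150, 160), (160, 170),
           (170, 155), (155, 165), (165, 150), (150, 158)]
print(smart_stop(history))
```

Phase one bounds total migration time by not iterating once sending stops shrinking the dirty set; phase two picks a pause point with few pages left to send.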
Implementation
• Uses Xen for memory migration
  • Added in-memory cache for redundancy and deltas
  • Modified control algorithm for smart stop
• Disk migration based on DRBD
  • Has sync and async modes
  • Evaluated benefits of redundancy elimination with traces
• VPN controller manages Juniper M7i routers
  • Can be remotely configured
• Migration wrapper coordinates network, storage, and memory operations
CloudNet Testbed
• Testbed for exploring cloud services
• 3 sites spread across the US
  • Illinois, Texas, and California
  • Small cluster of servers at each site
• Can create multiple Virtual Cloud Pools (VCPs) with resources at each site
• Migrations performed over active AT&T network links
Eval: Cloud Burst
• From Richardson, TX to Chicago, IL
  • 465 Mbps link
  • 27 msec RTT
  • >1200 km distance
• Simultaneous migration of four VMs
  • 10 GB disk + 1.7 GB RAM per VM
• Total bandwidth consumption lowered from 37 GB to 18 GB
• Memory migration time reduced from 245 to 87 sec
• Downtime halved from 6 to 3 sec
[Figure: GB transferred (memory, disk, total), Default versus Optimized; map of the migration path]
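As a quick check of the relative savings, the before/after figures quoted on this slide can be turned into percentage reductions. The helper name `reduction` and the dictionary keys are only illustrative; the numbers are the ones stated above.

```python
def reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


results = {
    "data transferred (GB)": (37, 18),
    "memory migration time (s)": (245, 87),
    "downtime (s)": (6, 3),
}

for name, (before, after) in results.items():
    print(f"{name}: {before} -> {after} ({reduction(before, after):.1f}% lower)")
```

This works out to roughly 51% less data, 64% less memory migration time, and 50% less downtime.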
Eval: Application Performance
• CloudNet reduces migration time
  • Shorter period with lower application performance
• Synchronous disk replication reduces application performance if latency is high
[Figure: response time (msec) versus time (min) for Xen and CloudNet. Annotations: disk transfer 40 min; memory migration 210 s with default Xen, 115 s with CloudNet]
Eval: Network Impact
• Bandwidth: length of migration and pause time
[Figure: total time (sec) and pause time (sec) versus bandwidth (50, 100, 1000 Mbps) for Xen and CloudNet]
• Latency: application performance during migration
[Figure: total time (sec) and pause time (sec) versus latency (0 to 80 msec) for Xen and CloudNet]
Related Work
• Private Clouds & Virtual Networks
  • VIOLIN [Ruth-ICAC 06] and VIRTUOSO [Sundararaj-VM 04]
  • Amazon VPC
• Optimizing Migration
  • Compression [Jin-Cluster 09]
  • Model based [Breitgand-HotICE 11]
  • Deltas [Svard-previous talk]
  • Storage transfer [Zheng-next talk]
Conclusions
• CloudNet: end-to-end support for WAN migration
  • Network reconfiguration
  • Optimized memory and storage transfer
• Minimizes migration cost
  • Bandwidth: Eliminate redundant data
  • Time: Reduce unnecessary iterations
• Reduces application impact
  • Asynchronous bulk disk transfer
  • Minimize pause time
Questions?
Where to optimize?
• Can do redundancy elimination within the network
  • Riverbed, [SmartRE]
  • ...but migration data is often encrypted
• Our end-host based cache can be used for both RE and page deltas
[Figure: two hosts running the Xen hypervisor, each with a virtual machine and a Dom-0 holding a memory cache and a storage cache]