Follow the Sun through the Clouds: Application Migration for Geographically Shifting Workloads
Robbert van Renesse, Cornell University
Joint work with Zhiming Shen, Qin Jia, Gur-Eyal Sela, Ben Rainero, Weijia Song, Hakim Weatherspoon

Infrastructure as a Service (IaaS) Clouds
• Offer on-demand virtual machines (VMs)
• Pay-as-you-go: charged by the hours used
• Provide useful services such as auto-scaling and failure recovery

Handling Geographically Shifting Workloads
Follow the sun

Handling Geographically Shifting Workloads
Follow the sun
• Lack of homogeneous interface
• Lack of privileged control
• Lack of infrastructure support
• Lack of common resource management

Supercloud Overview
• Application migration as a service across cloud providers and availability zones
• Support ALL major virtualization platforms and ALL major public cloud providers
• Live migration without changing IP addresses or breaking TCP connections
• Automatic scheduling framework
• Optimize metrics such as average perceived latency
• Provide cross-cloud storage and networking solution

Supercloud Overview
• Computation
  • Nested hypervisor: Xen-Blanket
  • Support all major platforms
• Network
  • SDN overlay
  • Support migration with public IP
• Storage:
  • Geo-replicated storage
  • Optimized for serving VM images
• Resource management
  • OpenStack platform
[Figure: Google Compute Engine, Cornell Red Cloud, and Microsoft Azure each host user VMs on a second layer (OpenStack, Xen-Blanket) above a first layer (labels: Xen/PV-on-HVM, KVM/virtio, Xen/Hyper-V), joined by a Software Defined Network (SDN) and a Geo-replicated Image Store]

Nested Virtualization
Xen-Blanket
• Second Layer Hypervisor
• Uniformity
[Figure: guest VMs (Dom0, DomU) on Xen-Blanket from the second-layer provider, which runs on Hyper-V, Xen, or KVM from the first-layer provider]

Supercloud Networking
• Goal:
  • Inter-connection
  • Optimized routing
  • Supporting migration
• VPN overlay
  • Full-mesh tunnels
• Frenetic SDN controller
  • Transparent VM migration
  • Public IP address support
[Figure: VMs in Cloud 1 and Cloud 2 connected through a mesh of vSwitches]

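The "full-mesh tunnels" bullet can be illustrated with a minimal sketch. It only enumerates which vSwitch pairs would need a tunnel; it does not configure any real switch or SDN controller. The switch names are hypothetical and do not come from the slides.

```python
from itertools import combinations


def full_mesh_tunnels(switches):
    """Return one tunnel (an unordered pair) for every pair of vSwitches."""
    return list(combinations(sorted(switches), 2))


# Hypothetical vSwitch names; the slides do not name individual switches.
switches = ["cloud1-a", "cloud1-b", "cloud2-a", "cloud2-b"]
tunnels = full_mesh_tunnels(switches)

# A full mesh over n switches needs n * (n - 1) / 2 tunnels.
assert len(tunnels) == len(switches) * (len(switches) - 1) // 2
for a, b in tunnels:
    print(f"{a} <-> {b}")
```

The quadratic growth in tunnel count is the usual cost of a full mesh. In exchange, every pair of switches has a direct path, which leaves routing choices to the controller.
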
VM Migration with Public IP Address
[Figure: VM labeled 54.172.26.213; one public IP front-end labeled 54.172.26.213]

VM Migration with Public IP Address
[Figure: VM labeled 54.172.26.213; two public IP front-ends, one labeled 52.69.94.195]

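One possible reading of these two diagrams is that a front-end owns the public IP and forwards traffic over the overlay to wherever the VM currently runs, so migration changes the VM's location but not the address clients use. The toy model below sketches that reading only; the `Overlay` class, its methods, and the cloud names are hypothetical and are not the Supercloud implementation.

```python
class Overlay:
    """Toy model: front-ends own public IPs and forward to a VM by name."""

    def __init__(self):
        self.vm_location = {}  # VM name -> cloud currently hosting it
        self.front_ends = {}   # public IP -> (cloud of the front-end, VM name)

    def add_front_end(self, public_ip, cloud, vm):
        self.front_ends[public_ip] = (cloud, vm)

    def place(self, vm, cloud):
        # Used both for initial placement and for migration; the
        # front-end table is deliberately left untouched.
        self.vm_location[vm] = cloud

    def route(self, public_ip):
        """Return (cloud where traffic enters, cloud where the VM runs)."""
        fe_cloud, vm = self.front_ends[public_ip]
        return fe_cloud, self.vm_location[vm]


overlay = Overlay()
overlay.place("vm1", "cloud-A")                        # hypothetical cloud names
overlay.add_front_end("54.172.26.213", "cloud-A", "vm1")
overlay.place("vm1", "cloud-B")                        # migrate the VM
print(overlay.route("54.172.26.213"))                  # old address still reaches the VM

# Assumption: a second front-end near the new location offers a shorter path.
overlay.add_front_end("52.69.94.195", "cloud-B", "vm1")
print(overlay.route("52.69.94.195"))
```
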
Centralized VM Image Storage
• Long latency; low throughput
[Figure labels: VM, Image]

Geo-Replicated VM Image Storage
Challenges:
• Strong consistency requirement
• Long latency and low throughput in WAN
[Figure labels: VM, Image, Image]

Decouple Consistency and Data Propagation
• Version number
• Location of the latest block
• Local version number
[Figure: Cloud 1 and Cloud 2 each run VMs over NFS/iSCSI. Labels: Data View, Global Meta-Data, Consistency Layer, Propagation Manager, Local Meta-Data, Propagation Layer, Data Store, Back-End Storage. Annotations: strong consistency, eventual consistency, on-demand fetch, pro-active data propagation]

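The split between meta-data and data can be sketched as follows. The sketch assumes the global meta-data records, per block, a version number and the site holding the latest copy, while each site tracks only a local version number. All class and method names are hypothetical, and pro-active propagation is left out.

```python
class Site:
    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> (local version, data)


class ImageStore:
    def __init__(self, sites):
        self.sites = {s.name: s for s in sites}
        # Small, strongly consistent part:
        # block id -> (latest version, site holding the latest copy)
        self.global_meta = {}

    def write(self, site_name, block, data):
        version = self.global_meta.get(block, (0, None))[0] + 1
        self.sites[site_name].blocks[block] = (version, data)
        self.global_meta[block] = (version, site_name)  # no bulk data moves here

    def read(self, site_name, block):
        latest, owner = self.global_meta[block]
        site = self.sites[site_name]
        local_version, _ = site.blocks.get(block, (0, None))
        if local_version < latest:
            # Stale or missing locally: on-demand fetch from the owner site.
            site.blocks[block] = self.sites[owner].blocks[block]
        return site.blocks[block][1]


store = ImageStore([Site("cloud1"), Site("cloud2")])
store.write("cloud1", "blk0", b"v1")
print(store.read("cloud2", "blk0"))  # fetched on demand from cloud1
```

Only the meta-data needs a consistent view across the WAN. Block contents can lag behind and are pulled when a reader finds its local version older than the global one.
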
Global Meta-Data Propagation
• Challenge:
  • Long latency
• Observation:
  • Single writer
  • No read-write sharing
• Relaxed consistency model
  • Close-to-open consistency
• Propagation policy
  • Commit locally
  • Flush to centralized controller when closing
[Figure: Cloud 1 and Cloud 2 each run VMs over NFS/iSCSI. Labels: Data View, Global Meta-Data, Controller, Propagation Manager, Local Meta-Data, Data Store, Back-End Storage. Annotation: strong consistency]

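The propagation policy above (commit locally, flush on close) can be sketched in a few lines. It relies on the slide's single-writer observation, so concurrent writers are not handled. `Controller` and `OpenImage` are hypothetical names used only for illustration.

```python
class Controller:
    """Stand-in for the centralized meta-data controller."""

    def __init__(self):
        self.meta = {}  # block id -> version


class OpenImage:
    """One open session on a VM image, with close-to-open semantics."""

    def __init__(self, controller):
        self.controller = controller
        self.view = dict(controller.meta)  # snapshot taken at open time
        self.pending = {}                  # updates committed locally only

    def write(self, block):
        current = self.pending.get(block, self.view.get(block, 0))
        self.pending[block] = current + 1  # no WAN round trip per write

    def close(self):
        # A single flush to the controller when the image is closed.
        self.controller.meta.update(self.pending)
        self.pending.clear()


controller = Controller()
writer = OpenImage(controller)
writer.write("blk0")
writer.write("blk0")
print(controller.meta)   # still empty: nothing flushed before close
writer.close()
reader = OpenImage(controller)
print(reader.view)       # a later open sees the flushed versions
```

A session opened after the writer closes sees every update. A session opened earlier may not, which is the relaxation that close-to-open consistency accepts in exchange for avoiding a wide-area round trip per write.
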
Evaluation: ZooKeeper Migration
• Application level vs. VM level migration

| | ZooKeeper Dynamic Reconfiguration | Supercloud VM migration |
| Code complexity | Add/remove nodes: 6700+ lines of code change; leader rotation: not supported yet | No code change |
| Transparency | Clients need to be notified | Completely transparent |
| Performance | Several seconds of downtime due to state synchronization and leader election | Little performance impact |

Comparing ZooKeeper Migration Mechanisms
• Initially: Asia 1, US 2
• 2-step reconfiguration:
  • Asia +1, US -1
• 3-step reconfiguration:
  • Asia +2, US -2
  • Asia -1, US +1
• Supercloud
  • Migrate the leader from US to Asia
[Figure annotations: "Leader is separated from the majority"; "20-second performance degradation"]

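To make the "+1 / -1" notation concrete, the sketch below tracks how many ensemble members sit in each region and which region holds a majority. It is plain arithmetic on the counts from the slide and does not use any ZooKeeper API. The assumption that the leader stays in the US after reconfiguration is ours, made to illustrate the "leader is separated from the majority" annotation.

```python
def apply_step(ensemble, delta):
    """Apply per-region membership changes such as {'Asia': +1, 'US': -1}."""
    regions = set(ensemble) | set(delta)
    return {r: ensemble.get(r, 0) + delta.get(r, 0) for r in sorted(regions)}


def majority_region(ensemble):
    """Return the region holding a quorum on its own, or None."""
    quorum = sum(ensemble.values()) // 2 + 1
    for region, count in ensemble.items():
        if count >= quorum:
            return region
    return None


initial = {"Asia": 1, "US": 2}

# Net effect of the 2-step reconfiguration from the slide.
two_step = apply_step(initial, {"Asia": +1, "US": -1})

# The two listed changes of the 3-step reconfiguration.
intermediate = apply_step(initial, {"Asia": +2, "US": -2})
three_step = apply_step(intermediate, {"Asia": -1, "US": +1})

leader_region = "US"  # assumption: reconfiguration alone does not move the leader
print(two_step, majority_region(two_step))
print("leader separated from majority:", leader_region != majority_region(two_step))
print(three_step == two_step)
```
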
Follow the Sun
• Experimental Setup
  • Global ZooKeeper deployment in US and Asia
  • MSN trace
• Comparing different deployments
  • US Ensemble: all ZooKeeper nodes in the US
  • Global Ensemble: majority in US, one node in Asia
  • Dynamic Ensemble: using Supercloud VM migration

Follow the Sun

Supercloud Scheduler
• Decides placement and migration automatically
• Requires run-time monitoring and performance models for cloud resources

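A placement decision of this kind could, for example, minimize the average latency perceived by clients. The sketch below picks the site with the lowest client-weighted latency. The site names, client counts, and latency figures are made up for illustration, and the real scheduler's performance models are not shown in the slides.

```python
def average_latency(site, clients, latency_ms):
    """Client-weighted mean latency if the service runs at `site`."""
    total = sum(clients.values())  # assumes at least one client
    return sum(n * latency_ms[region][site] for region, n in clients.items()) / total


def best_site(sites, clients, latency_ms):
    return min(sites, key=lambda s: average_latency(s, clients, latency_ms))


# Hypothetical monitoring data: latency from each client region to each site.
latency_ms = {
    "us": {"us-east": 20, "asia-east": 180},
    "asia": {"us-east": 180, "asia-east": 20},
}
sites = ["us-east", "asia-east"]

us_daytime = {"us": 900, "asia": 100}     # hypothetical active-client counts
asia_daytime = {"us": 100, "asia": 900}

print(best_site(sites, us_daytime, latency_ms))
print(best_site(sites, asia_daytime, latency_ms))
```

Re-running `best_site` as the client mix shifts over the day gives the migration target. A real scheduler would also weigh migration cost against the expected gain.
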
Memory Performance Measurements / Anomalies

Partners in crime
• NIST ANTD (Advanced Network Technologies Division): Monitoring and Security
  • Abdella Battou
  • Fred de Vaulx
  • Lotfi Benmohamed
  • Charif Mahmoudi
• Cornell Aristotle Project and XSEDE: Academic cloud sharing and bursting
  • David Lifka (Cornell CIO)
  • …

Conclusion
• Supercloud: application migration for geographically shifting workloads
  • Crossing heterogeneous cloud providers
  • Automatic scheduling
  • Geo-replicated image storage
  • Wide-area SDN
• Visit our workshop tomorrow morning (Thursday)
  • We’ll also present exciting cloud performance comparison studies
• More at http://supercloud.cs.cornell.edu
Thank You. Questions?