CUSTOMER WHO?
  Sten Spans
  Schuberg Philis
  @sspans (github, etc)
TOPIC
  Going from 100 to 10000 systems
  Orchestrating a Zone
  Not Google-scale
WHY?
  New Zone
  Rethink principles
  Automate
  Comments on CentOS 7/KVM
  Conceptual or Technical?
WHAT?
SUDO MAKE CLOUD
  Networking
  Hypervisors
  Storage
  Orchestration
TOYS
  Source: https://www.flickr.com/photos/rfc1036/406675831/
STAFF
GOAL
GOAL
CLOUDY
  https://www.flickr.com/photos/versageek/493800514
MISTAKES
  Artisanal / Pets
  Network not Scalable / Redundant
  Stretching Failure-domains
  Other technical downsides
  Lack of Automation
WHAT IS ARTISANAL?
  People tracking MAC addresses
  Tweaking settings for each system
  Multiple sources of truth
  Validation / Acceptance test
  Naming - individual servers
NAMING?
  Impacts automation
  Impacts labeling
  Impacts replacements
  Go for location-based identities!
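One way to picture a location-based identity is a hostname derived purely from where the machine sits. The sketch below assumes a hypothetical zone/pod/rack/unit scheme; the `Location` type, the field names, and the name format are illustrative and not from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    zone: str  # hypothetical zone label, must not contain "-"
    pod: int
    rack: int
    unit: int  # rack unit the chassis occupies


def hostname(loc: Location) -> str:
    """Build a name that encodes position, not purpose."""
    return f"{loc.zone}-p{loc.pod:02d}-r{loc.rack:02d}-u{loc.unit:02d}"


def parse_hostname(name: str) -> Location:
    """Recover the physical location from a name built by hostname()."""
    zone, pod, rack, unit = name.split("-")
    return Location(zone, int(pod[1:]), int(rack[1:]), int(unit[1:]))
```

With a scheme like this, a replacement machine in the same slot gets the same name, so labels and automation do not need to change.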
NETWORKING?
  Large layer 2 domains
  Sharing networks between zones
  Manual configuration
  Not redundant (enough)?
  Or more failures due to redundancy?
FAILURE DOMAINS
  Do you really want twin-datacenter?
  Clustering is complicated…
  Way more complicated failures…
  Have you actually tested failures?
GOAL
  Manage zone as one unit
  Capture design / logic in config-management
  Versioned Iterations
  Think about naming
  Think about how you identify hosts
  Simplify…
GOAL
  Stop managing individual servers (cattle)
  Stop being Artisanal
  Start scaling
  Start Orchestrating
  Think Terraform/CloudFormation/Heat
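The Terraform/CloudFormation/Heat idea can be sketched as: describe the zone declaratively, expand it into the desired set of hosts, and diff that against what exists. This is a minimal illustration only; `expand_zone`, `plan`, and the host name format are hypothetical and do not reflect how any of those tools is implemented.

```python
from typing import Dict, List, Set


def expand_zone(zone: str, pods: int, hosts_per_pod: int) -> Set[str]:
    """Expand a compact zone description into the full set of desired hosts."""
    return {
        f"{zone}-pod{p}-hv{h}"
        for p in range(1, pods + 1)
        for h in range(1, hosts_per_pod + 1)
    }


def plan(desired: Set[str], actual: Set[str]) -> Dict[str, List[str]]:
    """Compute what to add and remove so that actual matches desired."""
    return {
        "add": sorted(desired - actual),
        "remove": sorted(actual - desired),
    }
```

The point is that nobody edits an individual server: changing the zone description and re-running the plan is the only workflow.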
BUILDING BLOCKS
  Isolated Networking
  Isolated Pods
  Worry-free Storage
  Optional: Dedicated SDN Clusters
  Fully orchestrated zones
BOOTSTRAP NETWORK CORE
  Core Switches
  LoM switch
  Hypervisors
  SDN?
CORE SWITCHES
  Linux based
  Bootstrap via DHCP/HTTP
  Chef/Ansible/Puppet supported!
  Capture design in cookbooks/playbooks
  Can run additional services
SDN
  Cluster per (availability) Zone
  Failure Domain
  Features vs. Lock-in
  Complicated? Expensive?
  Accept tunnels between zones
  Customers will accept trade-offs!
BOOTSTRAP A POD
  ToR Switch Pair
  LoM switch
  Hypervisors
  Storage
TOR SWITCHES
  Linux based
  Bootstrap via DHCP/HTTP
  Chef/Ansible/Puppet supported!
  Capture design in cookbooks/playbooks
  Can run DHCP/DNS per Pod
  Move pod services into the Pod
LOM SWITCHES
  Can bootstrap via ToR switch
  Config via ToR
  Manage iLOs via DHCP Hooks
  Would love a Linux box here too
HYPERVISORS
  Linux based
  Automated Firmware Updates
  Bootstrap via DHCP/HTTP
  HTTP Bootstrap via Chef
  TFTP Proxy on ToR
  Location-based DHCP (Option 82)
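Location-based DHCP relies on the relay agent information option (Option 82, RFC 3046), whose payload is a sequence of code/length/value sub-options. The sketch below parses that payload and combines the remote ID and circuit ID into a location key. The encoding of the IDs is vendor-specific, so the ASCII switch and port names here are assumptions, as are the function names.

```python
from typing import Dict

CIRCUIT_ID = 1  # RFC 3046 sub-option 1: Agent Circuit ID
REMOTE_ID = 2   # RFC 3046 sub-option 2: Agent Remote ID


def parse_option82(payload: bytes) -> Dict[int, bytes]:
    """Split an Option 82 payload into {sub-option code: raw value}."""
    subopts: Dict[int, bytes] = {}
    i = 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated sub-option header")
        code, length = payload[i], payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        subopts[code] = value
        i += 2 + length
    return subopts


def location_key(payload: bytes) -> str:
    """Combine remote ID (which switch) and circuit ID (which port)."""
    sub = parse_option82(payload)
    switch = sub[REMOTE_ID].decode("ascii")  # assumption: ASCII switch name
    port = sub[CIRCUIT_ID].decode("ascii")   # assumption: ASCII port name
    return f"{switch}/{port}"
```

A key such as `tor1/swp12` identifies the switch port a machine is plugged into, which is enough to assign an identity without tracking MAC addresses.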
HYPERVISOR HARDWARE
  Machines are extremely scalable
  Calculate cost per VM
  Waiting for 25G Ethernet
  Has anybody solved EFI PXE?
  Please?
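A cost-per-VM calculation can be as simple as spreading the hardware price over its lifetime, adding running costs, and dividing by VM density. The model and every parameter below are assumptions for illustration; the talk gives no figures.

```python
def cost_per_vm(
    hardware_cost: float,    # purchase price of one hypervisor
    lifetime_months: int,    # depreciation period
    monthly_opex: float,     # power, rack space, support per host per month
    vms_per_host: int,       # VM density you expect to reach
) -> float:
    """Monthly cost of one VM under a simple linear depreciation model."""
    monthly_host_cost = hardware_cost / lifetime_months + monthly_opex
    return monthly_host_cost / vms_per_host
```

For example, with made-up inputs, `cost_per_vm(12000, 48, 150, 40)` evaluates to `10.0` per VM per month, and doubling the density halves it.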
PROVISIONING
  Bootstrap via DHCP/HTTP
  Nekopan - Golang webserver
  Interfaces with Chef (or Ansible/Puppet)
STORAGE
  Stable NFS – For now…
  API Driven
  No fancy replication / clustering
DONE?
  Let’s add all of this to CloudStack…
CLOUDSTACK
  SDN providers need work
  cloudstack-setup-agent is … horrible
  RouterVM/SystemVM
  Small networking issues
  And I bet there is more…
THE HORROR:
  Really? WTF?
WHAT IS GOING ON?
  All Ubuntu is the same…
  Fedora == Red Hat 6
  CentOS == Red Hat 5
  Or you may have Red Hat 7
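Rather than mapping distributions onto each other, a setup tool could read the distribution ID and version from `/etc/os-release`, which CentOS 7 and other systemd-based distributions provide. The following is a minimal sketch that parses the file's contents as text; the function names are hypothetical, and it ignores the shell-style escaping the format allows.

```python
from typing import Dict, Tuple


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content into a dict."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def distro(text: str) -> Tuple[str, str]:
    """Return (ID, VERSION_ID), with fallbacks when a field is missing."""
    info = parse_os_release(text)
    return info.get("ID", "unknown"), info.get("VERSION_ID", "")
```

Branching on the pair `("centos", "7")` keeps workarounds for older releases from being applied to newer ones.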
RESULTS ON CENTOS 7
  SELinux is disabled (revert broken)
  Firewall changes don’t work for firewalld
  Cgroup changes are not that cool really
  Workarounds for old bugs result in breakage on newer systems
  So I reinstalled the box
CENTOS 7 STATUS
  SELinux seems to work
  Labeled NFS is still bleeding edge
  No need to mess with cgroups
  Firewalld is pretty nice really
  CloudStack should perhaps audit the config
  But please don’t change it…
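"Audit, don't change" could look like a read-only check that reports findings and leaves the system alone. The sketch below inspects the text of `/etc/selinux/config`; the function names and the wording of the finding are hypothetical and not part of CloudStack.

```python
from typing import List


def selinux_mode(config_text: str) -> str:
    """Return the SELINUX= setting from the config text, or 'unknown'."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("SELINUX="):
            return line.split("=", 1)[1].strip().lower()
    return "unknown"


def audit(config_text: str, expected: str = "enforcing") -> List[str]:
    """Report mismatches as findings; never modify anything."""
    mode = selinux_mode(config_text)
    if mode == expected:
        return []
    return [f"SELinux mode is {mode!r}, expected {expected!r}; not changing it"]
```

An empty list means the host passes; anything else is left for the operator or config management to resolve.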
ROUTERVM
  We run Ansible to hotfix/manage RouterVMs
  But IP / kernel command line not available on KVM :(
  Qemu-guest-agent solves that and more…
  LibVMI – not sure
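With qemu-guest-agent running in the guest, the host can query it through libvirt, for instance with `virsh qemu-agent-command` and the `guest-network-get-interfaces` command. The helper below only builds the command line and does not run it; the helper name and the domain name in the usage note are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def guest_agent_argv(
    domain: str,
    command: str,
    arguments: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Build a virsh invocation that sends one guest agent command."""
    payload: Dict[str, Any] = {"execute": command}
    if arguments:
        payload["arguments"] = arguments
    return ["virsh", "qemu-agent-command", domain, json.dumps(payload)]
```

Passing the result of `guest_agent_argv("r-42-VM", "guest-network-get-interfaces")` to `subprocess.run` on the hypervisor would ask the guest for its interfaces and addresses, which is the information the kernel command line does not provide on KVM.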