Rule breaker! Upgrading an OpenStack Cloud while skipping a release • Rick Salevsky, SUSE Cloud Engineer, rsalevsky@suse.com • Nanuk Krinner, SUSE Cloud Engineer, nkrinner@suse.com
Introduction
Speakers • Rick Salevsky – SUSE Cloud Engineer – Focus: deployment solutions • Nanuk Krinner – Cloud developer at SUSE – Systems Management Engineer • SUSE OpenStack Cloud
Agenda • Why and What • Upgrade strategies • How we skipped a release • Where we want to go
Why and What
Why upgrade? • Security fixes • Stability improvements • Performance improvements • Closely follow upstream development • New features
Problems while upgrading? • Downtime • Preparation • Testing • Adapting workflows • Bugs • Data loss
Customer demands • Reduced downtime • Live upgrade • Possible to roll back • Clear documentation of what is happening • Upgrading while skipping one or more releases
Upgrade marathon • Release • Evaluating the release • Planning the upgrade • Testing the upgrade • Upgrade • Fine tuning • Integrating new features
Upgrade marathon • Release • Evaluating the release • Planning the upgrade • Testing the upgrade • Upgrade • Fine tuning • Integrating new features • Maybe not this time?
No upgrade at all • OpenStack User Survey April 2016
Upgrade strategies
Official upgrade process • Always upgrade to next release • Release cycle of 6 months • Upgrades are required regularly • High maintenance cost – Unexpected changes break upgrade – Lots of manual effort – Suffer the upgrade pain regularly – Staffing Source: http://www.openstack.org/brand/openstack-logo/logo-download/
Continuous Deployment • Risky • Needs a lot of development manpower • Extensive testing required • Always latest and greatest • No big upgrade, incremental changes • Not enterprise ready Source: https://wiki.jenkins-ci.org/display/JENKINS/Logo
Start from scratch • Roll out a fresh deployment • Lots of duplicated work – Set up projects, users, images… • Get rid of outdated artifacts • Run a parallel installation – Move workload from old cloud to new cloud • Redundant hardware required
Many self-made solutions • Own deployment solutions • Scenario-specific solutions
How we skip a release
High level overview • Upgrading from Juno to Liberty • Multistep process • Cloud is not fully functional during the upgrade • Upgrading OS along with OpenStack
The idea • Orchestrated reinstallation • Ignoring OpenStack Kilo release • Migration handling • Config file management • Still causes downtime and is disruptive • No extra hardware required
Requirements • Orchestration mechanism • Configuration management • New OpenStack packages are available • Enough disk space on Controller – Duplicated database • Shared disk for nova-compute data
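The disk space requirement above can be checked before the upgrade starts. The following is a minimal sketch, not the tooling used in the talk: the function names and the `headroom` safety factor are hypothetical, and the database size is assumed to be measured separately by the operator.

```python
import shutil


def has_room_for_db_copy(db_size_bytes, free_bytes, headroom=1.2):
    """Return True if the free space can hold one extra copy of the database.

    headroom is a hypothetical safety factor, not a value from the talk.
    """
    return free_bytes >= db_size_bytes * headroom


def check_controller_disk(path, db_size_bytes):
    """Check the filesystem that contains `path` on the controller node."""
    free = shutil.disk_usage(path).free
    return has_room_for_db_copy(db_size_bytes, free)
```

Running `check_controller_disk` against the directory that will hold the duplicated database, before any service is shut down, lets the upgrade abort early instead of failing halfway through the restore.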
Preparation • Stop configuration management • Update OpenStack configs • Check which migrations are needed
Backup data • Disable (not stop) all OpenStack services • Shut down OpenStack on non-DB nodes • Back up OpenStack database • Back up other important data if wanted • Finalize OpenStack shutdown
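The database backup step could look roughly like the sketch below. It assumes a MySQL or MariaDB backend, which the slides do not state, and that credentials come from an option file. The helper names and the output path are hypothetical. By default it only prints the command instead of running it.

```python
import subprocess


def mysqldump_command(output_file):
    """Build a mysqldump call that dumps every database into one file."""
    return ["mysqldump", "--all-databases", "--result-file=" + output_file]


def backup_database(output_file, dry_run=True):
    """Print the backup command, or execute it when dry_run is False."""
    cmd = mysqldump_command(output_file)
    if dry_run:
        print(" ".join(cmd))
    else:
        # check=True raises CalledProcessError if the dump fails.
        subprocess.run(cmd, check=True)
    return cmd
```

Because the slides shut OpenStack down on the non-DB nodes before the backup, the dump is taken while no service is writing to the database.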
Set up new OpenStack Cloud • Reinstall nodes with new OS if required • Install new OpenStack packages • Start configuration management • Start database service • Restore backed up data
Migrating OpenStack Services • Run all migrations as documented for an upgrade • Special commands need porting • Juno to Liberty exceptions – Nova
Migrating Nova Service • Migrate to last Kilo migration level – Last Kilo migration = 290 – ‘nova-manage db sync --version 290’ • Migrate flavor data – Porting from Kilo to Liberty was required – ‘nova-manage db migrate_flavor_data’ • Migrate to Liberty migration level – ‘nova-manage db sync’
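The three Nova steps above can be sketched as an ordered command list. The commands and the migration number 290 are taken from the slide. The Python wrapper, its function names, and the dry-run default are assumptions for illustration. Note that the slide says `migrate_flavor_data` had to be ported from Kilo to Liberty, so this sequence presumably only works with packages that carry that port.

```python
import subprocess

LAST_KILO_MIGRATION = 290  # last Kilo migration level, as stated on the slide


def nova_skip_kilo_commands(last_kilo=LAST_KILO_MIGRATION):
    """Return the Nova migration commands in the order given on the slide."""
    return [
        # 1. Bring the schema to the last Kilo migration level.
        ["nova-manage", "db", "sync", "--version", str(last_kilo)],
        # 2. Run the Kilo flavor data migration (ported to Liberty).
        ["nova-manage", "db", "migrate_flavor_data"],
        # 3. Apply the remaining migrations up to Liberty.
        ["nova-manage", "db", "sync"],
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it and stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_all(nova_skip_kilo_commands())` prints the sequence for review, and passing `dry_run=False` executes it on a node where `nova-manage` is installed.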
Finalizing Upgrade • Start all OpenStack services • Check if everything is running
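Taken together, the slides from Preparation to Finalizing Upgrade describe a fixed sequence of phases. The sketch below shows one way an orchestrator might enforce that order and halt on the first failure. It is an illustration only: the phase names come from the slide titles, while `run_phases` and the callable-per-phase design are hypothetical.

```python
PHASES = [
    "Preparation",
    "Backup data",
    "Setup new OpenStack Cloud",
    "Migrating OpenStack Services",
    "Finalizing Upgrade",
]


def run_phases(phases, actions):
    """Run each phase in order and stop at the first one that fails.

    actions maps a phase name to a callable returning True on success.
    Returns the list of phases that completed.
    """
    completed = []
    for name in phases:
        if not actions[name]():
            break
        completed.append(name)
    return completed
```

For example, `run_phases(PHASES, actions)` returns all five names when every action succeeds. A shorter list tells the operator which phase failed and therefore where recovery has to start.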
Issues • Configuration file migration • Migrations • All or nothing • Predefined upgrades
Do’s • Backups • Test new configurations
Don’ts • Hope everything runs
Do’s and Don’ts
Where we want to go
Outlook • Seamless upgrade • No downtime of important services • Reverting upgrades • Better orchestration
Call for Action • Config files (automatically) upgradeable • Uniform configuration files • Migrating existing data from every point • Rollback option • Non-disruptive upgrades • Integrating oslo versioned objects in every project
Questions? Thank you.
Real world issues • Skipped or delayed upgrades • Small operator teams • Downstream code changes • Ignoring recommended path • Services depending on the cloud Source: OpenStack User Survey, April 2016