Highly resilient, multi-region Keystone deployments
Michael Richardson // 22 May 2018
Purpose and caveats
- SQL-backed deployment
- All regions are considered equal
- "Standard" regions, not singular edge nodes
[Diagrams: a single region (Region1) running Compute, Storage, Network, Images, Dashboard, Billing, VPNaaS, LBaaS, Orchestration, GPUs, CaaS, BYON and BYOIP; an Identity service is then added, and the same stack is replicated across Region2 and Region3.]
What if Keystone fails?
- No API requests
- No Dashboard
- No metrics collection
- No orchestration
- Data plane: AOK
Keystone (then)
- UWSGI + Nginx, HAProxy, Memcached
- MariaDB Galera cluster
- Gift-wrapped Python virtualenv
- Kilo, UUID tokens
[Diagram: requests from the Internet pass through external proxies to three API nodes, each paired with a Memcached node, then through internal proxies to three DB nodes.]
Requirements
- Loss of a region should not affect the operation of any other region
- A major partition must continue as before
- A minor partition may continue in read-only mode
- All user and project data should be available in each region
- Self-healing
Options
Multiple Keystones, or a single Keystone?
Federation: solves a different problem.
CockroachDB: too soon. http://lists.openstack.org/pipermail/openstack-dev/2017-May/117018.html
Master-slave replication: asynchronous.
Circular replication: ye olde multi-master.
Galera site-to-site replication: no benefit at this scale.
Design
- A single, inter-region Galera cluster with an odd number of nodes
- Nodes in all regions, providing data for all regions
- One Keystone database
- Synchronous replication; insert/update/lock overhead must be negligible
- Fernet tokens
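The odd total node count is about quorum: Galera stays in the primary component only while a strict majority of the last-known cluster is reachable. A quick sanity check (a sketch; the nine-node, three-region layout is an assumption for illustration):

```shell
# Hypothetical layout: 3 regions x 3 nodes = 9 Galera nodes.
total=9
lost=3                          # an entire region drops out
remaining=$((total - lost))

# Galera keeps quorum only with a strict majority (> 50%)
# of the previously known cluster size.
if [ $((2 * remaining)) -gt "$total" ]; then
    echo "quorum retained: ${remaining}/${total} nodes"
else
    echo "quorum lost: ${remaining}/${total} nodes"
fi
```

With nine nodes, losing a whole three-node region leaves six of nine, so the surviving majority continues as the primary component.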
Design
- Distinct authentication endpoints in each region, backed by the same DB cluster
- Other regions must be able to take over if all local nodes are down
- Frontend and backend proxy configuration
- Region-specific service (Nova, Neutron, etc.) DB clusters, but pointing to the inter-region Keystone service
Implementation
Keystone upgrade
- Upgrade the Keystone APIs from Kilo to Mitaka
- One release at a time, each with a micro (scheduled) outage
- Straightforward
Keystone configuration
- Multiple Keystone API nodes per region
- As before (Nginx + UWSGI)
- All sitting behind redundant HAProxy nodes
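On each API node, the Nginx-to-UWSGI front half might look roughly like this (a sketch; the socket path and listen port are assumptions, not the deployment's actual values):

```nginx
# /etc/nginx/sites-enabled/keystone (fragment; paths are hypothetical)
server {
    listen 5000;

    location / {
        include uwsgi_params;
        # Hand requests to the Keystone UWSGI worker over a local socket
        uwsgi_pass unix:/run/uwsgi/keystone-public.sock;
    }
}
```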
Multi-factor authentication
- Password plus TOTP
- Can be enabled per account by users
- Works with the APIs
- https://github.com/catalyst-cloud/adjutant-mfa
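With Keystone's v3 `totp` auth plugin, password-plus-TOTP is just two methods listed in one auth request. A sketch that only builds and prints the JSON body (the user ID, password and passcode are placeholders), which could then be POSTed to /v3/auth/tokens:

```shell
# Build a v3 auth request combining the password and totp methods.
# user_id and the credentials below are hypothetical placeholders.
user_id="ffff1111aaaa2222"
cat <<EOF
{
    "auth": {
        "identity": {
            "methods": ["password", "totp"],
            "password": {
                "user": {"id": "${user_id}", "password": "s3cret"}
            },
            "totp": {
                "user": {"id": "${user_id}", "passcode": "123456"}
            }
        }
    }
}
EOF
```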
What's inside?

curl "$keystone_endpoint:35357/v3/auth/tokens" \
    -H "X-Subject-Token: {{ fernet_token }}" \
    -H "X-Auth-Token: {{ admin_token }}" \
    | python -m json.tool
Inter-region Galera cluster
- Redundant links between regions
- Odd number of nodes per region
- wsrep_dirty_reads = 1
- wsrep_sync_wait = 0
- wsrep_slave_threads = "n. cores"
- max_connections = "high"
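Those tunables translate to a my.cnf fragment along these lines (a sketch; the thread and connection values are illustrative for an 8-core node, not recommendations):

```ini
# /etc/mysql/conf.d/galera.cnf fragment (values illustrative)
[mysqld]
wsrep_dirty_reads   = 1      # allow reads even while not in the primary component
wsrep_sync_wait     = 0      # do not make reads wait for replication to catch up
wsrep_slave_threads = 8      # roughly one applier thread per core
max_connections     = 4096   # "high": sized well above the expected peak
```

Relaxing `wsrep_sync_wait` trades strict read-your-writes causality for latency, which suits a read-heavy Keystone workload.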
Inter-node connectivity
- mysql: tcp/3306
- galera replication: tcp+udp/4567
- galera IST: tcp/4568
- galera SST: tcp/4444
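Expressed as firewall rules, in iptables-restore style (a sketch; the 10.0.0.0/8 inter-node network is an assumption):

```text
# /etc/iptables/rules.v4 fragment (source network is hypothetical)
-A INPUT -s 10.0.0.0/8 -p tcp --dport 3306 -j ACCEPT   # mysql
-A INPUT -s 10.0.0.0/8 -p tcp --dport 4567 -j ACCEPT   # galera replication
-A INPUT -s 10.0.0.0/8 -p udp --dport 4567 -j ACCEPT   # galera replication
-A INPUT -s 10.0.0.0/8 -p tcp --dport 4568 -j ACCEPT   # galera IST (incremental state transfer)
-A INPUT -s 10.0.0.0/8 -p tcp --dport 4444 -j ACCEPT   # galera SST (full state transfer)
```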
Testing: lock/load test across all regions (pseudo-code)

while true; do
    check_cluster_size_and_status
    for source in A B C; do
        perform_locking_operation "${source}"
        for region in A B C; do
            read_and_verify "${region}"
        done
    done
done
Memcached configuration
- One thread per core
- 512 MB for storage
- Objects cached for 60 minutes
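On a Debian-style install these settings map onto /etc/memcached.conf roughly as follows (a sketch; the 4-core node is an assumption). Note the 60-minute object lifetime is set client-side (e.g. Keystone's [cache] expiration_time), not in memcached itself:

```text
# /etc/memcached.conf fragment (4-core node assumed)
-t 4        # one worker thread per core
-m 512      # 512 MB for object storage
```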
Proxies and endpoints: final cutover
- Internal and external proxies updated; migrated to the new cluster
- Endpoints in each region were now active, but not advertised
- Services in each region were re-pointed to the local endpoint
- Additional endpoints then added to the catalog
- Endpoints made "unversioned"
Proxies and endpoints: external proxy configuration
- Local endpoints use the remote regions' endpoints as backup servers
- Allows the other regions to take over
- Transparent failover
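In HAProxy terms this is the `backup` keyword: the local Keystone API nodes are the primary servers, and the remote regions' endpoints receive traffic only once every local server fails its health check. A sketch (backend name and hostnames are hypothetical):

```text
# haproxy.cfg fragment (hostnames are hypothetical)
backend keystone_public
    option httpchk GET /v3
    server local-api1 api1.region1.example.com:5000 check
    server local-api2 api2.region1.example.com:5000 check
    # Remote regions only take over when all local servers are down
    server remote-r2  keystone.region2.example.com:5000 check backup
    server remote-r3  keystone.region3.example.com:5000 check backup
```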
[Diagrams: the final topology, mirrored across three regions — in each region the Internet reaches the external proxies, then the API nodes with their Memcached instances, then the internal proxies and DB nodes, with the DB nodes of all regions forming a single cluster.]
Where to from here?
- Global endpoints
- MFA in/from master
- Keystone from $release n-1
Summary
- Keystone: Fernet tokens with caching
- Galera: a single DB, geo-distributed, with redundant paths
Summary
- External proxies: other regions must be present as backup servers
- Keepalived: DNS round-robin between multiple VRRP addresses; configure each address with one master, the other nodes as backup
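One way to realise the keepalived arrangement: each proxy node runs several VRRP instances, acting as master for one address and backup for the rest, while DNS round-robins across all the addresses. A keepalived.conf sketch for one node (addresses, interface and router IDs are hypothetical):

```text
# keepalived.conf fragment for proxy node 1 (values hypothetical)
vrrp_instance VI_1 {
    state MASTER                # this node normally owns .11
    interface eth0
    virtual_router_id 51
    priority 150
    virtual_ipaddress { 203.0.113.11 }
}

vrrp_instance VI_2 {
    state BACKUP                # another node masters .12
    interface eth0
    virtual_router_id 52
    priority 100
    virtual_ipaddress { 203.0.113.12 }
}
```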
Thank you