Scenario: Data inconsistency
● If there are multiple nodes in the minority group, identify the node that has the latest data.
● Set pc.bootstrap=1 on the selected node. A single-node cluster is formed (a sketch of this step follows below).
● Boot the remaining nodes (the former majority group); they will join through SST.
CLUSTER RESTORED
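A minimal sketch of the bootstrap step above, assuming a mysql client session on the selected node; pc.bootstrap is a dynamically settable Galera provider option:

    -- Promote the selected node to a primary component of its own
    SET GLOBAL wsrep_provider_options = 'pc.bootstrap=1';

    -- Verify: the node should now report a primary, single-node cluster
    SHOW STATUS LIKE 'wsrep_cluster_status';  -- expect: Primary
    SHOW STATUS LIKE 'wsrep_cluster_size';    -- expect: 1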
[Diagram: nodes shut down while non-primary, so their state is marked UNSAFE. The cluster is split into a majority group and a minority group; this time the majority group has the GOOD DATA.]
● Nodes in the majority group are already SHUTDOWN. Initiate SHUTDOWN of the nodes from the minority group.
● Fix grastate.dat for the nodes from the majority group (the consistency shutdown sequence has marked STATE=UNSAFE). A valid uuid can be copied over from a minority-group node; an example grastate.dat is sketched after these steps.
● Bootstrap the cluster using one of the nodes from the majority group and eventually get the other majority nodes to join.
● Remove grastate.dat on the minority-group nodes and restart them to join the newly formed cluster.
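For reference, a grastate.dat typically looks like the following; the uuid and seqno here are placeholders, and the safe_to_bootstrap field only exists in newer Galera versions:

    # GALERA saved state
    version: 2.1
    uuid:    6b2b1f1a-0000-0000-0000-000000000000
    seqno:   1234
    safe_to_bootstrap: 1

A valid uuid copied from a minority-group node goes into the uuid field; an unclean (UNSAFE) shutdown typically leaves seqno at -1.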
CLUSTER RESTORED
Scenario: Another aspect of data inconsistency
[Diagram: one of the nodes from the minority group tries to rejoin. The rejoining node has transactions up to X; the cluster nodes have transactions up to X - 1.]
● Transaction X caused the inconsistency, so it never made it to the cluster nodes; they stopped at X - 1, while the rejoining node has applied up to X.
● Membership is rejected, as the incoming node has one more transaction than the cluster state.
● The 2-node cluster is up and has started processing transactions, moving the state of the cluster from X to X + 3.
● Now the node gets membership and joins (through IST, even?): the node has transactions up to X and the cluster says it has transactions up to X + 3.
● Node joining doesn't evaluate data. It is all dependent on the seqno (see the sketch after this list).
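Since joining is decided purely on (uuid, seqno), a quick way to see the position a node would advertise, sketched with standard wsrep status variables:

    SHOW STATUS LIKE 'wsrep_cluster_state_uuid';  -- cluster uuid this node belongs to
    SHOW STATUS LIKE 'wsrep_last_committed';      -- highest seqno this node has committed

Two nodes can agree on both values and still hold different data, which is exactly the trap shown next.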
The user had failed to remove grastate.dat, and that is what caused all this confusion.
[Diagram: all three nodes now report trx-seqno=x, but the transaction behind that seqno on the rejoined node is different from the one the rest of the cluster applied; same seqno, different data.]
Cluster restored, just to enter more inconsistency (that may be detected in the future).
Scenario: Cluster doesn't come up on restart
● Avoid running node-local operations.
● If the cluster enters an inconsistent state, carefully follow the step-by-step guide to recover (don't fear SST; it is for your own good).
Scenario: Delayed purging
● Gcache is the staging area that holds replicated transactions.
● A transaction is replicated and staged in gcache on every node.
● Once all nodes have finished applying the transaction, it can be removed from gcache.
● Each node, at a configured interval, notifies the other nodes (the cluster) about its transaction-committed status.
● This is controlled by 2 conditions:
○ gcache.keep_pages_size and gcache.keep_pages_count
○ a static limit on the number of keys (1K), transactions (128), and bytes (128M)
● Accordingly, each node evaluates the cluster-level lowest watermark and initiates a gcache purge (a configuration sketch follows this list).
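A sketch of where these knobs live, assuming a my.cnf-style configuration; the option names match the slide, the values are purely illustrative:

    [mysqld]
    # keep_pages_size / keep_pages_count control how many overflow
    # pages (gcache.page.0000xx files) are retained after a purge
    wsrep_provider_options = "gcache.size=1G;gcache.keep_pages_size=0;gcache.keep_pages_count=0"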
Each node updates its local graph and evaluates the cluster purge watermark:

    N1_purged_upto: x+1
    N2_purged_upto: x+1
    N3_purged_upto: x

The watermark is the lowest reported position, so cluster-purge-water-mark = x, and accordingly all nodes will purge their local gcache up to x.
[Diagram: gcache pages being created and purged.]
Regularly, each node communicates its committed-upto watermark, and then, per the protocol explained above, purging initiates. In the log this looks like:

    New COMMIT CUT 2360 after 2360 from 1
    purging index up to 2360
    releasing seqno from gcache 2360
    Got commit cut from GCS: 2360
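Purge progress can also be observed from SQL; a sketch using a standard wsrep status variable:

    -- Lowest seqno still held in this node's gcache; it should move
    -- forward as commit cuts arrive and purging runs
    SHOW STATUS LIKE 'wsrep_local_cached_downto';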
[Diagram: one node STOPs processing transactions while gcache keeps growing.]
● Transactions start to pile up in gcache.
● FTWRL, RSU, or any other action that causes the node to pause and desync can trigger this (example statements are sketched after this list).
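For illustration, two common ways a node ends up paused and desynced; both are standard MySQL/Galera statements, shown here as a sketch:

    -- A backup-style global read lock (taken by some backup tools):
    FLUSH TABLES WITH READ LOCK;
    -- ... backup work ...
    UNLOCK TABLES;

    -- Explicitly desyncing a node (RSU does this implicitly):
    SET GLOBAL wsrep_desync = ON;
    -- ... maintenance ...
    SET GLOBAL wsrep_desync = OFF;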
● Given that one of the nodes is not making progress, it will not emit its transaction-committed status.
● This freezes the cluster-purge-water-mark at the lowest position for as long as the lock-down continues.
● This means that, even though the other nodes are making progress, their galera cache continues to pile up.
● Galera has protection against this: if the number of transactions continues to grow beyond certain hard limits, it forces a purge.
The built-in mechanism to force the purge is visible in the log; the purge can get delayed, but it does not halt:

    trx map size: 16511 - check if status.last_committed is incrementing
    purging index up to 11264
    releasing seqno from gcache 11264
[Diagram: the stopped node after the force purge is done.]
Purging means these entries are removed from the galera-maintained purge array. (Physical removal of the gcache.page.0000xx files is controlled by gcache.keep_pages_size and gcache.keep_pages_count.)
● All nodes should have the same configuration.
● Keep a close watch if you plan to run a backup operation or any other operation that can cause a node to halt.
● Monitor that each node is making progress by keeping a watch on wsrep_last_applied / wsrep_last_committed (a monitoring sketch follows this list).
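A minimal monitoring sketch, assuming you poll each node with the mysql client; the status variable names come from the slide, the polling policy is up to you:

    -- Run periodically on every node; both values should keep incrementing.
    -- A frozen value on one node while the others advance is a red flag.
    SHOW GLOBAL STATUS LIKE 'wsrep_last_committed';
    SHOW GLOBAL STATUS LIKE 'wsrep_last_applied';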
Scenario: Network latency and related failures