Real-time Replication in the Real World
Richard E. Baum
C. Thomas Tyler

Agenda
• Provide an overview of replication solutions
• Discuss relevant new 2009.2 features
• Review some real-world solutions

Terminology
• High Availability (HA)
  • Typical goal: keep Perforce online 24x7
• Disaster Recovery (DR)
  • Business continuity
  • Murphy’s Law insurance
• Recovery Point Objective (RPO)
  • Targeted maximum data loss in various failure scenarios
• Recovery Time Objective (RTO)
  • Targeted maximum time to recover from a failure

Terminology
• Archive Files
  • Contain all versioned and shelved files
• Metadata
  • All data in db.* files under P4ROOT
• Read-Only Replica
  • Copy of live Perforce DBs for read-only operations

Terminology
• Offline Checkpoint
  • Checkpoint created from replicated db.* files
• Perforce SDP (Server Deployment Package)
  • Server management scripts from Perforce Consulting
• DRBD (Distributed Replicated Block Device)
  • Keep your eyes open for emerging technologies!

High Availability Thinking
• We’re willing to invest in a more sophisticated deployment architecture to reduce unplanned downtime.
• We will not accept data loss for any Single Point of Failure (SPOF).
• Downtime is extremely expensive for us. We are willing to spend a lot to reduce the likelihood of downtime, and to minimize it when it is unavoidable.

High Availability Technologies
• Metadata:
  • Journal truncation (p4d -jj)
  • p4 replicate
  • DAS/RAID or fast SAN for metadata
• Archive Files:
  • SAN
  • p4 export – for metadata-driven archive updates

To Cluster, or Not To Cluster?
• Perforce is not a cluster-aware application
• Clustering adds complexity and cost
• Clustering can reduce downtime
• Simplifies automation of some failover tasks
  • DNS switchover
  • Automatically mounting SAN volumes
• Perforce SDP is designed to simplify cluster failover

Sample HA Deployment (w/SAN)

Sample HA Deployment (w/DAS)

Disaster Recovery Thinking
• We’re willing to invest in a more sophisticated deployment architecture to ensure business continuity in the event of a disaster.
• We need to ensure accessibility of our intellectual property, even in the event of a sudden and total loss of one of our data centers.

Disaster Recovery Technologies
• Metadata:
  • Journal truncation (p4d -jj)
  • p4 replicate
• Archive Files:
  • Rsync/Robocopy
  • Block-level WAN replication solutions
  • p4 export – for metadata-driven archive updates

Sample DR Deployment

Read-Only Replica Thinking
• We have automation that interacts with Perforce, such as continuous integration build systems or reports, that impacts performance on our primary server.
• We’re willing to invest in a more sophisticated deployment architecture to improve performance and increase our scalability.

Read-Only Replica Technologies
• Metadata:
  • p4 replicate with filtering wrappers
  • Optional p4broker for a transparent solution
    • Users always point to same P4PORT
• Archive Files:
  • Shared storage with primary server

Sample RO Replica (One Server)

Sample RO Replica (2 Servers + Broker)

Tools for Metadata Replication
• Classic journal truncation (p4d -jj)
• p4jrep (deprecated)
• p4 replicate (new in 2009.2)
• p4 export (new in 2009.2)

Replication Example #1 – to Journal

    #!/bin/bash
    P4MASTERPORT=perforce.myco.com:1742
    CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
    P4ROOT_REPLICA=/p4servers/replica/root
    REPSTATE=/p4servers/replica/root/rep.state

    # Pull journal records from the master and append them to a local
    # journal file for later replay. The state file records the position.
    p4 -p "$P4MASTERPORT" replicate \
      -s "$REPSTATE" \
      -J "$CHECKPOINT_PREFIX" \
      -o /p4servers/replica/logs/journal

Replication Example #2 – to DBs

    #!/bin/bash
    P4MASTERPORT=perforce.myco.com:1742
    CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
    P4ROOT_REPLICA=/p4servers/replica/root
    REPSTATE=/p4servers/replica/root/rep.state

    # Pull journal records from the master and hand them to a p4d process
    # that replays them from stdin ("-") into the replica's db.* files.
    p4 -p "$P4MASTERPORT" replicate \
      -s "$REPSTATE" \
      -J "$CHECKPOINT_PREFIX" -k \
      p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -

Replication Example #3 – Filtering

    #!/bin/bash
    P4MASTERPORT=perforce.myco.com:1742
    CHECKPOINT_PREFIX=/p4servers/master/checkpoints/myco
    P4ROOT_REPLICA=/p4servers/replica/root
    REPSTATE=/p4servers/replica/root/rep.state

    # As in Example #2, but grep drops db.have records before replay.
    # --line-buffered keeps records flowing instead of waiting on a full buffer.
    p4 -p "$P4MASTERPORT" replicate \
      -s "$REPSTATE" \
      -J "$CHECKPOINT_PREFIX" -k \
      grep --line-buffered -v '@db\.have@' | \
      p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -

Archive File Replication Solutions
• File level – Rsync/Robocopy
• Filesystem or block-level (DRBD, etc.)
• Commercial WAN replication solutions
• Metadata-driven using p4 export

Replication Race
• Metadata vs. Archive Files
  • Which data gets there first?
• Perfect Consistency
  • Could mean a higher recovery point objective (RPO).
  • Recovery state is clean for all recovered data.
• Minimum Data Loss
  • More metadata is preserved.
  • p4 verify errors point to lost archive files.

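When metadata wins the race, the revisions whose archive files never arrived can be listed from saved p4 verify output. The sketch below assumes that each affected revision appears as a line beginning with the depot path and revision and ending in MISSING!; check that against the output of your own server version. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch: read saved `p4 verify` output on stdin and print the depot
# path#rev of every revision reported as MISSING!.
# Assumed line shape: //depot/path#rev - action change N (type) digest MISSING!
missing_archives() {
    sed -n 's|^\(//[^#]*#[0-9][0-9]*\) - .* MISSING!$|\1|p'
}
```

For example, `missing_archives < verify.log` prints one path#rev per line, which can then be compared against what the archive replication has delivered.
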
Example 1: Classic DR
• Pre-2009.2 servers
• Classic journal truncation
• Commercial WAN replication technology
• Relaxed 8-hour recovery point objective (RPO)

Example 1: Classic DR

Example 1: Classic DR
Core approach was very straightforward:
• On the primary server
  • Run p4d -jj every 8 hours
  • Deposit journal files on same volume as archive files (gaining the benefit of free file transfer)
• On the DR server
  • Replay outstanding journals using p4d -jr
  • Perforce instance on spare always up
    • Its daily job is running p4 verify

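The DR-side replay step might look something like the sketch below. It assumes rotated journals are named <prefix>.jnl.<N> and that the caller tracks the number of the last journal already replayed. The function names and the P4D override variable are hypothetical, and compressed journals (for example .jnl.12.gz) are skipped rather than handled.

```shell
#!/bin/bash
# Sketch: list rotated journals newer than the last one replayed,
# in numeric order (so .jnl.10 sorts after .jnl.9).
pending_journals() {
    local prefix="$1" last="$2" f n
    for f in "$prefix".jnl.*; do
        n="${f##*.}"
        case "$n" in ''|*[!0-9]*) continue ;; esac   # skip non-numeric suffixes
        [ "$n" -gt "$last" ] && printf '%s %s\n' "$n" "$f"
    done | sort -n | cut -d' ' -f2-
}

# Replay each pending journal into the DR server's P4ROOT, stopping at the
# first failure so journals are never applied out of order.
replay_pending() {
    local root="$1" prefix="$2" last="$3" list j
    list=$(pending_journals "$prefix" "$last")
    while IFS= read -r j; do
        [ -n "$j" ] || continue
        "${P4D:-p4d}" -r "$root" -jr "$j" </dev/null || return 1
    done <<EOF
$list
EOF
}
```

A scheduled job on the DR server could call `replay_pending "$P4ROOT_REPLICA" "$CHECKPOINT_PREFIX" "$last"` and then record the highest journal number it applied.
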
Example 2: Real-Time Replication
• Suitable for HA or DR
• Using p4 replicate
  • Wraps the p4 replicate utility
  • Replication engine runs continuously
  • Leave changes in journal for later replay, or
  • Replay changes directly to replica P4ROOT
• Recovery Point Objective (RPO):
  • As low as 2 seconds for metadata
  • WAN replication for archive files

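One way to keep the replication engine running continuously is a small supervising wrapper that restarts the replication command whenever it exits. This is a minimal sketch, not the wrapper used in the deployments described here. The function name and its delay and max_runs parameters are hypothetical.

```shell
#!/bin/bash
# Sketch: run a command repeatedly, pausing between runs.
# delay    - seconds to wait before restarting
# max_runs - stop after this many runs; 0 means no limit
keep_running() {
    local delay="$1" max_runs="$2" runs=0 status
    shift 2
    while :; do
        "$@"
        status=$?
        runs=$((runs + 1))
        echo "replication command exited with status $status (run $runs)" >&2
        if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
            return 1
        fi
        sleep "$delay"
    done
}
```

It could wrap the command from Replication Example #2, for instance `keep_running 10 0 p4 -p "$P4MASTERPORT" replicate -s "$REPSTATE" -J "$CHECKPOINT_PREFIX" -k p4d -r "$P4ROOT_REPLICA" -f -b 1 -jrc -`. Because the state file records the replication position, a restart should resume where the previous run stopped.
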
Example 2: Real-Time Replication

Failover Automation
• Only automate tasks behind FAILOVER button
• Allow only a trained Perforce administrator to push the button.

Failover Automation

Failover Automation
• Perforce is not a cluster-aware application
• Clustering adds some value
  • Simplifies automation of
    • DNS switchover
    • SAN mount transfers
    • etc.
• Offline checkpoints can be beneficial
  • After failover, db.* files may be in an unknown state

Just A Bit More About Failover
• It’s Complicated!
  • Simulation of hardware failures is non-trivial
  • There is a limit to how much confidence you should gain from testing.
• No substitute for a trained administrator
  • Can analyze failures
  • Determine the best course of action

Example 3: Read-only Replica
• Use Filtered Replication
  • Basic grep (with line buffering)
    • For filtering one-liner journal entries like db.have
  • More sophisticated filtering
    • Needed for journal entries that span multiple lines
    • Perforce Public Depot has a good example:
      //guest/michael_shields/src/p4jrep/awkfilter.sh

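To illustrate why multi-line entries need more than grep, here is a minimal awk-based sketch. It is not the awkfilter.sh script from the Public Depot. It assumes each journal record starts on its own line with the table name as its third field (as in @pv@ 3 @db.have@ ...), and that text fields are quoted with @ and a literal @ is doubled, so a record is complete once it has contained an even number of @ characters. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch: drop every journal record for one table, including records
# whose quoted text fields continue onto following lines.
filter_table() {
    local table="$1"   # e.g. db.have
    awk -v marker="@${table}@" '
        {
            # A line read while no record is open starts a new record;
            # decide once, here, whether the whole record is dropped.
            if (!in_record) skip = ($3 == marker)
            line = $0
            n = gsub(/@/, "@", line)                # count @ on this line
            if (n % 2 == 1) in_record = !in_record  # odd count opens or closes a field
            if (!skip) { print; fflush() }
        }'
}
```

In Replication Example #3, a script containing this filter could take the place of the grep command. The fflush() call plays the same role as grep's --line-buffered option.
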
Example 3: Read-only Replica
• For Continuous Integration/Build Farms
• Define how users will connect to the replica
  • Simple (for administrators):
    • Modify build scripts to use appropriate P4PORT values
    • Point users at appropriate P4PORT depending on task
  • Simple (for end users):
    • All users use p4broker P4PORT
    • p4broker routes requests to appropriate server instance
      • Either the live server or the read-only replica

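For the "simple for administrators" approach, a build script could choose its P4PORT per command, roughly as sketched below. The replica address, the function names, and the list of commands treated as read-only are all assumptions for illustration; adjust them to your site. A p4broker makes the same routing decision on the server side, so users need only one P4PORT.

```shell
#!/bin/bash
# Hypothetical addresses; substitute your own.
P4PORT_MASTER=${P4PORT_MASTER:-perforce.myco.com:1742}
P4PORT_REPLICA=${P4PORT_REPLICA:-replica.myco.com:1742}

# Sketch: print the server a given p4 command should be sent to.
# Anything not known to be read-only goes to the live server.
p4port_for() {
    case "$1" in
        changes|describe|dirs|files|filelog|fstat|print)
            printf '%s\n' "$P4PORT_REPLICA" ;;
        *)
            printf '%s\n' "$P4PORT_MASTER" ;;
    esac
}

# Convenience wrapper for build scripts: p4r files //depot/...
p4r() {
    p4 -p "$(p4port_for "$1")" "$@"
}
```

Defaulting unknown commands to the live server keeps a mistake in the list from sending a write to the replica.
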
Example 3: Read-only Replica
• Make Archive Files Available on Replica
  • Multiple Server Machines, Master & Replica
    • Use a SAN or other shared storage solution
    • Files mounted read-only on the replica
  • Run Replica instance on Primary server
    • Works if hardware is powerful enough
    • Run replica under different login
      • Cannot write to the archived files

Review of RO Replica

Summary
• Advanced replication solutions
  • Easier with p4 replicate and p4 export
• Typical Uses:
  • High Availability
  • Disaster Recovery
  • Read-only Replicas
• Perforce Technical Support can help!
• Perforce Consulting can help, too!

Demo

Q & A