Cementing High Availability in OpenFlow with RuleBricks Dan Williams & Hani Jamjoom IBM T. J. Watson Research Center
Planning For Failure: shift towards highly elastic applications; failure is elasticity's evil twin, e.g., "Chaos Monkey" by Netflix.
Focus on One Question: How can high availability policies be added to OpenFlow’s forwarding rules?
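Assumed Environment [Figure: flows enter an OpenFlow switch managed by a controller and are forwarded to replicas VM 1, VM 2, ..., VM N; the switch's rule set holds the active rules plus a backup plan, i.e., the flow reassignment if VM 2 fails.]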
Goals: embed failure plans in existing environments; limit flow reassignments; limit rule explosion; keep it simple.
Exploit Two Features: the hierarchical structure of OpenFlow’s wildcard rules; precedence in rule execution.
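Brick Representation: the binary tree of wildcard prefixes is redrawn as bricks laid over the IP address space. Brick width signifies address coverage. The number of bricks depends on the width of the bricks. [Figure: binary tree = bricks, both labeled *, 0*, 1*, 00*, 01*.]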
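The relation between a wildcard prefix and the width of its brick can be sketched in a few lines of Python. This is only an illustration, assuming a toy 3-bit address space instead of 32-bit IP addresses; the helper name `brick_span` is hypothetical and not taken from the RuleBricks implementation.

```python
def brick_span(prefix, bits=3):
    """Return (first_address, width) covered by a wildcard prefix such as '01*'.

    Assumption: addresses are `bits` bits wide (3 here, purely for illustration).
    """
    fixed = prefix.rstrip("*")        # '01*' -> '01', '*' -> ''
    free = bits - len(fixed)          # number of wildcarded low-order bits
    if free < 0:
        raise ValueError("prefix is longer than the address width")
    start = (int(fixed, 2) << free) if fixed else 0
    return start, 1 << free


for p in ["*", "0*", "1*", "00*", "01*"]:
    start, width = brick_span(p)
    print(f"{p:>4}: starts at {start}, width {width}")
```

Each extra fixed bit halves the brick, so narrower bricks mean more bricks are needed to cover the same address range.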
Flow Assignment: use color to represent replica assignment. Flows matching 0* will be assigned to the GREY replica. [Figure: flows a, b, c, d over bricks 110*, 10*, 1*, 0*, *.]
Active vs. Backup Rules: flows will match the top bricks, which form the active rule set. [Figure: flows a, b, c, d, e over bricks 110*, 0*, 10*, 1*, 01*, 00*, *.]
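A minimal sketch of "flows match the top bricks", assuming a stack of bricks is stored as a bottom-to-top list and that height in the stack plays the role of rule precedence. The replica colors given to each brick are made up for illustration, since the slide's coloring does not survive in text; `active_replica` is a hypothetical helper.

```python
def matches(prefix, addr):
    """True if the bit string addr falls under a wildcard prefix like '10*'."""
    return addr.startswith(prefix.rstrip("*"))


def active_replica(stack, addr):
    """Return the replica of the top-most brick covering addr.

    stack is ordered bottom to top; bricks lower down act as backup rules.
    """
    for prefix, replica in reversed(stack):
        if matches(prefix, addr):
            return replica
    return None


# Hypothetical coloring of the bricks named on the slide.
stack = [("*", "BLACK"), ("0*", "GREY"), ("1*", "BLACK"),
         ("10*", "GREY"), ("110*", "BLACK")]

for addr in ["010", "100", "110", "111"]:
    print(addr, "->", active_replica(stack, addr))
```

Only the top-most matching brick decides where a flow goes; everything underneath stays dormant until the bricks above it are removed.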
Remember Tetris? RuleBricks has 3 operations.
3 replicas (VM 1, VM 2, VM 3); note their colors.
#1. Drop: add VM 3, the WHITE replica. Flows matching 000* will be assigned to the WHITE replica. [Figure: flows a, b, c, d over bricks 101*, 000*, 100*, 111*, 10*, 0*, 1*, 01*, 00*, *.]
Reassignment on Failure: VM 2 dies and the GREY bricks vanish. [Figure: flows a, b, c, d, e over the remaining bricks 000*, 101*, 100*, 111*, 10*, 00*, *.]
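The failure step can be sketched by removing every brick of the failed replica and looking flows up again. This is an illustrative model only: the stack contents, the colors, and the helpers `lookup` and `fail` are assumptions, not the slide's exact configuration.

```python
def lookup(stack, addr):
    """Replica of the top-most brick (stack is bottom to top) covering addr."""
    for prefix, replica in reversed(stack):
        if addr.startswith(prefix.rstrip("*")):
            return replica
    return None


def fail(stack, replica):
    """Return a new stack in which all bricks of the failed replica vanish."""
    return [brick for brick in stack if brick[1] != replica]


# Hypothetical stack: GREY stands for the replica that is about to fail.
stack = [("*", "BLACK"), ("00*", "BLACK"), ("01*", "GREY"),
         ("1*", "GREY"), ("000*", "WHITE"), ("10*", "BLACK")]
flows = ["000", "010", "100", "110"]

before = {f: lookup(stack, f) for f in flows}
after = {f: lookup(fail(stack, "GREY"), f) for f in flows}
moved = [f for f in flows if before[f] != after[f]]
print("reassigned flows:", moved)
```

In this sketch only the flows that were on GREY move; flows on surviving replicas keep their assignment, which is the "limit flow reassignments" goal.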
#2. Insert: set WHITE to take over GREY's portion of 1*. [Figure: flows a, b, c, d, e over bricks 101*, 100*, 10*, 111*, 000*, 0*, 1*, 01*, 1*, 00*, *.]
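One way to picture the insert operation is placing a backup brick directly beneath an existing brick, so that it becomes visible only when the brick above disappears. The sketch below assumes the same bottom-to-top list model; `insert_below` and the two-brick starting stack are hypothetical.

```python
def lookup(stack, addr):
    """Replica of the top-most brick (stack is bottom to top) covering addr."""
    for prefix, replica in reversed(stack):
        if addr.startswith(prefix.rstrip("*")):
            return replica
    return None


def insert_below(stack, target, brick):
    """Return a new stack with brick placed directly beneath target."""
    i = stack.index(target)
    return stack[:i] + [brick] + stack[i:]


def fail(stack, replica):
    """Return a new stack without the failed replica's bricks."""
    return [b for b in stack if b[1] != replica]


stack = [("*", "BLACK"), ("1*", "GREY")]
planned = insert_below(stack, ("1*", "GREY"), ("1*", "WHITE"))

print(lookup(planned, "110"))                 # GREY is still on top
print(lookup(fail(stack, "GREY"), "110"))     # without the insert
print(lookup(fail(planned, "GREY"), "110"))   # with the insert
```

The active assignment is unchanged by the insert; only the outcome after a GREY failure differs (BLACK without the backup brick, WHITE with it).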
#3. Reduce: the transformations are fragment/defragment and duplicate/deduplicate. Use a greedy algorithm that fragments/defragments, then deduplicates, starting with the least exposed brick. [Figure: fragment/defragment between 10* and the pair 100*, 101*; duplicate/deduplicate between stacked copies of 101*.]
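The individual transformations can be sketched on (prefix, replica) pairs as below. This is a guess at their semantics from the slide labels, not the authors' code: deduplication is assumed to mean merging vertically adjacent identical bricks, and the greedy ordering by exposure is not implemented.

```python
def fragment(brick):
    """Split a brick into its two half-width children, e.g. 10* -> 100*, 101*."""
    prefix, replica = brick
    stem = prefix.rstrip("*")
    return [(stem + "0*", replica), (stem + "1*", replica)]


def defragment(left, right):
    """Merge two sibling bricks of the same replica into their parent."""
    (p0, r0), (p1, r1) = left, right
    s0, s1 = p0.rstrip("*"), p1.rstrip("*")
    siblings = (len(s0) == len(s1) and len(s0) > 0
                and s0[:-1] == s1[:-1] and s0[-1] != s1[-1])
    if r0 != r1 or not siblings:
        raise ValueError("bricks are not same-replica siblings")
    return (s0[:-1] + "*", r0)


def deduplicate(stack):
    """Drop bricks identical to the brick directly beneath them.

    stack is ordered bottom to top; lookups are unaffected because the
    duplicates match the same flows and vanish together on failure.
    """
    out = []
    for brick in stack:
        if not out or out[-1] != brick:
            out.append(brick)
    return out


print(fragment(("10*", "GREY")))
print(defragment(("100*", "GREY"), ("101*", "GREY")))
print(deduplicate([("1*", "BLACK"), ("101*", "GREY"), ("101*", "GREY")]))
```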
Example: from the start configuration, apply 1. Fragment, 2. Deduplicate, 3. Defragment. [Figure: the brick stack at each step, built from bricks 1*, 10*, 11*, 100*, 101*.]
What Can We Do With RuleBricks? Encode Chord's assignment policy: hash-based assignment of nodes on a ring representing the IP address space. When a node is removed, its successor node takes over.
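For readers unfamiliar with the policy, here is a small sketch of Chord-style successor assignment on a ring. It shows the policy itself, not how RuleBricks encodes it in bricks. Assumptions: node positions are fixed by hand rather than hashed so the output is predictable, the ring has 128 positions, and the `Ring` class is hypothetical.

```python
import bisect


class Ring:
    """Toy ring: each address is owned by the first node at or after it."""

    def __init__(self, positions, size=128):
        self.size = size
        # Sorted (position, node) pairs; a real Chord ring would hash node IDs.
        self.points = sorted((pos % size, node) for node, pos in positions.items())

    def owner(self, addr):
        i = bisect.bisect_left(self.points, (addr % self.size, ""))
        return self.points[i % len(self.points)][1]   # wrap around the ring

    def remove(self, node):
        self.points = [(p, n) for p, n in self.points if n != node]


ring = Ring({"VM1": 10, "VM2": 40, "VM3": 90})
print(ring.owner(25), ring.owner(100))   # VM2 owns 25; 100 wraps around to VM1
ring.remove("VM2")
print(ring.owner(25), ring.owner(100))   # successor VM3 takes over 25
```

Only the addresses owned by the removed node change hands, and they all go to its successor.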
Encode Node [Figure: a Chord ring with one replica, ring positions labeled 00*, 01*, 10*, 11*, next to its RuleBricks form: a single brick * that covers the entire IP address space.]
Adding a Replica: drop active rule bricks; insert backup rule bricks. [Figure: ring positions 00*, 01*, 10*, 11* and the brick stack with an active rule 00* above *, and a backup rule beneath.]
Adding a Third Replica [Figure: ring and brick stack; active rules and backup bricks built from 00*, 01*, 11*, 100*, 101*, *.]
Adding a Virtual Node [Figure: ring and brick stack built from 00*, 01*, 11*, 100*, 101*, *.]
Reducing Rules [Figure: ring and reduced brick stack built from 0*, 00*, 01*, 11*, 100*, 101*, *.]
Implementation: implemented RuleBricks in Python; looked at rule explosion for the active rule set; looked at the implications of fixed- vs. variable-sized bricks. [Figure: fixed-sized RuleBricks vs. variable-sized RuleBricks, with bricks labeled 00* and 01*.]
Brick-based Rule Reduction: use 16 virtual nodes per replica.
Conclusion: failure is elasticity’s evil twin; RuleBricks simplifies planning for failure; RuleBricks can embed different flow assignment policies.