Catalyst: Unlocking Power of Choice to Speed up Network Updates
Rohan Gandhi, Ori Rottenstreich, Xin Jin
Network Update Cycle
• Network controller: compute network target state
• Network updater: update network to target state
[Figure: centralized controller loop; labels: current state (traffic, topology etc.), target state (rules)]
Network Update Requirements
Fast
• Reduce failure impact
• Improve network optimality
Consistent
• No congestion
• No traffic blackhole
• No loop
Dependency Graph in Existing Network Updaters
[Figure: initial state and target state of flows f1-f6 (f1:4, f2:5, f3:4, f4:6, f5:5, f6:6) on switches S1-S5; link capacity = 10 units]
• Naive (one-shot) update plan (can cause inconsistencies): Start → Mv f1, f4, f2
• Consistent network update plan: Start → Mv f1 → Mv f4 → Mv f2
B4 (SIGCOMM 2013): Network update takes 3-5x more time than the controller
Existing network updaters assume the target state cannot be changed
Limitation 1: Long Update Plans
[Figure: initial state and target state of flows f1-f6 on switches S1-S5]
Update plan: Start → Mv f1 → Mv f4 → Mv f2
The controller has no information about the length (longest path) of the dependency graph
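The length of an update plan can be read off the dependency graph as its longest path. A minimal Python sketch, assuming the graph is available as a dict that maps each operation to the operations it must wait for; the `deps` layout, the operation names, and `plan_length` are illustrative and not taken from Catalyst:

```python
from functools import lru_cache

def plan_length(deps):
    """Return the number of sequential stages (longest chain) in a dependency DAG.

    deps maps each update operation to the list of operations that must
    finish before it can start. Assumes the graph is acyclic.
    """
    @lru_cache(maxsize=None)
    def depth(op):
        # An operation with no prerequisites sits in stage 1.
        return 1 + max((depth(p) for p in deps.get(op, ())), default=0)

    return max((depth(op) for op in deps), default=0)

# Chain from the slide: f1 moves first, then f4, then f2.
chain = {"mv_f1": [], "mv_f4": ["mv_f1"], "mv_f2": ["mv_f4"]}
print(plan_length(chain))  # 3
```

A controller that had this number at hand could compare candidate target states by the plan length they induce, which is the information the slide says is missing today.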
Limitation 2: Inability to Effectively Tackle Stragglers
• Straggling switches can take 10x or more time to update rules
• Even a single update through a straggling switch can delay an entire stage in the dependency graph
Dionysus, SIGCOMM 2014
Catalyst
• Redundancy in the network offers power of choice
• Use power of choice to change the target state and speed up network update
Key idea 1: Shorten Dependency Graph
[Figure: initial state, initial target state, and modified target state of flows f1-f6 on switches S1-S5. Update plan for the initial target state: Start → Mv f1 → Mv f4 → Mv f2. Update plan for the modified target state: the same moves arranged in fewer stages.]
(+) Merging stages reduces the number of stages, which reduces update time
(-) Merging stages increases the number of updates in a stage, which increases the probability of a straggler
Merging stages always reduces update time (proof in paper)
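One way to see the trade-off is a simple timing model, assumed here for illustration and not necessarily the one used in the paper's proof: updates within a stage run in parallel, so a stage lasts as long as its slowest update, and stages run back to back. Under that model, merging two stages replaces max(a) + max(b) with the maximum over both stages, which cannot be larger when all times are non-negative. The per-update times and function names below are made up.

```python
def plan_time(stages):
    """Total update time: stages run sequentially, updates within a stage in parallel."""
    return sum(max(stage) for stage in stages if stage)

def merge_stages(stages, i):
    """Return a new plan with stage i and stage i+1 merged into one stage."""
    merged = stages[i] + stages[i + 1]
    return stages[:i] + [merged] + stages[i + 2:]

# Hypothetical per-update times (ms); 50.0 plays the straggler.
plan = [[5.0], [6.0, 50.0], [4.0]]
merged = merge_stages(plan, 1)

print(plan_time(plan))    # 59.0
print(plan_time(merged))  # 55.0
```

The sketch only models timing. Whether two stages may actually be merged depends on the modified target state removing the dependency between them.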
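Key idea 2: Multiple Paths to Tackle Stragglers
• Assign a single flow to multiple (equally optimal) paths
• Try all paths for a flow in parallel
[Figure: update time of a flow from ingress S1 to egress S6 in the presence of a straggler; Default updates the path through S2 and S3, Catalyst updates the paths through S2, S3 and through S4, S5 in parallel]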
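A small sketch of why trying paths in parallel helps, under an assumed model: a path is ready once every switch on it has installed its rule, and the flow can move as soon as any one of its candidate paths is ready. The switch names follow the slide, but the exact paths, the choice of S2 as the straggler, and the install times are invented for illustration.

```python
def path_ready_time(path, install_time):
    """A path is usable once its slowest switch has installed the rule."""
    return max(install_time[sw] for sw in path)

def flow_update_time(paths, install_time):
    """With all candidate paths updated in parallel, the first ready path wins."""
    return min(path_ready_time(p, install_time) for p in paths)

# Hypothetical install times (ms); S2 is the straggler.
install_time = {"S1": 10, "S2": 100, "S3": 10, "S4": 12, "S5": 11, "S6": 10}

default_paths = [["S1", "S2", "S3", "S6"]]
catalyst_paths = default_paths + [["S1", "S4", "S5", "S6"]]

print(flow_update_time(default_paths, install_time))   # 100
print(flow_update_time(catalyst_paths, install_time))  # 12
```

The straggler still finishes late, but it no longer gates the flow because the alternate path avoids it.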
Key Challenges
• Which flows to move to alternate paths?
• How many alternate paths to choose?
• How to compute alternate paths?
• How to schedule alternate paths?
Evaluation Setup
• Load-balancer in datacenter (similar to “Incremental Consistent Updates”, HotSDN 2013)
• Datacenter topology:
  • 100 ToRs in 50 containers with 10Gbps links
  • Number of flows = 10K
• Load balancer settings:
  • Assigns flows to replicas uniformly
  • Initial number of servers = 100
• Update settings:
  • We fail 2 servers and reassign flows to remaining replicas
Evaluation
“s” = fraction of redundancy compared to total capacity
• (s=0.1) Median improvement in network update latency
  • (max. Path = 1), improvement = 1.22x
  • (max. Path = 2), improvement = 1.43x
• (s=0.2) Median improvement in network update latency
  • (max. Path = 1), improvement = 1.32x
  • (max. Path = 2), improvement = 1.65x
Increasing the number of paths per flow:
(+) more alternatives to tackle a straggler
(-) fewer flows assigned to alternate paths
Conclusion
• Existing network update solutions assume inflexibility in changing target state
  • Limitation 1: Unnecessarily long network update dependency graphs
  • Limitation 2: Cannot handle stragglers effectively
• Catalyst: Speeds up network updates by exploiting redundancy in the network
  • Merges stages in the dependency graph using redundant paths
  • Assigns redundant paths to individual flows to tackle stragglers
• Evaluation using load balancer settings shows speed-up of up to 1.65x