Concurrent Self-Adjusting Distributed Tree Networks
Bruna Peres, Stefan Schmid, Chen Avin, Olga Goussevskaia
Motivation
• New technologies allow communication networks to be increasingly flexible and reconfigurable
• Traditional network designs are still optimized toward static metrics
DISC 2017, Concurrent Self-Adjusting Distributed Tree Networks
ProjecToR
• ProjecToR: Agile Reconfigurable Data Center Interconnect. Ghobadi et al., SIGCOMM'16
Self-Adjusting Data Structures
• Self-adjusting networks ↔ self-adjusting data structures
• Splay Trees
[Figure: splay tree before and after a request, rotating the requested node toward the root]
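To make the splay-tree analogy concrete, here is a minimal splay-to-root sketch in Python. It is illustrative only (not the authors' code): a requested key is moved to the root via the classic zig, zig-zig, and zig-zag rotations.

```python
# Minimal splay tree sketch: splay(root, key) moves the node holding `key`
# to the root using zig / zig-zig / zig-zag rotations.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(root, key):
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                 # zig-zig (left-left)
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:               # zig-zag (left-right)
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if key < root.right.key:                # zig-zag (right-left)
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        elif key > root.right.key:              # zig-zig (right-right)
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        return root if root.right is None else rotate_left(root)

def insert(root, key):                          # plain BST insert, for setup
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root
```

Repeatedly requested keys end up near the root, which is exactly the locality property the network variant exploits.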
SplayNets
• S. Schmid et al., SplayNet: Towards Locally Self-Adjusting Networks. IEEE/ACM Transactions on Networking, 2016.
[Figure: tree topology before and after serving a request for node h]
SplayNets
• Distributed tree network
• Improves the communication cost between two nodes in a self-adjusting manner
• Nodes communicating more frequently become topologically closer to each other over time
• Lowest common ancestor LCA(u,v): locality is preserved
[Figure: nodes u and v in the subtrees below their lowest common ancestor LCA(u,v)]
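SplayNets keep nodes in search-tree order by identifier, so LCA(u,v) can be located by key comparisons alone: it is the first node on the way down whose id lies between u and v. A small stand-in sketch (the Node layout and example tree are assumptions, not the talk's figures):

```python
# LCA by key comparison in a search-tree-ordered network.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def lca(root, u, v):
    """Return the lowest common ancestor of ids u and v, or None."""
    lo, hi = min(u, v), max(u, v)
    node = root
    while node is not None:
        if node.key < lo:
            node = node.right       # both targets are to the right
        elif node.key > hi:
            node = node.left        # both targets are to the left
        else:
            return node             # lo <= node.key <= hi: this is LCA(u, v)
    return None

# Small stand-in tree (ids 1..7 in BST order):
tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
```

Because rotations preserve search-tree order, this routing rule keeps working while the topology self-adjusts.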
Our Contributions
• While SplayNets are inherently intended for distributed applications, so far only sequential algorithms for maintaining them were known
• We present DiSplayNets, the first distributed and concurrent implementation of SplayNets
Model
• Network model:
  • Binary tree T comprised of a set of n communication nodes
• Sequence of communication requests σ = (σ_1, σ_2, …, σ_m):
  • σ_i = (s, d)
  • t_b(σ_i) and t_e(σ_i): the begin and end times of request σ_i
• Given σ_i = (s, d), s and d rotate in parallel towards LCA(s, d)
• LCA might change over time
DiSplayNet
• State machine executed by each node in parallel, with states Passive, Climbing, Waiting, and Communicating
[Figures: three snapshots of the tree as a request for node h is served and nodes change state]
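The per-node state machine can be sketched as an enum with a transition table. The four state names are from the slide; the specific transition edges below are an assumption for illustration, not the paper's exact protocol.

```python
from enum import Enum, auto

class State(Enum):
    PASSIVE = auto()        # node not involved in any active request
    CLIMBING = auto()       # node rotating upward toward the LCA
    WAITING = auto()        # node yielding to a higher-priority operation
    COMMUNICATING = auto()  # node has met its partner; request being served

# Assumed transition edges (illustrative only):
TRANSITIONS = {
    State.PASSIVE: {State.CLIMBING},
    State.CLIMBING: {State.WAITING, State.COMMUNICATING},
    State.WAITING: {State.CLIMBING},
    State.COMMUNICATING: {State.PASSIVE},
}

def step(state, target):
    """Take a transition if it is allowed, else stay put."""
    return target if target in TRANSITIONS[state] else state
```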
Local reconfigurations
• Three local rotations, as in splay trees: zig, zig-zig, and zig-zag
[Figure: subtrees T1..T4 rearranged around nodes u, v, w, z by the zig, zig-zig, and zig-zag rotations]
Local reconfigurations
[Figure: zig-zig rotation of node u, with β(u) marking the neighborhood of nodes involved in u's rotation]
DiSplayNet
• In order to ensure deadlock and starvation freedom, concurrent operations are performed according to a priority: σ_i = (s_i, d_i) has priority over σ_k = (s_k, d_k) if t_b(σ_i) < t_b(σ_k), i.e., older requests win
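The priority rule is just a total order on requests by begin time. A sketch of such an order, with a tie-break by a unique request id (the tie-break and record layout are assumptions added here for illustration):

```python
from dataclasses import dataclass, field

# Requests compare by (t_b, rid): earlier begin time wins, unique id breaks ties.
@dataclass(order=True)
class Request:
    t_b: int                      # begin time t_b(σ_i)
    rid: int                      # unique request id (assumed tie-breaker)
    src: str = field(compare=False)   # s_i, excluded from ordering
    dst: str = field(compare=False)   # d_i, excluded from ordering
```

Sorting any set of pending requests with `sorted()` then yields the order in which conflicting operations are granted, which is what rules out both deadlock (a global order) and starvation (every request eventually becomes oldest).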
Algorithm
• The algorithm is executed in rounds; within a round, a request σ_k must yield to σ_i whenever t_b(σ_i) < t_b(σ_k)
[Figures: phases 1 through 5 of one round, in which the climbing nodes s_i and d_k contact their rotation neighborhoods β(s_i) and β(d_k), collect acks, and then perform their rotations]
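A highly simplified, single-round sketch of the ack-then-rotate idea: each climbing node must be granted every node in its rotation neighborhood β(u), grants go to the oldest request first (mirroring the priority rule), and only fully granted nodes rotate. The function names and data shapes are assumptions for illustration, not the paper's pseudocode.

```python
def run_round(climbers, beta, locks):
    """One round of the ack phase (sketch).

    climbers: list of (t_b, node) pairs, one per climbing node
    beta:     node -> set of nodes in its rotation neighborhood β(node)
    locks:    node -> current owner (None if free); mutated in place
    Returns the set of nodes allowed to rotate this round.
    """
    granted = set()
    for t_b, u in sorted(climbers):              # oldest request first
        zone = beta[u]
        if all(locks.get(v) is None for v in zone):
            for v in zone:
                locks[v] = u                     # v "acks" u's rotation
            granted.add(u)
        # else: u enters the Waiting state until a later round
    return granted
```

Overlapping neighborhoods are thus serialized by age, so two rotations never touch the same link concurrently.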
DiSplayNets
• Self-adjust to the communication pattern in a fully decentralized manner
• Starvation free
• Deadlock free
Future Work
• Analyze the efficiency
• Work cost: W(DiSplayNet, T_0, σ) = Σ_{σ_i ∈ σ} w(σ_i)
• Time cost:
  • Request delay: t_d(σ_i) = t_e(σ_i) − t_b(σ_i)
  • Makespan: T(T_0, σ) = max_{σ_i ∈ σ} t_e(σ_i) − min_{σ_i ∈ σ} t_b(σ_i)
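The three metrics above are straightforward to compute from per-request records. A sketch, assuming each record carries its rotation count w and its begin/end times t_b, t_e (the record format is an assumption, not the paper's data structure):

```python
# Cost metrics over a trace of served requests.
def work(requests):
    """W(DiSplayNet, T_0, σ) = sum of per-request work w(σ_i)."""
    return sum(r["w"] for r in requests)

def delays(requests):
    """Request delays t_d(σ_i) = t_e(σ_i) - t_b(σ_i)."""
    return [r["t_e"] - r["t_b"] for r in requests]

def makespan(requests):
    """T(T_0, σ) = max t_e(σ_i) - min t_b(σ_i) over all requests."""
    return (max(r["t_e"] for r in requests)
            - min(r["t_b"] for r in requests))
```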
Progress Matrix
[Figures: progress matrix indexed by rounds t_i, t_{i+2}, …, t_{i+5}, tracking each request over time; makespan and work can be read off its dimensions]
Future Work
• Our simulations show first promising results
• ProjecToR data:
  • 128 nodes randomly selected from 2 production clusters (running a mix of workloads, including MapReduce-type jobs, index builders, and database and storage systems)
  • 1000 requests
  • Arrivals modeled as a Poisson process
Future Work
• Our simulations show first promising results for:
  • Individual work CDF
  • Total work
  • Request delay
  • Makespan
[Figures: simulation plots, one per metric]
Thank you
Bruna Peres
bperes@dcc.ufmg.br