Reliable Multicast − topics

- critical applications may require some guarantees about the delivery of messages to the group members: financial transactions, monitoring and management of industrial plants, file transfer, conferencing...
- reliable multicast in real systems: SRM protocol
- what does "reliable multicast" really mean? formal problem definition; hierarchy of problems
- how do failures affect reliable transmission? definition of (hierarchy of) failure models
- example algorithms to solve the reliable multicast problem

Rossi − Pagani A.A. 2003−2004 SRM

Reliable Multicast − introduction

- TCP supplies a reliable e2e unicast transport service
- more than reliable: connection−oriented!
- unsuitable for multicast:
  - heterogeneous recipients
  - recipients may join/leave the session at different times
  - membership monitoring for connection opening/closing
  - decide whether joining receivers should start receiving data from the beginning of the transmission or not
  - virtual synchrony: 1 tick at every group membership change
  - (network) failures could affect several (neighboring) recipients at one time

Reliable Multicast − introduction

- recipients could receive different sets of messages and have different requirements for congestion control
- ACK implosion at the sender
- monitoring of the reception state for each recipient: n windows
- estimation of the round−trip delay for each recipient
- hence: appropriate original protocols are needed
- receiver−driven approach
- no ACKs: receivers ask for lost messages, under the assumption that losses are not frequent!

Scalable Reliable Multicast

- compliance with, and exploitation of, the TCP/IP stack
- minimal: "eventual delivery of all the data to all the group members"
- more complex problems (namely, ordering) left to upper layers
- Warning! Which processes are group members?!
- parametrized: performance can be optimized depending on the application communication pattern and semantics
- adaptive algorithm for unknown topology or changing membership
- no knowledge of the group membership or of the source's identity

System model

- sources and recipients belong to the same group G
- (persistent) naming of the data units
- no wrap−around problem, as there is when units are numbered
- applications such that operations are idempotent
- reception of duplicate msgs doesn't jeopardize the application /* duplicate filtering easy to add */
- IP multicast available; clocks synchronized via NTP
- symmetric paths assumed, in order to estimate the round−trip time

SRM: communication pattern

let d_XY be the e2e delay between two nodes X and Y

receiver A detects a lost msg m generated by the source S:
    set random request timer tq_A in [C1 d_SA, (C1+C2) d_SA]
    if (req for m received from another node C before the timer expiration)
    then suppress your own request; tq_A = 2 tq_A; wait for the reply
    else multicast request and wait (2 tq_A) for the reply

node B has received m and receives a repair request for m from A:
    set random repair timer in [D1 d_AB, (D1+D2) d_AB]
    if (repair for m received from another node)
    then suppress repair
    else multicast repair; ignore further requests for 3 d_SB
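
The timer rules of the communication pattern can be sketched in a few lines. This is only an illustration of the intervals stated above, not the SRM implementation: the function names are hypothetical, delays are plain floats in arbitrary time units, and the network itself is not modeled.

```python
import random


def request_timer(d_sa, c1, c2, rng=random):
    """Draw tq_A uniformly in [C1*d_SA, (C1+C2)*d_SA]."""
    return rng.uniform(c1 * d_sa, (c1 + c2) * d_sa)


def repair_timer(d_ab, d1, d2, rng=random):
    """Draw the repair timer of B uniformly in [D1*d_AB, (D1+D2)*d_AB]."""
    return rng.uniform(d1 * d_ab, (d1 + d2) * d_ab)


def backoff(tq):
    """A request for the same msg was heard first: suppress and double tq_A."""
    return 2 * tq


def ignore_window(d_sb):
    """After multicasting a repair, B ignores further requests for 3*d_SB."""
    return 3 * d_sb
```

With C2 = 0 the interval collapses to a single point, so `request_timer(3.0, 1.0, 0.0)` is deterministic; this is the setting used for the bus topology.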

SRM: discussion

- lost msgs (last ACK) detected by exchanging session msgs
- periodic state report (as in RTCP), also used to estimate the e2e delay
- wait before sending a request: duplicate suppression
- if other reqs are multicast earlier, the request timer is increased to reduce the duplication probability
- same mechanism to suppress duplicate repairs
- every node can repair the loss: load distribution
- successive requests are temporarily ignored to account for the network transmission delay (request sent while the repair is on its way)

SRM: discussion

- duplicate suppression reduces the communication overhead
- the longer a node waits before sending a req, the more effective the suppression
- a long wait negatively affects repair promptness
- the values of C1, C2, D1, D2 affect the network performance
- high C1: longer wait before repair; high C2: lower probability of duplicate requests /* the same for D1, D2 */
- for regular topologies, optimal values can be found
- in the sequel we assume a uniform topology: all links with delay 1

Optimization for bus topology: deterministic suppression

[Figure: bus topology with the source (src) and nodes 1 to 8; X marks the loss; time labels t to t+5; annotations: loss detected, req at t+2, repair at t+4]

- C1=D1=1; C2=D2=0: all duplicate reqs/repairs suppressed
- 1st node A after the loss point sends req at t + d_SA
- 1st node before the loss point replies at t + d_SA + 2
- R_k repairs at t+k+2+d_SA rather than at t+2d_SA+3k

Optimization for star topology: probabilistic suppression

[Figure: star topology with the source (src) and nodes 1 to 6; X marks the loss]

- for all X, Y: d_XY = 2 = d
- C1=D1=0; C2=D2 >= 1
- all nodes notice the loss at t
- 1st req sent at t+x: suppresses all reqs scheduled in [t+x+d, C2 d]
- #reqs scheduled in [t1, t2] are (G−1)(t2−t1)/(C2 d) /* uniform */
- 1st req scheduled at d C2/(G−1)
- only the reqs scheduled in [d C2/(G−1), d C2/(G−1) + d] are sent
- #sent requests: 1 + (G−2)/C2, since
  (G−2)*[d C2/(G−1) + d − d C2/(G−1)]/(d C2) = (G−2)*(d/(d C2)) = (G−2)/C2
- the higher C2, the lower the # of duplicates, and the higher the repair delay
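
As a rough check of the star-topology estimate, the sketch below puts the closed form 1 + (G−2)/C2 next to a single simulated loss event. The simulation is an assumption-laden toy, not part of SRM: the G−1 receivers draw independent uniform timers in [0, C2 d] (C1 = 0), and a request is counted as sent if its timer expires before the first request has reached the other nodes, i.e. within d of the earliest timer. Function names are hypothetical.

```python
import random


def expected_requests(group_size, c2):
    """Slide estimate of the number of requests sent: 1 + (G-2)/C2."""
    return 1 + (group_size - 2) / c2


def simulate_requests(group_size, c2, d=2.0, rng=None):
    """Count the requests sent for one loss in a star with C1 = 0."""
    rng = rng or random.Random()
    # One timer per receiver, uniform in [0, C2*d].
    timers = sorted(rng.uniform(0.0, c2 * d) for _ in range(group_size - 1))
    first = timers[0]
    # The first request reaches everybody at first + d; timers expiring
    # before that instant are not suppressed.
    return sum(1 for t in timers if t < first + d)
```

Averaging `simulate_requests` over many runs gives an empirical counterpart to the estimate; the two need not coincide exactly, because the slide assumes evenly spaced timers.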

Optimization for tree topology

intermediate between bus and star

[Figure: tree topology rooted at the source S; nodes A and C are labeled]

- tq_A in [t + d C1, t + d C1 + d C2]
- a downstream node B such that d_AB = j receives A's req at t + d C1 + d C2 + j at the latest
- downstream node B detects the loss at t+j; tq_B expires no earlier than t + j + (d+j) C1
- the req of a downstream node B is suppressed when
  t + d C1 + d U[C2] + j <= t + j + (d+j) C1 + (d+j) U[C2]
- req suppressed if d C2/C1 <= j
- the smaller C2/C1, the higher the # of suppressed reqs

Adaptive algorithm

- if the topology is unknown, it is difficult to estimate the optimal C1, C2
- IDEA:
    if high # duplicate reqs then increase timer interval
    if low # duplicate reqs then decrease timer interval /* to increase repair promptness */
- nodes close to both the loss point and the source should have lower C1 and C2 than the other recipients
- dynamic adaptation makes it possible to track both traffic congestion and group membership dynamics
- parameters updated upon request timer expiration or reset

Adaptive algorithm: variables

- request_period = time between two successive tq settings
- ave_req_del = average delay between timer set and reset
- # duplicates estimated via an exponentially weighted average:
    ave_dup_req = (1−α) ave_dup_req + α #_dup_req
- ave_dup_req = average # duplicate reqs between two successive timer settings
- AveDup, AveDelay = upper bounds on the # of duplicates and on the repair delay
- request from A carries d_SA

Adaptive algorithm: pseudocode

update ave_req_del; update ave_dup_req
if (sent request) decrease C1
if (received req from recipients farther from the src than the current node) decrease C2
else if (ave_dup_req > AveDup) increase both C1 and /* above all! */ C2
else if (ave_dup_req < AveDup−ε)
    if (ave_req_del > AveDelay) decrease C2
    if (ave_dup_req < α) decrease C1
else increase C1 /* AveDup−ε <= ave_dup_req <= AveDup */
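
The update rule can be made concrete as below. This is a sketch under several assumptions, not the algorithm as specified by SRM: the fixed step size, the default values, the larger step used for C2 when duplicates are too many, the clamping at zero, and the reading of the if/else nesting are all choices made for illustration, and every name except C1, C2, ave_dup_req and ave_req_del is hypothetical. The threshold in `ave_dup_req < α` reuses the smoothing weight α because the pseudocode uses the same symbol.

```python
from dataclasses import dataclass


@dataclass
class RequestTimerState:
    c1: float = 2.0
    c2: float = 2.0
    ave_dup_req: float = 0.0
    ave_req_del: float = 0.0


def ewma(old, sample, alpha):
    """Exponentially weighted average: (1 - alpha)*old + alpha*sample."""
    return (1 - alpha) * old + alpha * sample


def adapt(state, dup_req, req_delay, sent_request, heard_farther_request,
          ave_dup_bound, ave_delay_bound, alpha=0.25, eps=0.1, step=0.1):
    """Apply one adaptation step when a request timer expires or is reset."""
    state.ave_dup_req = ewma(state.ave_dup_req, dup_req, alpha)
    state.ave_req_del = ewma(state.ave_req_del, req_delay, alpha)
    if sent_request:
        state.c1 -= step
    if heard_farther_request:
        state.c2 -= step
    elif state.ave_dup_req > ave_dup_bound:
        # Too many duplicates: widen the interval, C2 above all.
        state.c1 += step
        state.c2 += 2 * step
    elif state.ave_dup_req < ave_dup_bound - eps:
        if state.ave_req_del > ave_delay_bound:
            state.c2 -= step
        if state.ave_dup_req < alpha:
            state.c1 -= step
    else:
        # AveDup - eps <= ave_dup_req <= AveDup
        state.c1 += step
    # Assumption: timer parameters are never allowed to become negative.
    state.c1 = max(state.c1, 0.0)
    state.c2 = max(state.c2, 0.0)
    return state
```

Choosing the step size is exactly the open problem of how much C1 and C2 should move: large steps react quickly but can make the parameters oscillate.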

Adaptive algorithm: discussion

- problem: how much should C1 and C2 be increased or decreased? /* oscillations */
- experiments show that the adaptive algorithm decreases the # of duplicate repairs w.r.t. the non−adaptive algorithm, but has a more variable repair delay (competitive w.r.t. TCP)
- the choice of AveDup and AveDelay characterizes the tradeoff between duplicate suppression and repair promptness, depending on the application semantics
- problem: how should timers be set in case of multiple failures?

Concluding remarks

- parameter optimization may be a problem
- example usage:
  - BGP: reliable distribution of the routing information; SRM avoids establishing and maintaining O(n^2) connections
  - news distribution, web mirrors: delay insensitive; optimization w.r.t. duplicate suppression
- applications are available that make use of SRM (e.g. whiteboard)