Failure Localization in All-Optical Networks, János Tapolcai



  1. Failure Localization in All-Optical Networks
     János Tapolcai, Budapest University of Technology and Economics

  2. Motivation
     • The goal is to provide fast localization of link failures (cable cuts) in all-optical networks
     • Link monitoring
       • A naive solution is to have an active alarm for each link
       • The number of monitors is then |E|
       • Alarm storms occur due to multi-hop lightpaths and multi-layer networks
     [Figure: map of a network topology with node labels such as STTL, DNVR, CHCG, NYCM, WASH, ATLN, HSTN, MIAM]

  3. How to localize failures?
     • Out-of-band monitoring
       • Uses dedicated supervisory lightpaths (monitoring trails or cycles)
       (+) Simpler and more reliable implementation
       (+) Fast failure localization
       (-) Bandwidth requirements
     • In-band monitoring
       (+) Minimal bandwidth requirements: only operating connections are tapped
       (-) Less precise failure localization
       • Can be combined with out-of-band monitoring
       • Must deal with the imprecision of failure localization

  4. Localize Single Link Failure with Monitoring Cycles
     • The network topology is known
     • The network is at least 2-connected
     • The goal is to localize a single cable cut
     • With a minimal number of monitors: #monitors ≥ ⌈log₂(#links + 1)⌉
     • Cost: a linear combination of the cover length and the number of monitors,
       γ · (#monitors) + (total cover length)

     [Figure: 4-node example network (nodes 0 to 3) covered by the monitoring cycles c0, c1, c2; #monitors = 3, cover length = 9]

     Alarm code table:
       Link   c2  c1  c0
       0-1     0   0   1
       0-2     0   1   0
       0-3     0   1   1
       1-2     1   0   0
       1-3     1   0   1
       2-3     1   1   0
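
The numbers on this slide can be checked with a few lines of Python. The sketch below is only an illustration: the link-to-code mapping is read off the alarm code table, while the helper names (`links_of_monitor`, `node_degrees`, `total_cost`) and the value passed for γ are assumptions made for the example.

```python
import math

# Alarm codes from the slide's table, written as bits c2 c1 c0.
ALARM_CODES = {
    (0, 1): 0b001,
    (0, 2): 0b010,
    (0, 3): 0b011,
    (1, 2): 0b100,
    (1, 3): 0b101,
    (2, 3): 0b110,
}
NUM_MONITORS = 3


def links_of_monitor(bit):
    """Links traversed by monitoring cycle c<bit> (alarm code has a 1 at that bit)."""
    return [link for link, code in ALARM_CODES.items() if (code >> bit) & 1]


def node_degrees(links):
    """Degree of every node touched by the given links."""
    deg = {}
    for u, v in links:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Every link needs a distinct, non-zero code to be localized unambiguously.
codes_are_unique = len(set(ALARM_CODES.values())) == len(ALARM_CODES)
codes_are_nonzero = all(code != 0 for code in ALARM_CODES.values())

# In a simple cycle every visited node has degree 2
# (necessary condition only; connectivity is not checked here).
all_degree_two = all(
    all(d == 2 for d in node_degrees(links_of_monitor(bit)).values())
    for bit in range(NUM_MONITORS)
)

# Cover length: total number of links over all monitoring cycles.
cover_length = sum(len(links_of_monitor(bit)) for bit in range(NUM_MONITORS))

# Lower bound on the number of monitors for single link failures.
lower_bound = math.ceil(math.log2(len(ALARM_CODES) + 1))


def total_cost(gamma):
    """Cost function from the slide: gamma * #monitors + total cover length."""
    return gamma * NUM_MONITORS + cover_length


print(codes_are_unique, codes_are_nonzero, all_degree_two)
print("cover length:", cover_length, "lower bound:", lower_bound)
print("cost for gamma = 5 (illustrative value):", total_cost(5))
```

Each of the three cycles is a triangle in the 4-node network, which is where the cover length of 9 comes from.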

  5. Optical Link Failure Monitoring with Trails (M-Trail)
     • If a node has degree 2, its two neighboring links cannot be distinguished with cycles, because every cycle through the node traverses both links
     • Monitoring trails are used instead of cycles

     [Figure: (a) an m-trail, annotated "no optical loopback switching"; (b) an m-trail solution with trails t0, t1, t2 on a 5-node network (nodes 0 to 4); (c) the alarm code table]

     Alarm code table:
       Link    t2  t1  t0   Decimal
       (0,1)    1   0   1      5
       (0,2)    1   1   1      7
       (0,3)    1   0   0      4
       (1,2)    0   1   1      3
       (1,3)    1   1   0      6
       (2,4)    0   0   1      1
       (3,4)    0   1   0      2
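
Whether a set of links can be walked as a single trail follows from Euler's condition: the links must be connected and at most two nodes may have odd degree. The sketch below applies that test to the alarm code table of this slide. The function names are hypothetical and chosen for the example.

```python
from collections import defaultdict

# Alarm codes from the slide's table, written as bits t2 t1 t0.
M_TRAIL_CODES = {
    (0, 1): 0b101,
    (0, 2): 0b111,
    (0, 3): 0b100,
    (1, 2): 0b011,
    (1, 3): 0b110,
    (2, 4): 0b001,
    (3, 4): 0b010,
}


def links_at_bit(codes, bit):
    """Links whose alarm code has a 1 at the given bit position."""
    return [link for link, code in codes.items() if (code >> bit) & 1]


def is_trail(links):
    """True if the links are connected and have 0 or 2 odd-degree nodes."""
    if not links:
        return False
    adjacency = defaultdict(list)
    for u, v in links:
        adjacency[u].append(v)
        adjacency[v].append(u)

    odd_nodes = [n for n in adjacency if len(adjacency[n]) % 2 == 1]

    # Depth-first search to test that all touched nodes are reachable.
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)

    return len(seen) == len(adjacency) and len(odd_nodes) in (0, 2)


for bit in range(3):
    links = links_at_bit(M_TRAIL_CODES, bit)
    print(f"t{bit}: {links} trail={is_trail(links)}")

# Node 4 has degree 2; its two links still receive different codes.
print(M_TRAIL_CODES[(2, 4)] != M_TRAIL_CODES[(3, 4)])
```

Trail t0 ends at node 4 after using link (2,4) but not (3,4), which is what separates the two links at the degree-2 node.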

  6. Optical Link Failure Monitoring with Bidirectional M-Trails (BM-Trails)
     • N. Harvey, M. Patrascu, Y. Wen, S. Yekhanin, and V. Chan, “Non-Adaptive Fault Diagnosis for All-Optical Networks via Combinatorial Group Testing on Graphs,” in IEEE INFOCOM, 2007, pp. 697–705.
     • A bm-trail is a connected subgraph
     • The Euler constraint is relaxed
     [Figure: a bm-trail, annotated "optical loopback switching"]

  7. Architecture: Summary
     • A supervisory path (SP) is used to probe the status of a group of fibre segments and components
     • Each SP corresponds to a monitor, which raises an alarm when it detects any irregularity
     • By collecting all the flooded alarms in a failure event, the network controller can identify the failed SRLG instantly
     • Objective: achieve fast unambiguous failure localization (UFL) under any shared risk link group (SRLG) failure event
     (RNDM, 2011)
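
The controller-side lookup described on this slide can be sketched as a table lookup. The example below reuses the alarm codes of the 4-node monitoring-cycle example from slide 4 and covers single link failures only. The function `localize` and the representation of alarms as a set of monitor indices are assumptions made for illustration, not part of the presentation.

```python
# Alarm code (bits c2 c1 c0) -> failed link, taken from the slide 4 table.
CODE_TO_LINK = {
    0b001: (0, 1),
    0b010: (0, 2),
    0b011: (0, 3),
    0b100: (1, 2),
    0b101: (1, 3),
    0b110: (2, 3),
}


def localize(alarming_monitors):
    """Map the set of alarming monitor indices to the failed link.

    Monitor i contributes bit i of the alarm code. Returns None when the
    code matches no known single link failure (including the no-alarm case).
    """
    code = 0
    for monitor in alarming_monitors:
        code |= 1 << monitor
    return CODE_TO_LINK.get(code)


print(localize({0, 2}))  # monitors c0 and c2 alarm
print(localize({1}))     # only monitor c1 alarms
print(localize(set()))   # no alarm at all
```

Because the lookup is a single dictionary access, localization takes no further probing once the alarms have been collected.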

  8. UNAMBIGUOUS FAILURE LOCALIZATION

  9. Unambiguous failure localization (UFL) under any shared risk link group (SRLG) failure event
     • Given: an undirected 2-connected graph and one of the following monitoring structures
       1. bm-trail: connected subgraph
       2. m-trail: trail (Euler subgraph)
       3. m-cycle: closed trail
     • SRLG:
       A. Single link
       B. Dense SRLG: dual and triple link failures
       C. Sparse SRLG: some multi-link failures
     • Goal: find a minimum number of m-trails, m-cycles, or bm-trails in the graph such that no two SRLGs are traversed by exactly the same set of m-trails, m-cycles, or bm-trails
       #monitors ≥ ⌈log₂(#SRLGs + 1)⌉
     • Equivalent goal: assign non-zero alarm codes to the links such that each SRLG has a unique alarm code and, in each bit position, the links with bit 1 form an m-trail, m-cycle, or bm-trail
     [Figure: 4-node example graph (nodes 0 to 3) with link alarm codes 001, 010, 011, 100, 101, 110]
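
The uniqueness requirement can be tested mechanically. The sketch below assumes that a monitor alarms as soon as any link on it fails, so the alarm code of an SRLG is taken to be the bitwise OR of its links' codes. That modelling choice, the helper names, and the reuse of the 4-node alarm codes are assumptions for illustration.

```python
import math
from itertools import combinations

# Link alarm codes of the 4-node example (three monitors).
LINK_CODES = {
    (0, 1): 0b001,
    (0, 2): 0b010,
    (0, 3): 0b011,
    (1, 2): 0b100,
    (1, 3): 0b101,
    (2, 3): 0b110,
}


def srlg_code(srlg):
    """Alarm code of an SRLG: OR of the codes of its member links (assumption)."""
    code = 0
    for link in srlg:
        code |= LINK_CODES[link]
    return code


def is_ufl(srlgs):
    """True if every SRLG has a non-zero alarm code shared with no other SRLG."""
    codes = [srlg_code(s) for s in srlgs]
    return 0 not in codes and len(set(codes)) == len(codes)


def monitor_lower_bound(num_srlgs):
    """Lower bound from the slide: ceil(log2(#SRLGs + 1))."""
    return math.ceil(math.log2(num_srlgs + 1))


single = [(link,) for link in LINK_CODES]
dual = list(combinations(LINK_CODES, 2))

print("single failures UFL:", is_ufl(single))
print("single + dual failures UFL:", is_ufl(single + dual))
print("bound, single:", monitor_lower_bound(len(single)))
print("bound, single + dual:", monitor_lower_bound(len(single) + len(dual)))
```

With three monitors there are only seven non-zero codes, so the 21 single and dual failure events cannot all be told apart; the bound shows that at least five monitors would be needed for that SRLG set.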

  10. Ring networks: single failure
     • The number of bm-trails is ⌈#links/2⌉
     • To distinguish the failure of link e from that of link f, we need a bm-trail terminating in node n
     • Each bm-trail can terminate in at most two nodes, thus 2 · #bm-trails ≥ #nodes
     [Figure: ring network with adjacent links e and f meeting at node n]
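
To see how the ring bound compares with the general logarithmic bound, the short sketch below evaluates both for a few ring sizes. It treats both expressions purely as lower bounds and uses the fact that a ring has as many links as nodes. The function names are made up for the example.

```python
import math


def termination_bound(num_nodes):
    """From 2 * #bm-trails >= #nodes: at least ceil(#nodes / 2) bm-trails."""
    return (num_nodes + 1) // 2


def code_bound(num_links):
    """Unique non-zero codes for all links: ceil(log2(#links + 1)) monitors."""
    return math.ceil(math.log2(num_links + 1))


def ring_lower_bound(n):
    """Larger of the two lower bounds for a ring with n nodes and n links."""
    return max(termination_bound(n), code_bound(n))


for n in (4, 5, 10, 20, 40):
    print(n, termination_bound(n), code_bound(n), ring_lower_bound(n))
```

For all but very small rings the termination argument dominates, so the number of monitors grows linearly with the ring size rather than logarithmically.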

  11. More bounds for single link failure
     • J. Tapolcai, B. Wu, and P.-H. Ho, “On Monitoring and Failure Localization in Mesh All-Optical Networks,” in IEEE INFOCOM ’09
     • J. Tapolcai, B. Wu, P.-H. Ho, and L. Rónyai, “A Novel Approach for Failure Localization in All-Optical Mesh Networks,” in IEEE/ACM Transactions on Networking, Feb. 2011
     • Ring topology: #m-trails = ⌈#links/2⌉
     • Well-connected topologies (e.g. complete graphs):
       • Decompose the graph into disjoint spanning trees and code them separately
       • Nash-Williams and Tutte: a 2k-connected graph has k disjoint spanning trees
       • #bm-trails = ⌈log₂(#links + 1)⌉ = k

  12. Highly connected graphs
     • The graph is 2⌈log(#links + 1)⌉-connected (m-tree)
     • Number of monitors = ⌈log(#links + 1)⌉
     • Theorem of Nash-Williams and Tutte: every 2k-edge-connected graph contains k disjoint spanning trees

  13. Highly connected graphs
     • b = ⌈log(#links + 1)⌉ independent spanning trees
     • In the codes assigned to the i-th spanning tree, bit i is 1
     • Then the links belonging to bit i are guaranteed to be connected (in fact they span the whole graph)
     • The binary codes of length b are grouped into b buckets, and each bucket receives at least … and at most … codes
     • Induction: recursive construction
       • b = 1, 2 works
       • Assume we have a solution for b
     [Figure: buckets 1, 2, …, b]

  14. Highly connected graphs
     • Append a 0 bit to the end of each code: these are (b+1)-bit codes
     • Append a 1 bit to the end of each code: these are the remaining (b+1)-bit codes
     • Move a suitable number of codes from the second group into the last bucket
     • For b ≥ 3 this holds because of the independent spanning trees
     • For complete graphs it holds if V ≥ 18
     [Figure: buckets 1, 2, …, b, b+1]

  15. Analyzing Different Network Topologies
     • 5320 randomly generated network topologies
       • with 20, 30, 40, 50, and 60 nodes
       • built from ring networks by randomly adding chords
     • 30 random graph series
       • in order to obtain 95% confidence intervals

  16. Simulation Results
     • The m-trails are calculated with heuristics
     [Figure: simulation results]

  17. The Concept of Monitoring Trails
     • Bin Wu, P.-H. Ho, and K. Yeung, “Monitoring trail: a new paradigm for fast link failure localization in WDM mesh networks,” in IEEE GLOBECOM ’08
     • The problem has been formulated as an Integer Linear Program (ILP)
     • ILP running time = 9573.47 sec (about 2 h 40 min)
     • Gap to optimality = 20.41%
     • Total cost = 98, where γ = 5 and #monitors = 11
     [Figure: the 11 m-trails t0 to t10 of the ILP solution, each drawn on the same network with nodes numbered 0 to 20]

  18. The Heuristic Algorithm I.
     • Constraint 1: every link must have a unique alarm code, i.e. Unambiguous Failure Localization (UFL)
     • Constraint 2: the "1" bits at each bit position must form a trail
     • S. Ahuja, S. Ramasubramanian, and M. Krunz, “SRLG Failure Localization in All-Optical Networks Using Monitoring Cycles and Paths,” in IEEE INFOCOM ’08
       • Their heuristic, Cycle Accumulation (CA), proposes a structure in which constraint 2 is ensured and the goal is to fulfill constraint 1
     • Our concept is to provide a structure in which constraint 1 is ensured and the goal is to fulfill constraint 2
       • Much faster for minimizing the number of m-trails

  19. The Heuristic Algorithm II.
     • Randomly generate a unique alarm code for each link
     • For each bit position we treat the m-trail shaping problem separately
     • We start with the smallest bit position and mark the links that have bit "1" at that position
       • The goal is to shape the marked links into a trail
     • Each link has a pair for bit position i
       • The binary alarm codes of the two links are the same except at bit position i
       • One of the links is marked, the other is not
       • If we swap the alarm codes assigned to these links, nothing changes at the other bit positions
     • Some links might not have a pair
       • Because the alarm code 0000 cannot be chosen (this applies to 1 link only)
       • Or because its code pair was not assigned to any link (don't-care links)
     [Figure: example network with 4-bit alarm codes such as 0111, 0001, 0010, 1110; one link is annotated "This link has no pair."]
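
The notion of a code pair maps directly onto a bit flip: the pair of a link at bit position i is the link holding the same code with bit i inverted. The sketch below illustrates that idea and the effect of swapping. It is not the authors' implementation; the link names and the assignment of the four codes seen in the figure to particular links are hypothetical.

```python
import random


def random_unique_codes(links, num_bits, seed=None):
    """Assign distinct non-zero alarm codes to the links at random."""
    rng = random.Random(seed)
    codes = rng.sample(range(1, 2 ** num_bits), len(links))
    return dict(zip(links, codes))


def pair_of(codes, link, bit):
    """Link whose code differs from `link`'s code only at `bit`, or None.

    None covers both cases from the slide: the pair code would be 0000,
    or the pair code is not assigned to any link.
    """
    pair_code = codes[link] ^ (1 << bit)
    if pair_code == 0:
        return None
    for other, code in codes.items():
        if code == pair_code:
            return other
    return None


def marked(codes, bit):
    """Links that have bit "1" at the given position."""
    return {link for link, code in codes.items() if (code >> bit) & 1}


def swap_codes(codes, a, b):
    """Exchange the alarm codes of two links in place."""
    codes[a], codes[b] = codes[b], codes[a]


# Hypothetical assignment of 4-bit codes to links named by their end nodes.
codes = {
    (0, 1): 0b0111,
    (0, 2): 0b0001,
    (1, 2): 0b0010,
    (2, 3): 0b1110,
    (0, 3): 0b0110,
}

print(pair_of(codes, (0, 1), 0))  # 0111 pairs with 0110
print(pair_of(codes, (0, 2), 0))  # pair code would be 0000
print(pair_of(codes, (1, 2), 0))  # pair code 0011 is unassigned

before = [marked(codes, bit) for bit in range(4)]
swap_codes(codes, (0, 1), (0, 3))
after = [marked(codes, bit) for bit in range(4)]
print(before[0], after[0])        # only bit position 0 changes
print(before[1:] == after[1:])
```

Since a swap between paired links alters the marked set at one bit position only, each bit position can be reshaped without disturbing positions that were handled earlier.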

  20. The Heuristic Algorithm III.
     • Greedy code swapping:
       • Based on Euler's theorem, we try to shape the marked links into a trail
       • The number of nodes with odd degree must be reduced to 2
       • The edge set must be connected
     • We repeat this for every bit position
       • Until trails are shaped for every bit
     • If we get stuck, we generate new random codes
     [Figure: example network with 4-bit alarm codes such as 0111, 0001, 0010, 1110; one link is annotated "This link has no pair."]

  21. The performance of the heuristic compared to ILP

  22. A Rule of Thumb for Topology Analysis
     • The theoretical minimum ⌈log₂(#links + 1)⌉ was almost always achieved if the network had no nodes with degree 2
     • The number of nodes with degree 2 strongly influences the number of m-trails
     [Figure: example network with nodes numbered 0 to 20]
