
Network Layer (Routing) Recap: Why do we need a Network layer?

Why do we need a Network layer?
• Internetworking: need to connect different link layer networks
• Addressing: need a globally unique way to address hosts
• Routing and forwarding: now need to


  1. Distance Vector Setting
     Each node computes its forwarding table in a distributed setting:
     1. Nodes know only the cost to their neighbors, not the topology
     2. Nodes can talk only to their neighbors, using messages
     3. All nodes run the same algorithm concurrently
     4. Nodes and links may fail; messages may be lost
     CSE 461 University of Washington

  2. Distance Vector Algorithm
     Each node maintains a vector of (distance, next hop) to all destinations:
     1. Initialize vector with 0 (zero) cost to self, ∞ (infinity) to other destinations
     2. Periodically send vector to neighbors
     3. Update vector for each destination by selecting the shortest distance heard, after adding cost of neighbor link
     4. Use the best neighbor for forwarding
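
Step 3 of the algorithm can be sketched as a single update function. This is a minimal illustration rather than the course's implementation: the name `dv_update` and the dictionary layout are assumptions, and the sample vectors are the ones A hears in the second exchange of the worked example that follows (B is 4 away from A, E is 10 away).

```python
import math

INF = math.inf

def dv_update(me, link_cost, heard):
    """Recompute one node's vector from its neighbors' latest vectors.

    link_cost: {neighbor: cost of the direct link to that neighbor}
    heard:     {neighbor: {destination: distance the neighbor advertised}}
    Returns    {destination: (distance, next_hop)}.
    """
    table = {me: (0, None)}  # zero cost to self
    dests = {d for vec in heard.values() for d in vec} - {me}
    for dest in dests:
        best = (INF, None)  # infinity until some neighbor offers a route
        for nbr, vec in heard.items():
            cand = link_cost[nbr] + vec.get(dest, INF)  # add cost of neighbor link
            if cand < best[0]:
                best = (cand, nbr)
        table[dest] = best
    return table

# Vectors A hears in the second exchange of the example (hypothetical encoding).
b_says = {"A": 4, "B": 0, "C": 2, "E": 4, "F": 3, "G": 3}
e_says = {"A": 10, "B": 4, "C": 1, "D": 2, "E": 0, "F": 2}
table_a = dv_update("A", {"B": 4, "E": 10}, {"B": b_says, "E": e_says})
```

Destinations that no neighbor has advertised yet (H, in this exchange) simply do not appear in the result, which plays the same role as an ∞ entry.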

  3. Distance Vector (2)
     • Consider from the point of view of node A
     • Can only talk to nodes B and E
     Initial vector at A:
       To   Cost
       A    0
       B    ∞
       C    ∞
       D    ∞
       E    ∞
       F    ∞
       G    ∞
       H    ∞
     [Figure: example topology, nodes A to H with link costs]

  4. Distance Vector (3)
     • First exchange with B, E; learn best 1-hop routes
       To   B says   E says   B +4   E +10   A’s cost   A’s next
       A    ∞        ∞        ∞      ∞       0          --
       B    0        ∞        4      ∞       4          B
       C    ∞        ∞        ∞      ∞       ∞          --
       D    ∞        ∞        ∞      ∞       ∞          --
       E    ∞        0        ∞      10      10         E
       F    ∞        ∞        ∞      ∞       ∞          --
       G    ∞        ∞        ∞      ∞       ∞          --
       H    ∞        ∞        ∞      ∞       ∞          --
     Rows B and E: learned better route
     [Figure: example topology, nodes A to H with link costs]

  5. Distance Vector (4)
     • Second exchange; learn best 2-hop routes
       To   B says   E says   B +4   E +10   A’s cost   A’s next
       A    4        10       8      20      0          --
       B    0        4        4      14      4          B
       C    2        1        6      11      6          B
       D    ∞        2        ∞      12      12         E
       E    4        0        8      10      8          B
       F    3        2        7      12      7          B
       G    3        ∞        7      ∞       7          B
       H    ∞        ∞        ∞      ∞       ∞          --
     [Figure: example topology, nodes A to H with link costs]

  6. Distance Vector (4)
     • Third exchange; learn best 3-hop routes
       To   B says   E says   B +4   E +10   A’s cost   A’s next
       A    4        8        8      18      0          --
       B    0        3        4      13      4          B
       C    2        1        6      11      6          B
       D    4        2        8      12      8          B
       E    3        0        7      10      7          B
       F    3        2        7      12      7          B
       G    3        6        7      16      7          B
       H    5        4        9      14      9          B
     [Figure: example topology, nodes A to H with link costs]

  7. Distance Vector (5)
     • Subsequent exchanges; converged
       To   B says   E says   B +4   E +10   A’s cost   A’s next
       A    4        7        8      17      0          --
       B    0        3        4      13      4          B
       C    2        1        6      11      6          B
       D    4        2        8      12      8          B
       E    3        0        7      10      7          B
       F    3        2        7      12      7          B
       G    3        6        7      16      7          B
       H    5        4        9      14      9          B
     [Figure: example topology, nodes A to H with link costs]
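
The exchanges above can be replayed with a small synchronous simulation. This is a sketch under stated assumptions: the link costs in `LINKS` are my reading of the example figure (they agree with the "B says" and "E says" columns), every node sends its vector once per round, and no messages are lost. The names `LINKS`, `neighbors`, and `run_dv` are illustrative.

```python
import math

INF = math.inf

# Link costs as read from the example figure (an assumption of this sketch).
LINKS = {("A", "B"): 4, ("A", "E"): 10, ("B", "C"): 2, ("B", "E"): 4,
         ("B", "F"): 3, ("B", "G"): 3, ("C", "D"): 2, ("C", "E"): 1,
         ("C", "H"): 3, ("D", "E"): 2, ("E", "F"): 2, ("F", "G"): 4}

def neighbors(links):
    """Build {node: {neighbor: cost}} from undirected link costs."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

def run_dv(links):
    """Run synchronous distance-vector exchanges until no vector changes."""
    adj = neighbors(links)
    nodes = sorted(adj)
    dist = {n: {d: (0 if d == n else INF) for d in nodes} for n in nodes}
    nxt = {n: {d: None for d in nodes} for n in nodes}
    while True:
        # Every node sends the vector it held at the start of the round.
        sent = {n: dict(dist[n]) for n in nodes}
        changed = False
        for n in nodes:
            for d in nodes:
                if d == n:
                    continue
                for nbr, cost in adj[n].items():
                    cand = cost + sent[nbr][d]
                    if cand < dist[n][d]:
                        dist[n][d] = cand
                        nxt[n][d] = nbr
                        changed = True
        if not changed:
            return dist, nxt

dist, nxt = run_dv(LINKS)
```

Because the topology is static, keeping only improvements gives the same converged vectors as recomputing from scratch each round; handling failures would need the full recomputation.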

  8. Distance Vector Dynamics
     • Adding routes:
       • News travels one hop per exchange
     • Removing routes:
       • When a node fails, no more exchanges, other nodes forget
     Problem?

  9. Count to Infinity: Problem
     • Good news travels quickly, bad news slowly (inferred)
     [Figure: after a failure (X), desired convergence vs. the “count to infinity” scenario]

  10. Count to Infinity: Heuristics
      • “Split horizon”
        • Don’t send route back to where you learned it from.
      • Poison reverse
        • Send “infinity” when you notice a disconnect
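
A toy simulation makes the difference visible. This sketch assumes a three-node line A - B - C with unit link costs in which the A-B link has just failed, tracks only routes to A, and uses 16 as "infinity" (the RIP convention). Split horizon is modeled here as advertising infinity back to the neighbor the route was learned from, which RFC 1058 calls split horizon with poisoned reverse. All names are illustrative.

```python
INFINITY = 16  # RIP-style "infinity"; an assumption for this sketch

def learn(advertised, via):
    """Adopt a neighbor's advertised distance plus the unit link cost."""
    cost = advertised + 1
    return (cost, via) if cost < INFINITY else (INFINITY, None)

def count_rounds(split_horizon):
    """Exchanges needed until both B and C consider A unreachable."""
    b = (INFINITY, None)  # B noticed the failure of its direct link to A
    c = (2, "B")          # C still believes A is 2 hops away via B
    rounds = 0
    while b[0] < INFINITY or c[0] < INFINITY:
        # What each node tells the other about destination A.
        adv_b = INFINITY if (split_horizon and b[1] == "C") else b[0]
        adv_c = INFINITY if (split_horizon and c[1] == "B") else c[0]
        # After the failure, each node's only useful neighbor is the other.
        b = learn(adv_c, "C")
        c = learn(adv_b, "B")
        rounds += 1
    return rounds

without_heuristic = count_rounds(split_horizon=False)
with_heuristic = count_rounds(split_horizon=True)
```

In this two-node case the heuristic stops the loop after one exchange, while the plain protocol needs 14 exchanges to count up to 16. Loops that involve three or more routers can still count to infinity with the heuristic in place.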

  11. Count to Infinity: Heuristics (2)
      • Neither split horizon nor poison reverse is very effective in practice
      • Link state is now favored except when resource-limited

  12. RIP (Routing Information Protocol)
      • DV protocol with hop count as metric
        • Infinity is 16 hops; limits network size
        • Includes split horizon, poison reverse
      • Routers send vectors every 30 seconds
        • Runs on top of UDP
        • Time-out in 180 secs to detect failures
      • RIPv1 specified in RFC 1058 (1988)

  13. Link-State Routing

  14. Link-State Routing
      • Second broad class of routing algorithms
        • More computation than DV but better dynamics
      • Widely used in practice
        • Used in Internet/ARPANET from 1979
        • Modern networks use OSPF (L3) and IS-IS (L2)

  15. Link-State Setting
      Same distributed setting as for distance vector:
      1. Nodes know only the cost to their neighbors, not the topology
      2. Nodes can talk only to their neighbors, using messages
      3. All nodes run the same algorithm concurrently
      4. Nodes/links may fail; messages may be lost

  16. Link-State Algorithm
      Proceeds in two phases:
      1. Nodes flood topology with link state packets
         • Each node learns full topology
      2. Each node computes its own forwarding table
         • By running Dijkstra (or equivalent)

  17. Part 1: Flood Routing

  18. Flooding
      • Rule used at each node:
        • Send an incoming message on to all other neighbors
        • Remember the message so that it is only flooded once

  19. Flooding (2)
      • Consider a flood from A; first reaches B via AB, E via AE
      [Figure: example topology, nodes A to H]

  20. Flooding (3)
      • Next B floods BC, BE, BF, BG, and E floods EB, EC, ED, EF
      • E and B send to each other
      [Figure: example topology, nodes A to H]

  21. Flooding (4)
      • C floods CD, CH; D floods DC; F floods FG; G floods GF
      • F gets another copy
      [Figure: example topology, nodes A to H]

  22. Flooding (5)
      • H has no-one to flood … and we’re done
      • Each link carries the message, and in at least one direction
      [Figure: example topology, nodes A to H]

  23. Flooding Details
      • Remember message (to stop flood) using source and sequence number
        • So next message (with higher sequence) will go through
      • To make flooding reliable, use ARQ
        • So receiver acknowledges, and sender resends if needed
      Problem?
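
The flooding rule with (source, sequence number) memory can be sketched as follows. Assumptions: the adjacency list `ADJ` is my reading of the example figure, messages are delivered in FIFO order, and ARQ (acknowledgements and retransmission) is not modeled. The function name `flood` is illustrative.

```python
from collections import deque

# Neighbor lists as read from the example figure (an assumption).
ADJ = {"A": ["B", "E"], "B": ["A", "C", "E", "F", "G"],
       "C": ["B", "D", "E", "H"], "D": ["C", "E"],
       "E": ["A", "B", "C", "D", "F"], "F": ["B", "E", "G"],
       "G": ["B", "F"], "H": ["C"]}

def flood(adj, source, seq):
    """Flood one message; return per-node memory and the list of transmissions."""
    key = (source, seq)
    seen = {n: set() for n in adj}  # each node remembers (source, seq) pairs
    seen[source].add(key)
    queue = deque((source, nbr) for nbr in adj[source])  # (sender, receiver)
    sent = []
    while queue:
        sender, node = queue.popleft()
        sent.append((sender, node))
        if key in seen[node]:
            continue  # duplicate: already flooded once, so drop it
        seen[node].add(key)
        # Send the incoming message on to all other neighbors.
        queue.extend((node, nbr) for nbr in adj[node] if nbr != sender)
    return seen, sent

seen, sent = flood(ADJ, "A", seq=1)
copies_at_f = sum(1 for _, receiver in sent if receiver == "F")
```

With this delivery order F receives three copies (from B, E, and G), which is the redundancy the next slide points out. A later message from A would carry a higher sequence number, so it would not match the remembered pair and would flood again.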

  24. Flooding Problem
      • F receives the same message multiple times
      • E and B send to each other too
      [Figure: example topology, nodes A to H]

  25. Part 2: Dijkstra’s Algorithm

  26. Edsger W. Dijkstra (1930-2002)
      • Famous computer scientist
        • Programming languages
        • Distributed algorithms
        • Program verification
      • Dijkstra’s algorithm, 1959
        • Single-source shortest paths, given network with non-negative link costs
      [Photo: by Hamilton Richards, CC-BY-SA-3.0, via Wikimedia Commons]

  27. Dijkstra’s Algorithm
      Algorithm:
      • Mark all nodes tentative, set distances from source to 0 (zero) for source, and ∞ (infinity) for all other nodes
      • While tentative nodes remain:
        • Extract N, a node with lowest distance
        • Add link to N to the shortest path tree
        • Relax the distances of neighbors of N by lowering any better distance estimates
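
A compact version of these steps, using a binary heap for the "extract N" step, might look like the sketch below. The link costs in `LINKS` are my reading of the example figure and the function names are illustrative; stale heap entries are skipped instead of being decreased in place, which is a common simplification with `heapq`.

```python
import heapq
import math

# Link costs as read from the example figure (an assumption of this sketch).
LINKS = {("A", "B"): 4, ("A", "E"): 10, ("B", "C"): 2, ("B", "E"): 4,
         ("B", "F"): 3, ("B", "G"): 3, ("C", "D"): 2, ("C", "E"): 1,
         ("C", "H"): 3, ("D", "E"): 2, ("E", "F"): 2, ("F", "G"): 4}

def adjacency(links):
    """Build {node: {neighbor: cost}} from undirected link costs."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

def dijkstra(adj, source):
    """Return (distance, parent-in-shortest-path-tree) maps for source."""
    dist = {n: math.inf for n in adj}
    dist[source] = 0
    parent = {source: None}
    tentative = set(adj)  # all nodes start out tentative
    heap = [(0, source)]
    while heap:
        d, n = heapq.heappop(heap)  # extract a node with lowest distance
        if n not in tentative:
            continue  # stale entry for a node that is already permanent
        tentative.remove(n)
        for nbr, cost in adj[n].items():
            if nbr in tentative and d + cost < dist[nbr]:
                dist[nbr] = d + cost  # relax: a better estimate was found
                parent[nbr] = n       # link (n, nbr) joins the tree for now
                heapq.heappush(heap, (dist[nbr], nbr))
    return dist, parent

dist, parent = dijkstra(adjacency(LINKS), "A")
```

Run from A, this reproduces the final labels of the walkthrough that follows: B 4, C 6, E 7, F 7, G 7, D 8, H 9.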

  28. Dijkstra’s Algorithm (2)
      • Initialization
      • We’ll compute shortest paths from A
      Tentative distances: A 0; B, C, D, E, F, G, H all ∞
      [Figure: example topology, nodes A to H with link costs]

  29. Dijkstra’s Algorithm (3)
      • Relax around A
      Distances: A 0, B 4, E 10; C, D, F, G, H still ∞

  30. Dijkstra’s Algorithm (4)
      • Relax around B
      Distances: A 0, B 4, C 6, E 8, F 7, G 7; D, H still ∞
      Distance to E fell!

  31. Dijkstra’s Algorithm (5)
      • Relax around C
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9
      Distance to E fell again!

  32. Dijkstra’s Algorithm (6)
      • Relax around G (say)
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9
      Didn’t fall …

  33. Dijkstra’s Algorithm (7)
      • Relax around F (say)
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9
      Relax has no effect

  34. Dijkstra’s Algorithm (8)
      • Relax around E
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9

  35. Dijkstra’s Algorithm (9)
      • Relax around D
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9

  36. Dijkstra’s Algorithm (10)
      • Finally, H … done
      Distances: A 0, B 4, C 6, D 8, E 7, F 7, G 7, H 9

  37. Dijkstra Comments
      • Finds shortest paths in order of increasing distance from source
        • Leverages optimality property
      • Runtime depends on cost of extracting min-cost node
        • Superlinear in network size (grows fast)
        • Using Fibonacci heaps, the complexity is O(|E| + |V| log |V|)
      • Gives complete source/sink tree
        • More than needed for forwarding!
        • But requires complete topology

  38. Bringing it all together…

  39. Phase 1: Topology Dissemination
      • Each node floods a link state packet (LSP) that describes its portion of the topology
      Node E’s LSP, flooded to A, B, C, D, and F:
        Seq. #
        A    10
        B    4
        C    1
        D    2
        F    2
      [Figure: example topology, nodes A to H with link costs]
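
One way to picture what a node keeps is a small link-state database keyed by originating node, holding only the newest LSP for each. This is a sketch under assumptions: the `LSP` and `LinkStateDB` names and fields are hypothetical, aging is ignored, and the sample contents are node E's LSP from the slide above.

```python
from dataclasses import dataclass, field

@dataclass
class LSP:
    origin: str                                # node that created the packet
    seq: int                                   # sequence number
    links: dict = field(default_factory=dict)  # neighbor -> link cost

class LinkStateDB:
    def __init__(self):
        self.lsps = {}  # origin -> newest LSP seen so far

    def receive(self, lsp):
        """Store lsp if it is newer; return True if it should be flooded on."""
        held = self.lsps.get(lsp.origin)
        if held is not None and held.seq >= lsp.seq:
            return False  # old or duplicate: do not flood again
        self.lsps[lsp.origin] = lsp
        return True

    def topology(self):
        """Combine all stored LSPs into {node: {neighbor: cost}}."""
        return {origin: dict(lsp.links) for origin, lsp in self.lsps.items()}

db = LinkStateDB()
e_lsp = LSP("E", seq=1, links={"A": 10, "B": 4, "C": 1, "D": 2, "F": 2})
first = db.receive(e_lsp)    # new information
repeat = db.receive(e_lsp)   # same sequence number again
```

Once every node's LSP has arrived, `topology()` holds the complete graph that the route computation phase needs.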

  40. Phase 2: Route Computation
      • Each node has full topology
        • By combining all LSPs
      • Each node simply runs Dijkstra
        • Replicated computation, but finds required routes directly
        • Compile forwarding table from sink/source tree
        • That’s it folks!

  41. Forwarding Table
      E’s forwarding table, compiled from the source tree for E (from Dijkstra):
        To   Next
        A    C
        B    C
        C    C
        D    D
        E    --
        F    F
        G    F
        H    C
      [Figure: source tree for E on the example topology]
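
Compiling the table amounts to remembering, for every destination, the first hop on its shortest path from E. A hedged sketch follows; the link costs are my reading of the example figure and `forwarding_table` is an illustrative name.

```python
import heapq
import math

# Link costs as read from the example figure (an assumption of this sketch).
LINKS = {("A", "B"): 4, ("A", "E"): 10, ("B", "C"): 2, ("B", "E"): 4,
         ("B", "F"): 3, ("B", "G"): 3, ("C", "D"): 2, ("C", "E"): 1,
         ("C", "H"): 3, ("D", "E"): 2, ("E", "F"): 2, ("F", "G"): 4}

def adjacency(links):
    """Build {node: {neighbor: cost}} from undirected link costs."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

def forwarding_table(adj, source):
    """Dijkstra that records the first hop out of source for each destination."""
    dist = {n: math.inf for n in adj}
    dist[source] = 0
    first_hop = {}
    done = set()
    heap = [(0, source)]
    while heap:
        d, n = heapq.heappop(heap)
        if n in done:
            continue  # stale heap entry
        done.add(n)
        for nbr, cost in adj[n].items():
            if d + cost < dist[nbr]:
                dist[nbr] = d + cost
                # A neighbor of the source is its own first hop; otherwise
                # inherit the first hop of the node we came through.
                first_hop[nbr] = nbr if n == source else first_hop[n]
                heapq.heappush(heap, (dist[nbr], nbr))
    return first_hop

table_e = forwarding_table(adjacency(LINKS), "E")
```

With these costs G is reachable from E at cost 6 either through F or through C and B, so the entry for G depends on tie-breaking; this sketch keeps the first route found, which is F, as in the slide.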

  42. Handling Changes
      • On change, flood updated LSPs, re-compute routes
      • E.g., nodes adjacent to failed link or node initiate
      Failure of link B-F:
        B’s LSP (Seq. #): A 4, C 2, E 4, F ∞ (was 3), G 3
        F’s LSP (Seq. #): B ∞ (was 3), E 2, G 4
      [Figure: example topology with the B-F link failed]

  43. Handling Changes (2)
      • Link failure
        • Both nodes notice, send updated LSPs
        • Link is removed from topology
      • Node failure
        • All neighbors notice a link has failed
        • Failed node can’t update its own LSP
        • But it is OK: all links to node removed
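
The node-failure case can be illustrated by building the topology only from links that both endpoints report. This two-way rule is an assumption of the sketch (OSPF applies a similar check during its shortest-path calculation); the slide only states that all links to the failed node end up removed. Link costs are my reading of the example figure, and the helper names are illustrative.

```python
# Link costs as read from the example figure (an assumption of this sketch).
LINKS = {("A", "B"): 4, ("A", "E"): 10, ("B", "C"): 2, ("B", "E"): 4,
         ("B", "F"): 3, ("B", "G"): 3, ("C", "D"): 2, ("C", "E"): 1,
         ("C", "H"): 3, ("D", "E"): 2, ("E", "F"): 2, ("F", "G"): 4}

def initial_lsps(links):
    """One LSP body per node: {origin: {neighbor: cost}}."""
    lsps = {}
    for (u, v), cost in links.items():
        lsps.setdefault(u, {})[v] = cost
        lsps.setdefault(v, {})[u] = cost
    return lsps

def two_way_topology(lsps):
    """Keep a link only if the LSPs of both endpoints list it."""
    adj = {n: {} for n in lsps}
    for u, links in lsps.items():
        for v, cost in links.items():
            if u in lsps.get(v, {}):
                adj[u][v] = cost
    return adj

lsps = initial_lsps(LINKS)
# Node E fails: every neighbor floods a new LSP without E, while E's own
# last LSP stays in the database unchanged because E cannot update it.
for nbr in list(lsps["E"]):
    lsps[nbr] = {n: c for n, c in lsps[nbr].items() if n != "E"}
after = two_way_topology(lsps)
```

E's stale LSP still lists five neighbors, but none of them lists E any more, so E is isolated in the computed topology.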

  44. Handling Changes (3)
      • Addition of a link or node
        • Add LSP of new node to topology
        • Old LSPs are updated with new link
      • Additions are the easy case …

  45. Link-State Complications
      • Things that can go wrong:
        • Seq. number reaches max, or is corrupted
        • Node crashes and loses seq. number
        • Network partitions then heals
      • Strategy:
        • Include age on LSPs and forget old information that is not refreshed
      • Much of the complexity is due to handling corner cases

  46. DV/LS Comparison
        Goal               Distance Vector                Link-State
        Correctness        Distributed Bellman-Ford       Replicated Dijkstra
        Efficient paths    Approx. with shortest paths    Approx. with shortest paths
        Fair paths         Approx. with shortest paths    Approx. with shortest paths
        Fast convergence   Slow (many exchanges)          Fast (flood and compute)
        Scalability        Excellent (storage/compute)    Moderate (storage/compute)

  47. IS-IS and OSPF Protocols
      • Widely used in large enterprise and ISP networks
        • IS-IS = Intermediate System to Intermediate System
        • OSPF = Open Shortest Path First
      • Link-state protocol with many added features
        • E.g., “Areas” for scalability

  48. Equal-Cost Multi-Path Routing

  49. Multipath Routing
      • Allow multiple routing paths from node to destination to be used at once
        • Topology has them for redundancy
        • Using them can improve performance
      • Questions:
        • How do we find multiple paths?
        • How do we send traffic along them?

  50. Equal-Cost Multipath Routes
      • One form of multipath routing
      • Extends shortest path model by keeping a set of paths if there are ties
      • Consider A → E
        • ABE = 4 + 4 = 8
        • ABCE = 4 + 2 + 2 = 8
        • ABCDE = 4 + 2 + 1 + 1 = 8
        • Use them all!
      [Figure: example topology with link costs changed so that these three paths tie]

  51. Source “Trees”
      • With ECMP, source/sink “tree” is a directed acyclic graph (DAG)
        • Each node has set of next hops
        • Still a compact representation
      [Figure: tree vs. DAG]

  52. Source “Trees” (2)
      • Find the source “tree” for E
        • Procedure is Dijkstra, simply remember set of next hops
        • Compile forwarding table similarly, may have set of next hops
      • Straightforward to extend DV too
        • Just remember set of neighbors
      [Figure: ECMP example topology, nodes A to H with link costs]
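
"Simply remember set of next hops" can be sketched by merging first-hop sets whenever a relaxation ties. Assumptions: the link costs in `ECMP_LINKS` are my reading of the ECMP figure (B-C 2, C-E 2, C-D 1, D-E 1, the rest as before), all costs are positive integers, and `ecmp_next_hops` is an illustrative name.

```python
import heapq
import math

# Link costs as read from the ECMP example figure (an assumption).
ECMP_LINKS = {("A", "B"): 4, ("A", "E"): 10, ("B", "C"): 2, ("B", "E"): 4,
              ("B", "F"): 3, ("B", "G"): 3, ("C", "D"): 1, ("C", "E"): 2,
              ("C", "H"): 3, ("D", "E"): 1, ("E", "F"): 2, ("F", "G"): 4}

def adjacency(links):
    """Build {node: {neighbor: cost}} from undirected link costs."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, {})[v] = cost
        adj.setdefault(v, {})[u] = cost
    return adj

def ecmp_next_hops(adj, source):
    """Dijkstra that keeps every first hop lying on some shortest path."""
    dist = {n: math.inf for n in adj}
    dist[source] = 0
    hops = {n: set() for n in adj}
    done = set()
    heap = [(0, source)]
    while heap:
        d, n = heapq.heappop(heap)
        if n in done:
            continue  # stale heap entry
        done.add(n)
        for nbr, cost in adj[n].items():
            via = {nbr} if n == source else hops[n]
            if d + cost < dist[nbr]:
                dist[nbr] = d + cost
                hops[nbr] = set(via)       # strictly better: replace the set
                heapq.heappush(heap, (dist[nbr], nbr))
            elif d + cost == dist[nbr]:
                hops[nbr] |= via           # tie: remember these hops as well
    return hops

hops_e = ecmp_next_hops(adjacency(ECMP_LINKS), "E")
```

With positive link costs a node's hop set is complete by the time it is extracted, because every predecessor on a shortest path is strictly closer to the source and was extracted earlier.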

  53. Source “Trees” (3)
      E’s forwarding table, compiled from the source tree for E (entries with several next hops are new for ECMP):
        Node   Next hops
        A      B, C, D
        B      B, C, D
        C      C, D
        D      D
        E      --
        F      F
        G      F
        H      C, D
      [Figure: source “tree” for E on the ECMP example topology]

  54. Forwarding with ECMP
      • Could randomly pick a next hop for each packet based on destination
        • Balances load, but adds jitter
      • Instead, try to send packets from a given source/destination pair on the same path
        • Source/destination pair is called a flow
        • Map flow identifier to single next hop
        • No jitter within flow, but less balanced
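
Mapping a flow identifier to a single next hop is often done with a hash. The sketch below is one possible way, not a description of any particular router: the choice of CRC32 and the string form of the flow identifier are assumptions, and `pick_next_hop` is a hypothetical name.

```python
import zlib

def pick_next_hop(src, dst, next_hops):
    """Deterministically map the flow (src, dst) to one of next_hops."""
    choices = sorted(next_hops)  # fixed order so the mapping is stable
    flow_id = f"{src}->{dst}".encode()
    return choices[zlib.crc32(flow_id) % len(choices)]

# E forwarding toward H may use C or D; every packet of one flow takes the
# same hop, while different flows may be spread over both.
hop_fh = pick_next_hop("F", "H", {"C", "D"})
hop_eh = pick_next_hop("E", "H", {"C", "D"})
```

Because the hash depends only on the flow identifier, packets within a flow stay on one path and see no extra jitter, at the cost of balancing load per flow instead of per packet.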

  55. Forwarding with ECMP (2)
      Multipath routes from F/E to C/H
      E’s forwarding choices:
        Flow     Possible next hops   Example choice
        F → H    C, D                 D
        F → C    C, D                 D
        E → H    C, D                 C
        E → C    C, D                 C
      Use both paths to get to one destination
      [Figure: ECMP example topology, nodes A to H with link costs]

  56. Border Gateway Protocol (BGP)

  57. Structure of the Internet
      • Networks (ISPs, CDNs, etc.) group with IP prefixes
      • Networks are richly interconnected, often using IXPs
      [Figure: ISP A (prefixes A1, A2), ISP B (prefix B1), CDN C (prefix C1), CDN D (prefix D1), Net E (prefixes E1, E2), and Net F (prefix F1), interconnected through IXPs]

  58. Internet-wide Routing Issues
      • Two problems beyond routing within a network
      1. Scaling to very large networks
         • Techniques of IP prefixes, hierarchy, prefix aggregation
      2. Incorporating policy decisions
         • Letting different parties choose their routes to suit their own needs
      Yikes!

  59. Effects of Independent Parties
      • Each party selects routes to suit its own interests
        • e.g., shortest path in ISP
      • What path will be chosen for A2 → B1 and B1 → A2?
        • What is the best path?
      [Figure: ISP A (prefixes A1, A2) connected to ISP B (prefixes B1, B2)]

  60. Effects of Independent Parties (2)
      • Selected paths are longer than overall shortest path
        • And asymmetric too!
      • Consequence of independent goals and decisions, not hierarchy
      [Figure: ISP A (prefixes A1, A2) connected to ISP B (prefixes B1, B2)]
