
Chapter 4: Network Layer

Chapter goals: understand the principles behind network layer services: routing (path selection), dealing with scale, how a router works, and instantiation and implementation in the Internet. Network Layer 4-1


  1. Subnets. An IP address has a subnet part (high-order bits) and a host part (low-order bits). What's a subnet? The device interfaces that have the same subnet part of their IP address and can physically reach each other without an intervening router. [Figure: a network consisting of 3 subnets, with interfaces 223.1.1.1 to 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, and 223.1.3.1, 223.1.3.2, 223.1.3.27.]

  2. Subnets: recipe. To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet. [Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24 and 223.1.3.0/24; subnet mask: /24.]

  3. Subnets: how many? [Figure: a network with interfaces 223.1.1.1 to 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1 and 223.1.9.2.]

  4. IP addressing: CIDR. CIDR (Classless InterDomain Routing): the subnet portion of the address has arbitrary length. Address format: a.b.c.d/x, where x is the number of bits in the subnet portion of the address. Example: 200.23.16.0/23 is 11001000 00010111 00010000 00000000 in binary; the first 23 bits are the subnet part and the remaining 9 bits are the host part.
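The split between subnet part and host part in the CIDR example can be checked with a minimal sketch using Python's standard `ipaddress` module. The variable names are illustrative and not part of the slides.

```python
import ipaddress

# The example prefix from the slide: a.b.c.d/x with x = 23.
net = ipaddress.ip_network("200.23.16.0/23")

# Write the 32-bit address in binary and split it at the prefix length.
bits = format(int(net.network_address), "032b")
subnet_part = bits[:net.prefixlen]   # high-order x bits
host_part = bits[net.prefixlen:]     # remaining 32 - x bits

print(subnet_part, host_part, net.num_addresses)
```

With a /23 prefix the host part has 9 bits, so the block spans 512 addresses (200.23.16.0 through 200.23.17.255).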

  5. IP addresses: how to get one? Q: How does a host get an IP address? Either hard-coded by the system admin in a file (Wintel: control-panel->network->configuration->tcp/ip->properties; UNIX: /etc/rc.config), or via DHCP (Dynamic Host Configuration Protocol): the host dynamically gets an address from a server, "plug-and-play" (more in next chapter).

  6. IP addresses: how to get one? Q: How does a network get the subnet part of its IP address? A: It gets allocated a portion of its provider ISP's address space.
     ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
     Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
     Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
     Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
     ...
     Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
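As a quick check of the allocation above, the ISP's /20 block can be carved into eight /23 blocks with the standard `ipaddress` module. This is only a sketch; the variable names are illustrative.

```python
import ipaddress

# The ISP's block from the slide.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Extending the prefix by 3 bits (20 -> 23) yields 2**3 = 8 organization blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

for i, block in enumerate(org_blocks):
    print(f"Organization {i}: {block}")
```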

  7. Hierarchical addressing: route aggregation. Hierarchical addressing allows efficient advertisement of routing information. [Figure: Organizations 0 to 7 (200.23.16.0/23, 200.23.18.0/23, 200.23.20.0/23, ..., 200.23.30.0/23) connect to Fly-By-Night-ISP, which tells the Internet "Send me anything with addresses beginning 200.23.16.0/20". ISPs-R-Us tells the Internet "Send me anything with addresses beginning 199.31.0.0/16".]

  8. Hierarchical addressing: more specific routes. ISPs-R-Us has a more specific route to Organization 1. [Figure: Organization 1 (200.23.18.0/23) is now attached to ISPs-R-Us. Fly-By-Night-ISP still advertises "Send me anything with addresses beginning 200.23.16.0/20". ISPs-R-Us advertises "Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23".]
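The two advertisements overlap for Organization 1's addresses, and routers resolve the overlap by preferring the longest matching prefix. A minimal sketch of that rule follows; the `ROUTES` table and the `next_isp` helper are hypothetical names built from the prefixes on the slide.

```python
import ipaddress

# Advertised prefixes from the slide, mapped to the advertising ISP.
ROUTES = {
    ipaddress.ip_network("200.23.16.0/20"): "Fly-By-Night-ISP",
    ipaddress.ip_network("199.31.0.0/16"): "ISPs-R-Us",
    ipaddress.ip_network("200.23.18.0/23"): "ISPs-R-Us",
}

def next_isp(dest):
    """Return the ISP whose advertised prefix is the longest match for dest."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return None
    longest = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[longest]

print(next_isp("200.23.18.5"))   # matches both the /20 and the /23
print(next_isp("200.23.20.5"))   # matches only the /20
```

An address in Organization 1's block matches both 200.23.16.0/20 and 200.23.18.0/23, and the /23 wins because it is more specific.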

  9. IP addressing: the last word... Q: How does an ISP get a block of addresses? A: From ICANN (Internet Corporation for Assigned Names and Numbers), which allocates addresses, manages DNS, assigns domain names and resolves disputes.

  10. NAT: Network Address Translation. All datagrams leaving the local network have the same single source NAT IP address, 138.76.29.7, with different source port numbers. Datagrams with source or destination in this network have a 10.0.0/24 address for source, destination (as usual). [Figure: a local network (e.g., home network) 10.0.0/24 with addresses 10.0.0.1 to 10.0.0.4, connected to the rest of the Internet through a NAT router with address 138.76.29.7.]

  11. NAT: motivation. The local network uses just one IP address as far as the outside world is concerned: no need to be allocated a range of addresses from the ISP (just one IP address is used for all devices); can change addresses of devices in the local network without notifying the outside world; can change ISP without changing addresses of devices in the local network; devices inside the local net are not explicitly addressable or visible by the outside world (a security plus).

  12. NAT: implementation. The NAT router must: (1) for outgoing datagrams, replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #); remote clients/servers will respond using (NAT IP address, new port #) as destination address; (2) remember (in the NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair; (3) for incoming datagrams, replace (NAT IP address, new port #) in the destination fields of every incoming datagram with the corresponding (source IP address, port #) stored in the NAT table.

  13. NAT: example.
     NAT translation table: WAN side addr 138.76.29.7, 5001 <-> LAN side addr 10.0.0.1, 3345.
     1: host 10.0.0.1 sends a datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345; D: 128.119.40.186, 80).
     2: NAT router changes the datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001 and updates the table (S: 138.76.29.7, 5001; D: 128.119.40.186, 80).
     3: reply arrives with dest. address 138.76.29.7, 5001 (S: 128.119.40.186, 80; D: 138.76.29.7, 5001).
     4: NAT router changes the datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 (S: 128.119.40.186, 80; D: 10.0.0.1, 3345).
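The four steps of the example can be sketched as a small translation table. This is an illustrative model only: the `NatRouter` class, its method names and the choice of 5001 as the first allocated port are assumptions made to mirror the slide, not a real NAT implementation.

```python
class NatRouter:
    """Toy NAT: rewrites (IP, port) pairs using a translation table."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port   # assumed allocation policy: sequential ports
        self.out_map = {}             # (lan_ip, lan_port) -> wan_port
        self.in_map = {}              # wan_port -> (lan_ip, lan_port)

    def outgoing(self, src, dst):
        """Replace the LAN source with (NAT IP, new port), remembering the pair."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.wan_ip, self.out_map[src]), dst

    def incoming(self, src, dst):
        """Replace the (NAT IP, port) destination with the stored LAN pair."""
        ip, port = dst
        if ip != self.wan_ip or port not in self.in_map:
            return None               # no table entry: drop the datagram
        return src, self.in_map[port]


nat = NatRouter("138.76.29.7")
sent = nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80))
reply = nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001))
print(sent, reply)
```

Keeping two dictionaries makes both directions a constant-time lookup, and a datagram arriving for a port with no table entry is simply dropped in this sketch.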

  14. NAT: Network Address Translation. 16-bit port-number field: 60,000 simultaneous connections with a single LAN-side address! NAT is controversial: routers should only process up to layer 3; it violates the end-to-end argument (the possibility of NAT must be taken into account by app designers, e.g., for P2P applications); the address shortage should instead be solved by IPv6.

  15. ICMP: Internet Control Message Protocol. Used by hosts and routers to communicate network-level information: error reporting (unreachable host, network, port, protocol) and echo request/reply (used by ping). Network layer "above" IP: ICMP messages are carried in IP datagrams. ICMP message: type, code, plus the first 8 bytes of the IP datagram causing the error.
     Type  Code  Description
     0     0     echo reply (ping)
     3     0     dest. network unreachable
     3     1     dest host unreachable
     3     2     dest protocol unreachable
     3     3     dest port unreachable
     3     6     dest network unknown
     3     7     dest host unknown
     4     0     source quench (congestion control - not used)
     8     0     echo request (ping)
     9     0     route advertisement
     10    0     router discovery
     11    0     TTL expired
     12    0     bad IP header

  16. Traceroute and ICMP. The source sends a series of UDP segments to the destination: the first has TTL=1, the second has TTL=2, etc., each with an unlikely port number. When the nth datagram arrives at the nth router, the router discards the datagram and sends the source an ICMP message (type 11, code 0); the message includes the name of the router and its IP address. When the ICMP message arrives, the source calculates the RTT. Traceroute does this 3 times. Stopping criterion: the UDP segment eventually arrives at the destination host; the destination returns an ICMP "port unreachable" packet (type 3, code 3); when the source gets this ICMP message, it stops.

  17. IPv6: motivation. Initial motivation: the 32-bit address space was soon to be completely allocated. Additional motivation: the header format helps speed processing/forwarding; header changes facilitate QoS. IPv6 datagram format: fixed-length 40-byte header; no fragmentation allowed.

  18. IPv6 datagram format. Priority: identify priority among datagrams in a flow. Flow label: identify datagrams in the same "flow" (concept of "flow" not well defined). Next header: identify the upper-layer protocol for data. Header layout (32 bits wide): ver, pri, flow label; payload len, next hdr, hop limit; source address (128 bits); destination address (128 bits); data.
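To make the fixed 40-byte layout concrete, here is a minimal sketch that packs the header fields with Python's `struct` module. It assumes the current field widths (4-bit version, 8-bit traffic class in place of the slide's "pri", 20-bit flow label); the `ipv6_header` function and its default values are hypothetical.

```python
import ipaddress
import struct

def ipv6_header(src, dst, payload_len, next_header,
                hop_limit=64, traffic_class=0, flow_label=0):
    """Pack a 40-byte IPv6 header (assumed layout: 4 + 8 + 20 bits in the first word)."""
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",                      # network byte order: 4+2+1+1+16+16 bytes
        first_word,
        payload_len,
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,  # 128-bit source address
        ipaddress.IPv6Address(dst).packed,  # 128-bit destination address
    )

# 2001:db8::/32 is the documentation prefix; next_header 6 means TCP.
header = ipv6_header("2001:db8::1", "2001:db8::2", payload_len=20, next_header=6)
print(len(header))
```

The format string adds up to 40 bytes, which is the fixed header length named on the previous slide; there is no checksum or options field to pack.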

  19. Other changes from IPv4. Checksum: removed entirely to reduce processing time at each hop. Options: allowed, but outside of the header, indicated by the "Next Header" field. ICMPv6: new version of ICMP, with additional message types (e.g., "Packet Too Big") and multicast group management functions.

  20. Transition from IPv4 to IPv6. Not all routers can be upgraded simultaneously: no "flag days". How will the network operate with mixed IPv4 and IPv6 routers? Tunneling: an IPv6 datagram is carried as payload in an IPv4 datagram among IPv4 routers. [Figure: an IPv4 datagram (IPv4 header fields, IPv4 source and dest addr) whose payload is an IPv6 datagram (IPv6 header fields, IPv6 source and dest addr, UDP/TCP payload).]

  21. Tunneling. Logical view: IPv6 routers A, B, E, F, with an IPv4 tunnel connecting IPv6 routers B and E. Physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6).

  22. Tunneling. Logical view: IPv6 routers A, B, E, F, with an IPv4 tunnel connecting IPv6 routers B and E. Physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6). A-to-B: IPv6 (flow: X, src: A, dest: F, data). B-to-C and D-to-E: IPv6 inside IPv4 (outer IPv4 header src: B, dest: E; inner IPv6 datagram flow: X, src: A, dest: F, data). E-to-F: IPv6 (flow: X, src: A, dest: F, data).

  23. IPv6 adoption. Google: 8% of clients access services via IPv6. NIST: 1/3 of all US government domains are IPv6 capable. Long time for deployment: it has been 20 years and counting! Think of the application-level changes in the last 20 years: WWW, Facebook, streaming media, Skype, ... Why?

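  24. Interplay between routing and forwarding. The routing algorithm determines the local forwarding table, which maps the header value in an arriving packet to an output link.
     header value  output link
     0100          3
     0101          2
     0111          2
     1001          1
     [Figure: a packet with header value 0111 arrives at a router with output links 1, 2 and 3.]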

  25. Graph abstraction. Graph: G = (N, E). N = set of routers = { u, v, w, x, y, z }. E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }. Remark: the graph abstraction is useful in other network contexts. Example: P2P, where N is the set of peers and E is the set of TCP connections. [Figure: the six-router graph with link costs.]

  26. Graph abstraction: costs. c(x,x') = cost of link (x,x'), e.g., c(w,z) = 5. Cost could always be 1, or inversely related to bandwidth, or inversely related to congestion. Cost of path $(x_1, x_2, x_3, \dots, x_p) = c(x_1,x_2) + c(x_2,x_3) + \dots + c(x_{p-1},x_p)$. Question: what's the least-cost path between u and z? Routing algorithm: algorithm that finds the least-cost path. [Figure: the six-router graph with link costs.]

  27. Routing algorithm classification. Global or decentralized information? Global: all routers have complete topology and link cost info ("link state" algorithms). Decentralized: a router knows its physically connected neighbors and the link costs to those neighbors; iterative process of computation and exchange of info with neighbors ("distance vector" algorithms). Static or dynamic? Static: routes change slowly over time. Dynamic: routes change more quickly, through periodic updates or in response to link cost changes.

  28. A link-state routing algorithm: Dijkstra's algorithm. Net topology and link costs are known to all nodes (accomplished via "link state broadcast"; all nodes have the same info). It computes least-cost paths from one node ("source") to all other nodes, which gives the forwarding table for that node. Iterative: after k iterations, the least-cost paths to k destinations are known. Notation: c(x,y): link cost from node x to y, = ∞ if not direct neighbors. D(v): current value of the cost of the path from the source to dest. v. p(v): predecessor node along the path from the source to v. N': set of nodes whose least-cost path is definitively known.

  29. Dijkstra's algorithm
     Initialization:
       N' = {u}
       for all nodes v
         if v adjacent to u
           then D(v) = c(u,v)
           else D(v) = ∞
     Loop
       find w not in N' such that D(w) is a minimum
       add w to N'
       update D(v) for all v adjacent to w and not in N':
         D(v) = min( D(v), D(w) + c(w,v) )
         /* new cost to v is either old cost to v or known
            shortest path cost to w plus cost from w to v */
     until all nodes in N'

  30. Dijkstra's algorithm: example
     Step  N'      D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
     0     u       2,u        5,u        1,u        ∞          ∞
     1     ux      2,u        4,x                   2,x        ∞
     2     uxy     2,u        3,y                              4,y
     3     uxyv               3,y                              4,y
     4     uxyvw                                               4,y
     5     uxyvwz
     [Figure: the six-router graph with link costs.]
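The example can be reproduced with a short sketch of Dijkstra's algorithm. The link costs in `GRAPH` are an assumption read off the figure residue and the table (for instance c(v,x) = 2 and c(v,w) = 3 are not stated in the text), and this version uses a heap instead of the linear "find w with minimum D(w)" scan in the pseudocode.

```python
import heapq

# Assumed link costs for the six-router example graph.
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}

def dijkstra(graph, source):
    """Return (D, p): least-cost distances and predecessors from source."""
    dist = {node: float("inf") for node in graph}
    pred = {node: None for node in graph}
    dist[source] = 0
    done = set()                      # plays the role of N'
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)    # node not in N' with minimum D(w)
        if w in done:
            continue                  # stale heap entry
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and d + cost < dist[v]:
                dist[v] = d + cost    # D(v) = min(D(v), D(w) + c(w,v))
                pred[v] = w
                heapq.heappush(heap, (dist[v], v))
    return dist, pred

dist, pred = dijkstra(GRAPH, "u")
print(dist)
print(pred)
```

Under these assumed costs the final distances and predecessors agree with the last entries of each column in the table, for example D(z) = 4 with p(z) = y.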

  31. Dijkstra's algorithm: discussion. Algorithm complexity (n nodes): each iteration needs to check all nodes, w, not in N'; n(n+1)/2 comparisons: $O(n^2)$; more efficient implementations possible: $O(n \log n)$. Oscillations possible, e.g., when link cost = amount of carried traffic. [Figure: four snapshots of nodes A, B, C, D in a ring with traffic-dependent link costs (0, 1, e, 1+e, 2+e): initially, and after each of three successive routing recomputations.]

  32. Distance vector algorithm (1). Bellman-Ford equation (dynamic programming). Define $d_x(y)$ := cost of the least-cost path from x to y. Then $d_x(y) = \min_v \{ c(x,v) + d_v(y) \}$, where the min is taken over all neighbors v of x.

  33. Bellman-Ford example (2). Clearly, $d_v(z) = 5$, $d_x(z) = 3$, $d_w(z) = 3$. The B-F equation says: $d_u(z) = \min \{ c(u,v) + d_v(z),\ c(u,x) + d_x(z),\ c(u,w) + d_w(z) \} = \min \{ 2 + 5,\ 1 + 3,\ 5 + 3 \} = 4$. The node that achieves the minimum is the next hop in the shortest path and goes in the forwarding table. [Figure: the six-router graph with link costs.]
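The same arithmetic, written as a tiny sketch that also picks out the next hop. The dictionary names are illustrative; the numbers are the ones given in the example.

```python
# c(u, neighbor) and d_neighbor(z) from the example.
link_cost = {"v": 2, "x": 1, "w": 5}
neighbor_dist_to_z = {"v": 5, "x": 3, "w": 3}

# Cost of reaching z through each neighbor of u.
via = {n: link_cost[n] + neighbor_dist_to_z[n] for n in link_cost}

# The neighbor achieving the minimum is the next hop for destination z.
next_hop = min(via, key=via.get)
d_u_z = via[next_hop]

print(via, next_hop, d_u_z)
```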

  34. Distance vector algorithm (3). $D_x(y)$ = estimate of the least cost from x to y. Distance vector: $D_x = [D_x(y) : y \in N]$. Node x knows the cost to each neighbor v: c(x,v). Node x maintains its own distance vector $D_x$. Node x also maintains its neighbors' distance vectors: for each neighbor v, x maintains $D_v = [D_v(y) : y \in N]$.

  35. Distance vector algorithm (4). Basic idea: each node periodically sends its own distance vector estimate to its neighbors. When a node x receives a new DV estimate from a neighbor, it updates its own DV using the B-F equation: $D_x(y) \leftarrow \min_v \{ c(x,v) + D_v(y) \}$ for each node $y \in N$. Under minor, natural conditions, the estimate $D_x(y)$ converges to the actual least cost $d_x(y)$.

  36. Distance vector algorithm (5). Iterative, asynchronous: each local iteration is caused by a local link cost change or a DV update message from a neighbor. Distributed: each node notifies its neighbors only when its DV changes; neighbors then notify their neighbors if necessary. Each node: wait for (change in local link cost or msg from neighbor); recompute estimates; if the DV to any dest has changed, notify neighbors.

  37. Distance vector algorithm. At all nodes, X:
     1 Initialization:
     2   for all adjacent nodes v:
     3     D^X(*,v) = infinity   /* the * operator means "for all rows" */
     4     D^X(v,v) = c(X,v)
     5   for all destinations, y
     6     send min_w D^X(y,w) to each neighbor   /* w over all X's neighbors */

  38. Distance vector algorithm (cont.):
     8  loop
     9    wait (until I see a link cost change to neighbor V
     10         or until I receive update from neighbor V)
     11
     12   if (c(X,V) changes by d)
     13     /* change cost to all dest's via neighbor V by d */
     14     /* note: d could be positive or negative */
     15     for all destinations y: D^X(y,V) = D^X(y,V) + d
     16
     17   else if (update received from V wrt destination Y)
     18     /* shortest path from V to some Y has changed */
     19     /* V has sent a new value for its min_w D^V(Y,w) */
     20     /* call this received new value "newval" */
     21     for the single destination Y: D^X(Y,V) = c(X,V) + newval
     22
     23   if we have a new min_w D^X(Y,w) for any destination Y
     24     send new value of min_w D^X(Y,w) to all neighbors
     25
     26 forever

  39. Distance vector algorithm: example. [Figure: three nodes X, Y, Z with link costs c(X,Y) = 2, c(Y,Z) = 1, c(X,Z) = 7.]

  40. Distance vector algorithm: example. $D^X(Y,Z) = c(X,Z) + \min_w \{ D^Z(Y,w) \} = 7 + 1 = 8$. $D^X(Z,Y) = c(X,Y) + \min_w \{ D^Y(Z,w) \} = 2 + 1 = 3$. [Figure: three nodes X, Y, Z with link costs c(X,Y) = 2, c(Y,Z) = 1, c(X,Z) = 7.]
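A minimal simulation of the distance vector exchange on this three-node network, assuming link costs c(X,Y) = 2, c(Y,Z) = 1 and c(X,Z) = 7 as read from the example. For simplicity every node updates in synchronous rounds and keeps only its best estimate per destination, whereas the algorithm above is asynchronous and keeps a per-neighbor table; the function names are hypothetical.

```python
INF = float("inf")
NODES = ["X", "Y", "Z"]
LINK = {("X", "Y"): 2, ("Y", "Z"): 1, ("X", "Z"): 7}

def c(a, b):
    """Direct link cost between a and b (0 to itself, INF if not neighbors)."""
    if a == b:
        return 0
    return LINK.get((a, b), LINK.get((b, a), INF))

def run_distance_vector():
    # Each node starts with the direct link costs as its estimates.
    dv = {x: {y: c(x, y) for y in NODES} for x in NODES}
    changed = True
    while changed:
        changed = False
        # Vectors as received from neighbors at the start of this round.
        received = {x: dict(dv[x]) for x in NODES}
        for x in NODES:
            neighbors = [v for v in NODES if v != x and c(x, v) < INF]
            for y in NODES:
                if y == x:
                    continue
                # Bellman-Ford update: min over neighbors v of c(x,v) + D_v(y).
                best = min(c(x, v) + received[v][y] for v in neighbors)
                if best != dv[x][y]:
                    dv[x][y] = best
                    changed = True
    return dv

dv = run_distance_vector()
print(dv["X"])
```

X settles on a cost of 3 to Z (via Y) rather than 7 on the direct link, matching the value of $D^X(Z,Y)$ computed above.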

  41. Distance vector: link cost changes. A node detects a local link cost change, updates its distance table (line 15), and, if the cost changes in the least-cost path, notifies its neighbors (lines 23, 24). "Good news travels fast": the algorithm terminates. [Figure: nodes X, Y, Z with c(Y,Z) = 1 and c(X,Z) = 50; c(X,Y) changes from 4 to 1.]

  42. Distance vector: link cost changes. Good news travels fast; bad news travels slow: the "count to infinity" problem! The algorithm continues on. [Figure: nodes X, Y, Z with c(Y,Z) = 1 and c(X,Z) = 50; c(X,Y) changes from 4 to 60.]
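The slow convergence can be seen in a small sketch that tracks only the estimates for destination X, assuming the costs in the figure (c(Y,Z) = 1, c(X,Z) = 50, and c(X,Y) raised from 4 to 60). Y and Z are updated in alternation, which is a simplification of the asynchronous message exchange; the function name is hypothetical.

```python
def count_to_infinity(c_xy, c_yz, c_xz, d_y, d_z):
    """Alternate Y's and Z's Bellman-Ford updates for destination X until stable.

    d_y and d_z are the estimates D_Y(X) and D_Z(X) held before the link
    cost change. Returns the final estimates and the number of rounds.
    """
    rounds = 0
    while True:
        new_y = min(c_xy, c_yz + d_z)     # Y: direct link, or via Z
        new_z = min(c_xz, c_yz + new_y)   # Z: direct link, or via Y
        if (new_y, new_z) == (d_y, d_z):
            return d_y, d_z, rounds
        d_y, d_z = new_y, new_z
        rounds += 1

# Before the change: D_Y(X) = 4 (direct), D_Z(X) = 5 (via Y).
final_y, final_z, rounds = count_to_infinity(60, 1, 50, d_y=4, d_z=5)
print(final_y, final_z, rounds)
```

Each round raises the estimates by only 2, because Y and Z keep routing through each other, until Z's direct link of cost 50 finally becomes the better choice.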

  43. Distance vector: poisoned reverse. If Z routes through Y to get to X, Z tells Y that its (Z's) distance to X is infinite (so Y won't route to X via Z). Will this completely solve the count-to-infinity problem? [Figure: nodes X, Y, Z with c(Y,Z) = 1 and c(X,Z) = 50; c(X,Y) changes from 4 to 60; the algorithm terminates.]

  44. Comparison of LS and DV algorithms. Message complexity: LS: with n nodes, E links, O(nE) msgs sent; DV: exchange between neighbors only, convergence time varies. Speed of convergence: LS: $O(n^2)$ algorithm requires O(nE) msgs, may have oscillations; DV: convergence time varies, may be routing loops, count-to-infinity problem. Robustness (what happens if a router malfunctions?): LS: a node can advertise an incorrect link cost, and each node computes only its own table; DV: a node can advertise an incorrect path cost, and each node's table is used by others, so errors propagate through the network.

  45. Hierarchical routing. Our routing study thus far is an idealization: all routers identical, network "flat" ... not true in practice. Scale (with 200 million destinations): can't store all destinations in routing tables; routing table exchange would swamp links! Administrative autonomy: internet = network of networks; each network admin may want to control routing in its own network.

  46. Internet approach to scalable routing. Aggregate routers into regions known as "autonomous systems" (AS), a.k.a. "domains". Intra-AS routing: routing among hosts and routers in the same AS ("network"); all routers in an AS must run the same intra-domain protocol; routers in different ASes can run different intra-domain routing protocols; gateway router: at the "edge" of its own AS, has link(s) to router(s) in other ASes. Inter-AS routing: routing among ASes; gateways perform inter-domain routing (as well as intra-domain routing).

  47. Interconnected ASes. The forwarding table is configured by both the intra- and inter-AS routing algorithms: intra-AS sets entries for internal dests; inter-AS and intra-AS set entries for external dests. [Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c) and AS3 (3a, 3b, 3c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table.]

  48. Inter-AS tasks. Suppose a router in AS1 receives a datagram whose destination is outside AS1. The router should forward the packet towards one of the gateway routers, but which one? AS1 needs: (1) to learn which dests are reachable through AS2 and which through AS3; (2) to propagate this reachability info to all routers in AS1. Job of inter-AS routing! [Figure: AS1 (1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c) and AS3 (3a, 3b, 3c).]

  49. Intra-AS routing. Also known as Interior Gateway Protocols (IGP). Most common intra-AS routing protocols: RIP (Routing Information Protocol); OSPF (Open Shortest Path First); IGRP (Interior Gateway Routing Protocol, Cisco proprietary).

  50. RIP (Routing Information Protocol). Distance vector algorithm. Included in the BSD-UNIX distribution in 1982. Distance metric: number of hops (max = 15 hops). [Figure: routers A, B, C, D and subnets u, v, w, x, y, z.]
     destination  hops
     u            1
     v            2
     w            2
     x            3
     y            3
     z            2

  51. RIP advertisements. Distance vectors: exchanged among neighbors every 30 sec via a Response Message (also called an advertisement). Each advertisement: list of up to 25 destination nets within the AS.

  52. RIP: example. [Figure: routers A, B, C, D and networks w, x, y, z.] Routing table in D:
     Destination network  Next router  Num. of hops to dest.
     w                    A            2
     y                    B            2
     z                    B            7
     x                    --           1
     ...                  ...          ...

  53. RIP: example. Advertisement from A to D:
     Dest  Next  hops
     w     -     -
     x     -     -
     z     C     4
     ...   ...   ...
     Routing table in D after the advertisement (the entry for z changes from next router B, 7 hops, to next router A, 5 hops):
     Destination network  Next router  Num. of hops to dest.
     w                    A            2
     y                    B            2
     z                    A            5
     x                    --           1
     ...                  ...          ...
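The table change in this example can be sketched as a single update rule: an advertised hop count plus one hop to reach the advertising neighbor replaces the current entry when it is smaller. The function name and data layout are hypothetical; accepting any update from the current next router is an added assumption about typical RIP behavior, not something stated on the slide.

```python
RIP_INFINITY = 16  # hop count treated as unreachable

def apply_advertisement(table, neighbor, advertised):
    """Update a routing table {dest: (next_router, hops)} from a neighbor's advert."""
    for dest, hops in advertised.items():
        new_hops = min(hops + 1, RIP_INFINITY)   # one extra hop to reach the neighbor
        current = table.get(dest)
        if (current is None
                or new_hops < current[1]
                or current[0] == neighbor):      # assumed: trust the current next router
            table[dest] = (neighbor, new_hops)
    return table

# Routing table in D before the advertisement (None = directly attached).
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}

# A advertises that it reaches z in 4 hops.
apply_advertisement(table_d, "A", {"z": 4})
print(table_d["z"])
```

Only the advertised entry for z is used here, since the slide leaves the hop counts for w and x blank.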

  54. RIP: link failure and recovery. If no advertisement is heard after 180 sec, the neighbor/link is declared dead: routes via the neighbor are invalidated; new advertisements are sent to neighbors; neighbors in turn send out new advertisements (if their tables changed); link failure info quickly propagates to the entire net; poison reverse is used to prevent ping-pong loops (infinite distance = 16 hops).

  55. RIP table processing. RIP routing tables are managed by an application-level process called route-d (daemon). Advertisements are sent in UDP packets, periodically repeated. [Figure: two hosts, each running routed above the transport (UDP), network (IP, with forwarding table), link and physical layers.]

  56. OSPF (Open Shortest Path First). "Open": publicly available. Uses a link-state algorithm: LS packet dissemination; topology map at each node; route computation using Dijkstra's algorithm. An OSPF advertisement carries one entry per neighbor router. Advertisements are disseminated to the entire AS (via flooding) and carried in OSPF messages directly over IP (rather than TCP or UDP).

  57. OSPF "advanced" features (not in RIP). Security: all OSPF messages are authenticated (to prevent malicious intrusion). Multiple same-cost paths allowed (only one path in RIP). For each link, multiple cost metrics for different TOS (e.g., satellite link cost set "low" for best effort, "high" for real time). Integrated uni- and multicast support: Multicast OSPF (MOSPF) uses the same topology database as OSPF. Hierarchical OSPF in large domains.

  58. Hierarchical OSPF. [Figure only.]

  59. Hierarchical OSPF. Two-level hierarchy: local area, backbone. Link-state advertisements only within an area; each node has detailed area topology and only knows the direction (shortest path) to nets in other areas. Area border routers: "summarize" distances to nets in their own area and advertise them to other area border routers. Backbone routers: run OSPF routing limited to the backbone. Boundary routers: connect to other ASes.

  60. Internet inter-AS routing: BGP. BGP (Border Gateway Protocol): the de facto inter-domain routing protocol, the "glue that holds the Internet together". BGP provides each AS a means to: obtain subnet reachability information from neighboring ASes (eBGP); propagate reachability information to all AS-internal routers (iBGP); determine "good" routes to other networks based on reachability information and policy. BGP allows a subnet to advertise its existence to the rest of the Internet: "I am here".

  61. eBGP, iBGP connections. Gateway routers run both eBGP and iBGP protocols. [Figure: AS 1 (routers 1a, 1b, 1c, 1d), AS 2 (2a, 2b, 2c, 2d) and AS 3 (3a, 3b, 3c, 3d), with eBGP connectivity between ASes and iBGP connectivity within each AS.]

  62. BGP basics. BGP session: two BGP routers ("peers") exchange BGP messages over a semi-permanent TCP connection, advertising paths to different destination network prefixes (BGP is a "path vector" protocol). When AS3 gateway router 3a advertises the path AS3,X to AS2 gateway router 2c, AS3 promises AS2 that it will forward datagrams towards X. [Figure: AS 1, AS 2 and AS 3, with destination X attached to AS 3 and the BGP advertisement "AS3, X" sent from 3a to 2c.]

  63. Distributing reachability info. With the eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. 1c can then use iBGP to distribute this new prefix reachability info to all routers in AS1. 1b can then re-advertise the new reachability info to AS2 over the 1b-to-2a eBGP session. When a router learns about a new prefix, it creates an entry for the prefix in its forwarding table. [Figure: AS1, AS2 and AS3 with eBGP and iBGP sessions.]

  64. Path attributes & BGP routes. When advertising a prefix, the advert includes BGP attributes: prefix + attributes = "route". Two important attributes: AS-PATH: contains the ASes through which the advert for the prefix passed, e.g., AS 67, AS 17. NEXT-HOP: indicates the specific internal-AS router to the next-hop AS (there may be multiple links from the current AS to the next-hop AS). When a gateway router receives a route advert, it uses its import policy to accept/decline.

  65. BGP route selection. A router may learn about more than one route to some prefix and must select one. Elimination rules: 1. Local preference value attribute: policy decision. 2. Shortest AS-PATH. 3. Closest NEXT-HOP router: hot potato routing. 4. Additional criteria.
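The first three elimination rules can be sketched as a sort key. The `Route` class, its field names and the sample values are hypothetical; real BGP implementations apply more tie-breakers (rule 4), and "closest NEXT-HOP" is modeled here as the lowest intra-AS cost to that router.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple    # ASes the advert passed through
    next_hop: str     # router leading to the next-hop AS
    local_pref: int   # policy value: higher is preferred
    igp_cost: int     # assumed intra-AS cost to reach next_hop

def select_route(routes):
    """Apply the elimination rules in order: local pref, AS-PATH length, closest NEXT-HOP."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost))

candidates = [
    Route("200.23.16.0/20", ("AS2", "AS3"), "1b", local_pref=100, igp_cost=2),
    Route("200.23.16.0/20", ("AS3",), "1c", local_pref=100, igp_cost=5),
    Route("200.23.16.0/20", ("AS3",), "1d", local_pref=100, igp_cost=1),
]
best = select_route(candidates)
print(best.next_hop)
```

With equal local preference, the two routes with the shorter AS-PATH survive, and hot potato routing then picks the one whose NEXT-HOP router is cheapest to reach.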

  66. BGP messages. BGP messages are exchanged using TCP. OPEN: opens a TCP connection to the peer and authenticates the sender. UPDATE: advertises a new path (or withdraws an old one). KEEPALIVE: keeps the connection alive in the absence of UPDATEs; also ACKs an OPEN request. NOTIFICATION: reports errors in the previous msg; also used to close the connection.

  67. BGP routing policy. A, B, C are provider networks. X, W, Y are customers (of provider networks). X is dual-homed: attached to two networks. X does not want to route from B via X to C, so X will not advertise to B a route to C. [Figure 4.5-BGPnew: a simple BGP scenario with provider networks A, B, C and customer networks W, X, Y.]

  68. BGP routing policy (2). A advertises to B the path AW. B advertises to X the path BAW. Should B advertise to C the path BAW? No way! B gets no "revenue" for routing CBAW, since neither W nor C is B's customer. B wants to force C to route to W via A. B wants to route only to/from its customers! [Figure 4.5-BGPnew: a simple BGP scenario with provider networks A, B, C and customer networks W, X, Y.]

  69. Why different intra- and inter-AS routing? Policy: inter-AS: the admin wants control over how its traffic is routed and who routes through its net; intra-AS: single admin, so no policy decisions needed. Scale: hierarchical routing saves table size and reduces update traffic. Performance: intra-AS: can focus on performance; inter-AS: policy may dominate over performance.

  70. Router architecture overview. Two key router functions: run routing algorithms/protocols (RIP, OSPF, BGP); forward datagrams from incoming to outgoing link.

  71. Input port functions. Physical layer: bit-level reception. Data link layer: e.g., Ethernet (see chapter 5). Decentralized switching: given the datagram dest., look up the output port using the forwarding table in input port memory; goal: complete input port processing at "line speed"; queuing: if datagrams arrive faster than the forwarding rate into the switch fabric.

  72. Three types of switching fabrics. [Figure only.]

  73. Switching via memory. First-generation routers: traditional computers with switching under direct control of the CPU; packet copied to the system's memory; speed limited by memory bandwidth (2 bus crossings per datagram). [Figure: input port, memory and output port connected by the system bus.]

  74. Switching via a bus. The datagram goes from input port memory to output port memory via a shared bus. Bus contention: switching speed is limited by bus bandwidth. 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone).

  75. Switching via an interconnection network. Overcomes bus bandwidth limitations. Banyan networks and other interconnection nets were initially developed to connect processors in a multiprocessor. Advanced design: fragmenting the datagram into fixed-length cells and switching the cells through the fabric. Cisco 12000: switches Gbps through the interconnection network.

  76. Input port queuing. Fabric slower than input ports combined -> queueing may occur at input queues; queueing delay and loss due to input buffer overflow! Head-of-the-Line (HOL) blocking: a queued datagram at the front of the queue prevents others in the queue from moving forward. [Figure: output port contention: only one red datagram can be transferred, so the lower red packet is blocked; one packet time later, the green packet experiences HOL blocking.]

  77. Output ports. Buffering is required when datagrams arrive from the fabric faster than the transmission rate. Packets can be lost due to congestion or lack of buffers. A scheduling discipline chooses among queued datagrams for transmission. [Figure: switch fabric, datagram buffer (queueing), link layer protocol (send), line termination.]

  78. Output port queueing. Buffering when the arrival rate via the switch exceeds the output line speed. Queueing (delay) and loss due to output port buffer overflow! [Figure: at time t, packets move from input to output through the switch fabric; one packet time later, more packets are queued at the output port.]
