
Ethernet OAM: Victor Olifer (JANET/GEANT JRA1 Task 1), JRA1/TERENA workshop



  1. Ethernet OAM
     Victor Olifer (JANET/GEANT JRA1 Task 1)
     JRA1/TERENA workshop, Copenhagen, 20 November 2012
     connect • communicate • collaborate

  2. Agenda
     - Ethernet Service Assurance & Monitoring overview
       - Monitoring standards
       - Service assurance standards
     - Service assurance lab trials
     - CFM/Y.1731 trial
       - Multi-domain testbed
       - OAM agent boxes
       - CyPortal
     - JRA1 & JRA2 trial (Year 4 extension)
       - Multi-segment connections
       - Diverse equipment
       - perfSONAR extensions

  3. Wide-area point-to-point Ethernet connections
     Diagram labels: Ethernet over MPLS; Ethernet over Transport; Ethernet
     Multi-segment multi-domain connection with:
     - Ethernet UNI (a must);
     - segments of pure Ethernet (optional);
     - segments where Ethernet is tunneled over some other technology, e.g. TDM (SDH, OTN) or MPLS (optional)
     Where can we find such connections?
     - GEANT Plus, JANET Lightpath: demand comes from big projects and large scientific centres
     - Inter-router connections
     - Offers from commercial providers: they had 20% revenue growth in 2010 over 2009. Mobile backhaul and multi-site corporates are the major users; the reasons are price and flexibility
     - New demand for academic providers might arise from areas such as cloud services, data centres, HD videoconferences and multi-site university connections

  4. Problems with managing Ethernet connections
     - Until recently Ethernet had no OAM tools (hence the cheap equipment), so there was no way to check, monitor and troubleshoot connectivity and performance end-to-end (the customer view) or within a domain (the provider view). Compared with IP: no ping, traceroute or ICMP diagnostic messages are available.
     - Partial solution: we can use MPLS or SDH/OTN OAM to manage tunnels
     - Good news: Ethernet OAM functions have been developed and implemented in equipment since 2007-8
     - Bad news: we (JANET) don't have much experience in using Ethernet OAM. The situation is the same in other NRENs (as far as I know from GEANT3 participants).

  5. Three areas of emerging Ethernet OAM standards
     - Service assurance: checks whether a connection performs to its specs, e.g. up to CIR and EIR, after service configuration and activation.
     - Service monitoring: periodic checks of connection connectivity (continuity) and performance (delay, loss, throughput, availability)
     - Service troubleshooting: when monitoring shows a fault, one needs to locate the faulty point along the path and the possible reason(s) for the service failure

  6. Service Assurance (1)
     1. Service definitions (topology, e.g. point-to-point; bandwidth profile: CIR, EIR for several CoS):
        • MEF 10.2
        • ITU-T G.8011
        Very important, as it is often a cause of confusion: e.g. CIR might be measured for the UDP payload or for Ethernet frames, which gives very different figures for the same data flow
     2. Service performance parameters (delay, loss, throughput, availability):
        • MEF 10.2.1
        • Y.1563
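
The payload-versus-frame confusion can be illustrated with a little arithmetic. This is a minimal sketch, not taken from the slides: it assumes IPv4 without options, untagged frames by default, and that frame-based rates count bytes from the MAC header through the FCS. The function name and constants are hypothetical.

```python
# Per-packet overheads in bytes (assumptions: IPv4 without options).
IP_HEADER = 20
UDP_HEADER = 8
ETH_HEADER_FCS = 18   # 14-byte MAC header + 4-byte FCS
VLAN_TAG = 4          # per 802.1Q tag
PREAMBLE_IFG = 20     # 8-byte preamble/SFD + 12-byte inter-frame gap


def rates_for_udp_payload(payload_rate_mbps, payload_bytes, vlan_tags=0):
    """Convert a UDP payload rate into (frame_rate_mbps, line_rate_mbps).

    frame_rate counts whole Ethernet frames (MAC header to FCS);
    line_rate additionally counts preamble and inter-frame gap.
    """
    frame_bytes = (payload_bytes + UDP_HEADER + IP_HEADER
                   + ETH_HEADER_FCS + VLAN_TAG * vlan_tags)
    wire_bytes = frame_bytes + PREAMBLE_IFG
    frame_rate = payload_rate_mbps * frame_bytes / payload_bytes
    line_rate = payload_rate_mbps * wire_bytes / payload_bytes
    return frame_rate, line_rate


if __name__ == "__main__":
    for size in (64, 1472):
        frame_rate, line_rate = rates_for_udp_payload(100.0, size)
        print(size, round(frame_rate, 3), round(line_rate, 3))
```

Under these assumptions, a 100 Mbit/s flow of 64-byte UDP payloads corresponds to about 172 Mbit/s of Ethernet frames, while the same payload rate with 1472-byte payloads is only about 103 Mbit/s of frames, so a CIR figure means little unless the measurement layer is stated.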

  7. Service Assurance (2)
     3. Service Verification
        Relatively new (Summer 2011) ITU-T spec Y.1564 "Ethernet service activation test methodology"
        • Defines a simple, disruptive, on-demand procedure that tests connectivity and throughput up to CIR, EIR and the policing limit by injecting traffic into a connection
        • More suitable for Ethernet than the complex and IP-centric RFC 2544; implemented in many traffic generators and boxes
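
The idea of stepping the injected load up to CIR, then to CIR + EIR, then beyond it to exercise the policer can be sketched as below. This is only an illustration: the number of CIR steps and the factor used for the policing phase are assumptions chosen for the example, not values quoted in the slides, and the function name is hypothetical.

```python
def activation_test_rates(cir_mbps, eir_mbps, cir_steps=4, policing_factor=1.25):
    """Return a list of (phase, offered_rate_mbps) for an activation test.

    cir_steps and policing_factor are illustrative assumptions:
    the load ramps to CIR in equal steps, then CIR + EIR is offered,
    then a rate above CIR + EIR is offered to check policing.
    """
    schedule = []
    for i in range(1, cir_steps + 1):
        schedule.append(("CIR step %d/%d" % (i, cir_steps), cir_mbps * i / cir_steps))
    schedule.append(("EIR", cir_mbps + eir_mbps))
    schedule.append(("policing", cir_mbps + policing_factor * eir_mbps))
    return schedule


if __name__ == "__main__":
    for phase, rate in activation_test_rates(100, 20):
        print("%-12s %6.1f Mbit/s" % (phase, rate))
```

Each phase would then be judged against the service's loss, delay and throughput targets; that pass/fail logic is left out of the sketch.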

  8. Service Assurance trials
     JANET lab trial of the SunRise RxT tester
     - Positive impression: works according to the standard and looks worth trying in wide-area tests
     - Diagram labels: Tester; Box; CIR; PIR; PIR = CIR + EIR
     - Just one problem: Y.1564 gives no opportunity to detect the situation when the real PIR value is set lower than expected (not a bug in the box, just the intention of the standard)

  9. Service Monitoring
     - IEEE 802.1ag Connectivity Fault Management (CFM) (ratified in 2007):
       - Hierarchical sessions of heartbeat messages (Continuity Check Messages, CCM) -> up/down status check
       - VLAN-aware
       - MEP (End) and MIP (Intermediate) maintenance points
     - ITU-T Y.1731 (ratified in 2008): same as CFM + performance monitoring (delay, loss, throughput)
     Diagram labels: Customer maintenance session, level 7; Service provider maintenance session, level 5; Operator maintenance sessions, level 3
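
The CCM-based up/down check can be sketched as a simple timeout rule. In IEEE 802.1ag a MEP declares loss of continuity when no CCM has arrived from its peer for 3.5 times the CCM interval; the 1-second interval used as a default below is an assumption for the example, and the function name is hypothetical.

```python
CCM_LOSS_MULTIPLIER = 3.5  # loss of continuity after 3.5 missed intervals


def continuity_state(last_ccm_rx_s, now_s, ccm_interval_s=1.0):
    """Return 'up' if a CCM was received recently enough, else 'down'.

    last_ccm_rx_s: time (seconds) the last CCM from the remote MEP arrived
    now_s:         current time (seconds) on the same clock
    ccm_interval_s: configured CCM transmission interval (assumed 1 s here)
    """
    silence = now_s - last_ccm_rx_s
    if silence < CCM_LOSS_MULTIPLIER * ccm_interval_s:
        return "up"
    return "down"


if __name__ == "__main__":
    print(continuity_state(10.0, 12.0))  # 2 s of silence with a 1 s interval
    print(continuity_state(10.0, 14.0))  # 4 s of silence with a 1 s interval
```

A shorter CCM interval detects failures faster at the cost of more OAM traffic per session.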

  10. Service Troubleshooting
     - CFM:
       - Linktrace (the analogue of IP traceroute)
       - Loopback (the analogue of IP ping)
       - RDI (Remote Defect Indication)
     - Y.1731: same as CFM + a richer set of diagnostic messages + performance monitoring (loss, delay, throughput):
       - Alarm Indication Signal (AIS)
       - Lock Signal
       - …

  11. Service monitoring trials
     - JRA1 Task 1 Ethernet OAM trial (2011):
       - 5 NRENs, 5 connections monitored for 6 months
       - Small Y.1731 agent boxes from Overture
       - CyPortal from Cyan Optics for storing and visualising monitoring data
       - Positive results, but only for single-segment connections
     - Combined JRA1 Task 1 & JRA2 Task 3 Service Assurance & Monitoring trial, GN3 Year 4 (2012-2013): ongoing

  12. JRA1 Ethernet OAM trial (2011) objectives
     - Test CFM/Y.1731 functions in a multi-domain and multi-vendor environment (5 connections)
     - Evaluate Y.1731 agent boxes
     - Evaluate the OAM data visualisation system (CyPortal)
     Diagram labels: Essex Uni; JANET LH; NORDUnet; SURFnet; CESNET; PIONIER (PSNC); Cyan OAM portal; Collector; OAM Cloud service; Data from Collector; Equipment under test; OAM agent (Overture ISG24); Monitored VLAN connections

  13. OAM agent options
     - Dedicated extra network switch with advanced OAM capabilities
       - Pros: uniform, rich OAM functionality and a consistent source of monitoring data
       - Cons: overheads of extra boxes (added complexity, cost, especially for high-speed links, maintenance etc.)
     - OAM capabilities of existing network boxes: routers, switches, muxes
       - Pros: no extra equipment, ability to test internal segments
       - Cons: some vendor-specific features, e.g. in CFM MIBs; a diverse environment with possible incompatibilities
     - Software OAM agent on a dedicated server (e.g. 'dot1ag-utils', developed by SARA and presented by Ronald van der Pol at NORDUnet 2011)
       - Pros: end users can ping and trace network elements; no switches needed
       - Cons: currently limited to Down MEP functionality; performance depends on the server's performance; time precision might be an issue

  14. ISG24 OAM agent box trial
     - Compact 4-port GE demarcation box, low cost (~ $1000)
     - 2 copper GE and 2 SFP ports (there is a 10GE version)
     - Web GUI
     - OAM functions:
       - CFM
       - Y.1731 D(elay)MM and L(oss)M
       - RFC 2544
       - PAA: a proprietary analogue of Y.1731
       - Ethernet in the First Mile (IEEE 802.3ah)

  15. ISG24 CCM (continuity) tests
     - Positive results: the Up/Down state of all 5 connections was properly detected by permanent monitoring over 6 months
     - Screenshots: compact web form; detailed web form

  16. ISG24 DMM (performance) tests
     - Mostly positive results: CFM and PAA Delay Measurement sessions showed One Way and Two Way delay and jitter results that were stable and close to those expected (from other sources)
     - Screenshots: JANET-NORDUnet PAA results; PSNC-CESNET CFM DMM results
     - We experienced some problems with CFM One Way delay measurements on two connections; these are discussed after the CyPortal slides

  17. CyPortal: monitoring data storage and visualisation
     - Detailed monitoring data are collected from the ISG24 agent boxes and stored in a cloud-based database
     - The web GUI provides a map of all services; services whose current parameters violate the SLD are shown in red

  18. CyPortal: per-service data
     - Historical graphical presentation of all parameters under monitoring
     - Zooming into a selected time period
     - Setting of SLA limits
     - Flexible reports

  19. Problems encountered
     Problem 1: saw-tooth shape of delay between JANET LH and Essex Uni
     - Diagram labels: Level 5 DMM session; Level 3 DMM session
     - There was no reason for the saw-tooth shape of the Two Way Delay, with peaks of about 1 sec, shown by MEP Level 5 (ISG24 box)
     - Capturing and analysing traffic before and after MEP Level 3 (Ciena 311v box) identified the 'guilty' box: MEP Level 3 time-stamped the packets of MEP Level 5 instead of forwarding them transparently, which is definitely a bug in the box software
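
The behaviour expected from a MEP when it sees a CFM frame of another maintenance level can be sketched as follows: frames of a higher level pass through untouched, frames of the MEP's own level are processed, and frames of a lower level are dropped. This is a simplified illustration of the 802.1ag level rule, not code from any of the boxes in the trial; the function name is hypothetical.

```python
def mep_action(frame_level, mep_level):
    """Decide what a MEP should do with a received CFM/Y.1731 frame.

    'forward': higher-level frame, must pass transparently (no time-stamping)
    'process': frame belongs to this MEP's own maintenance level
    'drop':    lower-level frame must not leak through this MEP
    """
    if frame_level > mep_level:
        return "forward"
    if frame_level == mep_level:
        return "process"
    return "drop"


if __name__ == "__main__":
    # A Level 5 DMM arriving at a Level 3 MEP should be forwarded untouched.
    print(mep_action(frame_level=5, mep_level=3))
```

In the fault described above, the Level 3 MEP effectively took the "process" branch for Level 5 delay measurement frames, where "forward" was the correct one.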

  20. Problems encountered (cont.)
     Problem 2: inability of the ISG boxes to measure CFM One Way Delay on some connections (LH-Copenhagen, LH-Essex)
     - PAA: OWD = 10.903, TWD = 23.004
     - CFM DMM: OWD = ----, TWD = 23.004
     - The ISG vendor's version: synchronization is too poor to calculate CFM OWD
     - This does not seem to be true: why is the same synchronization good enough for the proprietary PAA?
     - Needs further investigation!
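
Why one-way delay depends on clock synchronization while two-way delay does not can be seen from the Y.1731 delay measurement arithmetic. The sketch below uses the four DMM/DMR timestamps; the numbers in the demo are hypothetical and are not measurements from the trial.

```python
def two_way_delay(tx_f, rx_f, tx_b, rx_b):
    """Two-way delay from DMM/DMR timestamps.

    tx_f: DMM sent (local clock)      rx_f: DMM received (remote clock)
    tx_b: DMR sent (remote clock)     rx_b: DMR received (local clock)
    The remote processing time (tx_b - rx_f) is subtracted, and each
    difference uses a single clock, so any clock offset cancels out.
    """
    return (rx_b - tx_f) - (tx_b - rx_f)


def one_way_delay(tx_f, rx_f):
    """One-way delay: mixes two clocks, so it includes their offset."""
    return rx_f - tx_f


if __name__ == "__main__":
    # Hypothetical values in ms: forward delay 10, remote processing 2,
    # backward delay 13, remote clock running 500 ms ahead of local.
    offset = 500
    tx_f = 0
    rx_f = tx_f + 10 + offset
    tx_b = rx_f + 2
    rx_b = tx_f + 10 + 2 + 13
    print("TWD:", two_way_delay(tx_f, rx_f, tx_b, rx_b))
    print("naive OWD:", one_way_delay(tx_f, rx_f))
```

With an unsynchronized remote clock the two-way figure stays correct while the naive one-way figure is dominated by the offset, which is why a box may report TWD but decline to report OWD.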
