High Performance Network Monitoring Challenges for Grids
Les Cottrell
Presented at the International Symposium on Grid Computing 2006, Taiwan
www.slac.stanford.edu/grp/scs/net/talk05/iscg-06.ppt
Partially funded by DOE/MICS for Internet End-to-end Performance Monitoring (IEPM)
Why & Outline
• Data intensive sciences (e.g. HEP) need to move large volumes of data worldwide
  – Requires understanding and effective use of fast networks
  – Requires continuous monitoring
• For the HEP LHC-OPN, the focus is on tier 0 and tier 1 sites, i.e. just a few sites
• Outline of talk:
  – What does monitoring provide?
  – Active E2E measurements today and challenges
  – Visualization, forecasting, problem ID
  – Passive monitoring
    • Netflow
    • SNMP
  – Conclusions
Uses of Measurements
• Automated problem identification & troubleshooting:
  – Alerts for network administrators, e.g.
    • Bandwidth changes in time-series, iperf, SNMP
  – Alerts for systems people
    • OS/Host metrics
• Forecasts for Grid Middleware, e.g. replica manager, data placement
• Engineering, planning, SLA (set & verify)
• Also (not addressed here):
  – Security: spot anomalies, intrusion detection
  – Accounting
• Several NRENs, layer 2 & 3
• Level of access an open issue
LHC-OPN: Logical view
• The diagram to the right is a logical representation of the LHC-OPN showing monitoring hosts
• The LHC-OPN extends to just inside the T1 “edge”
• Read/query access should be guaranteed on LHC-OPN “owned” equipment
• We also request read-only (RO) access to devices along the path to enable quick fault isolation
Courtesy: Shawn McKee
Active E2E Monitoring
E.g. Using Active IEPM-BW measurements
• Focus on high performance for a few hosts needing to send data to a small number of collaborator sites, e.g. HEP tiered model
• Makes regular measurements with tools
  – Ping (RTT, connectivity), traceroute
  – pathchirp, ABwE, pathload (packet pair dispersion)
  – iperf (single & multi-stream), thrulay
  – Possibly bbftp, bbcp (file transfer applications)
    • Looking at GridFTP, but it is complex, requiring certificates to be renewed
• Lots of analysis and visualization
• Running at major HEP sites: CERN, SLAC, FNAL, BNL, Caltech to about 40 remote sites
  – http://www.slac.stanford.edu/comp/net/iepm-bw.slac.stanford.edu/slac_wan_bw_tests.html
IEPM-BW Measurement Topology
• 40 target hosts in 13 countries
• Bottlenecks vary from 0.5 Mbits/s to 1 Gbits/s
• Traverse ~50 ASes, 15 major Internet providers
• 5 targets at PoPs, rest at end sites
Ping/traceroute
• Ping still useful (plus ca reste …)
  – Is path connected/node reachable?
  – RTT, jitter, loss
  – Great for low performance links (e.g. Digital Divide), e.g. AMP (NLANR)/PingER (SLAC)
  – Nothing to install, but subject to blocking
• OWAMP/I2 similar but One Way
  – But needs server installed at other end and good timers
  – Being built into IEPM-BW
• Traceroute
  – Needs good visualization (traceanal/SLAC)
  – Little use for dedicated λ layer 1 or 2
  – However still want to know topology of paths
Packet Pair Dispersion
[Figure: a packet pair crossing a bottleneck; the minimum spacing is set at the bottleneck and the spacing is preserved on higher speed links]
• Send packets with known separation
• See how separation changes due to bottleneck
• Can be low network intrusive, e.g. ABwE only 20 packets/direction, also fast < 1 sec
• From PAM paper, pathchirp more accurate than ABwE, but
  – Ten times as long (10s vs 1s)
  – More network traffic (~factor of 10)
• Pathload factor of 10 again more
  – http://www.pam2005.org/PDF/34310310.pdf
• IEPM-BW now supports ABwE, Pathchirp, Pathload
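The packet pair idea can be illustrated with a minimal sketch. It assumes the receiver has already timestamped the arrival spacing (dispersion) of each pair, and it takes the median spacing as a simple defence against cross-traffic noise. The function names and the choice of the median are illustrative assumptions, not how ABwE, pathchirp or pathload actually work.

```python
import statistics


def capacity_from_dispersion_bps(packet_size_bytes, dispersion_s):
    """Bottleneck capacity implied by one packet pair.

    The bottleneck needs packet_size * 8 / capacity seconds to forward the
    second packet, so that time becomes the spacing seen at the receiver.
    """
    if dispersion_s <= 0:
        raise ValueError("dispersion must be positive")
    return packet_size_bytes * 8 / dispersion_s


def estimate_capacity_bps(packet_size_bytes, dispersions_s):
    """Combine several pairs using the median spacing (an assumed, simple filter)."""
    return capacity_from_dispersion_bps(packet_size_bytes,
                                        statistics.median(dispersions_s))


if __name__ == "__main__":
    # Hypothetical spacings in seconds for 1500-byte packets, 20 pairs as in ABwE's budget.
    spacings = [12e-6] * 15 + [18e-6, 25e-6, 11e-6, 12.5e-6, 30e-6]
    estimate = estimate_capacity_bps(1500, spacings)
    print(f"estimated bottleneck capacity: {estimate / 1e6:.0f} Mbits/s")
```

With 1500-byte packets, a spacing of 12 microseconds corresponds to roughly 1 Gbits/s, which is why the timing accuracy of the receiver matters so much at higher speeds.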
BUT …
• Packet pair dispersion relies on accurate timing of inter packet separation
  – At > 1Gbps this is getting beyond resolution of Unix clocks
  – AND 10GE NICs are offloading function
    • Coalescing interrupts, Large Send & Receive Offload, TOE
• Need to work with TOE vendors
  – Turn off offload (Neterion supports multiple channels, can eliminate offload to get more accurate timing in host)
  – Do timing in NICs
  – No standards for interfaces
• Possibly packet trains, e.g. pathneck
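A rough calculation shows why host timing becomes the limit. The sketch below assumes 1500-byte packets and a timestamp granularity of 1 microsecond; both values are assumptions chosen for illustration, and real hosts also add interrupt coalescing and offload effects on top of this.

```python
def serialization_time_s(packet_size_bytes, link_rate_bps):
    """Time to put one packet on the wire, i.e. the minimum back-to-back spacing."""
    return packet_size_bytes * 8 / link_rate_bps


def relative_timing_error(clock_resolution_s, spacing_s):
    """Worst-case fractional error when a spacing is read with a clock of this resolution."""
    return clock_resolution_s / spacing_s


PACKET_BYTES = 1500         # assumed packet size
CLOCK_RESOLUTION_S = 1e-6   # assumed 1 microsecond timestamp granularity

if __name__ == "__main__":
    for rate_bps in (100e6, 1e9, 10e9):
        spacing = serialization_time_s(PACKET_BYTES, rate_bps)
        error = relative_timing_error(CLOCK_RESOLUTION_S, spacing)
        print(f"{rate_bps / 1e9:5.1f} Gbits/s: spacing {spacing * 1e6:7.2f} us, "
              f"timing error up to {error:.0%}")
```

At 10 Gbits/s the spacing is about 1.2 microseconds, comparable to the assumed clock granularity, so a single pair carries almost no usable information.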
Achievable Throughput
• Use TCP or UDP to send as much data as possible, memory to memory, from source to destination
• Tools: iperf (bwctl/I2), netperf, thrulay (from Stas Shalunov/I2), udpmon …
• Pseudo file copy: bbcp and GridFTP also have memory to memory mode
BUT …
• At 10Gbits/s on a transatlantic path, slow start takes over 6 seconds
  – To get 90% of the measurement in congestion avoidance, need to measure for 1 minute (52.5 GBytes at 7Gbits/s, today’s typical performance)
• Needs scheduling to scale, even then …
• It’s not disk-to-disk or application-to-application
  – So use bbcp, bbftp, or GridFTP
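The measurement-length argument is simple arithmetic, sketched below. It assumes slow start lasts 6 seconds and treats the whole test as running at the quoted 7 Gbits/s, so the data volume is an upper bound (slow start itself runs below that rate). The function names are illustrative.

```python
def required_duration_s(slow_start_s, fraction_in_congestion_avoidance):
    """Total test length so that slow start is only the remaining fraction of it."""
    if not 0 <= fraction_in_congestion_avoidance < 1:
        raise ValueError("fraction must be in [0, 1)")
    return slow_start_s / (1 - fraction_in_congestion_avoidance)


def data_volume_gbytes(rate_bps, duration_s):
    """Upper bound on data moved at a constant rate, in GBytes (10**9 bytes)."""
    return rate_bps * duration_s / 8 / 1e9


if __name__ == "__main__":
    duration = required_duration_s(slow_start_s=6, fraction_in_congestion_avoidance=0.9)
    volume = data_volume_gbytes(rate_bps=7e9, duration_s=duration)
    print(f"test length: {duration:.0f} s, data moved: up to {volume:.1f} GBytes")
```

A one-minute test at 7 Gbits/s therefore moves on the order of 50 GBytes per measurement, which is why such tests need scheduling and cannot be run frequently to many sites.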
AND …
• For testbeds such as UltraLight, UltraScienceNet etc. have to reserve the path
  – So the measurement infrastructure needs to add capability to reserve the path (so need API to reservation application)
  – OSCARS from ESnet developing a web services interface (http://www.es.net/oscars/):
    • For lightweight measurements, have a “persistent” capability
    • For more intrusive measurements, must reserve just before making the measurement
Visualization & Forecasting
Examples of real data
[Figures: Caltech thrulay, 0 to 800 Mbps, Nov05 to Mar06; UToronto miperf, 0 to 250 Mbps, Nov05 to Jan06; UTDallas pathchirp, thrulay and iperf, 0 to 120 Mbps, Mar-10-06 to Mar-20-06]
• Misconfigured windows
• New path
• Very noisy
• Seasonal effects
  – Daily & weekly
• Some are seasonal
• Others are not
• Events may affect multiple metrics
• Events can be caused by host or site congestion
• Few route changes result in bandwidth changes (~20%)
• Many significant events are not associated with route changes (~50%)
Changes in network topology (BGP) can result in dramatic changes in performance
[Figure: samples of traceroute trees generated from the remote host table, and a snapshot of the traceroute summary table by hour, showing the route via Los-Nettos (100Mbps)]
[Figure: ABwE measurements, one/minute for 24 hours, Thurs Oct 9 9:00am to Fri Oct 10 9:01am, in Mbits/s, showing dynamic BW capacity (DBC), cross-traffic (XT) and available BW = (DBC-XT). Annotations: drop in performance (from original path SLAC-CENIC-Caltech to SLAC-ESnet-LosNettos (100Mbps)-Caltech, with the ESnet-LosNettos segment (100 Mbits/s) in the path); back to original path; changes detected by IEPM-Iperf and ABwE]
Notes:
1. Caltech misrouted via Los-Nettos 100Mbps commercial net 14:00-17:00
2. ESnet/GEANT working on routes from 2:00 to 14:00
3. A previous occurrence went un-noticed for 2 months
4. Next step is to auto detect and notify
Forecasting
• Over-provisioned paths should have pretty flat time series
• But seasonal trends (diurnal, weekly) need to be accounted for on about 10% of our paths
• Use Holt-Winters triple exponential weighted moving averages
  – Short/local term smoothing
  – Long term linear trends
  – Seasonal smoothing
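The three Holt-Winters components listed above map onto three smoothing parameters. The following is a minimal sketch of the additive form, not the IEPM-BW implementation; the initialisation from the first two seasons and the parameter names (alpha for the local level, beta for the linear trend, gamma for the seasonal terms) are assumptions for illustration.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, n_forecast):
    """Additive triple exponential smoothing.

    Returns (fitted, forecast): one-step-ahead predictions for each observed
    point and n_forecast predictions beyond the end of the series.
    """
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season.
    level = sum(series[:season_len]) / season_len
    # Initial trend: average per-step change between the first two seasons.
    second_mean = sum(series[season_len:2 * season_len]) / season_len
    trend = (second_mean - level) / season_len
    # Initial seasonal terms: deviation of each point from the first-season mean.
    seasonal = [series[i] - level for i in range(season_len)]

    fitted = []
    for t, y in enumerate(series):
        s = seasonal[t % season_len]
        fitted.append(level + trend + s)  # prediction made before seeing y
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)    # short/local term
        trend = beta * (level - prev_level) + (1 - beta) * trend   # long term linear trend
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s  # seasonal

    n = len(series)
    forecast = [level + h * trend + seasonal[(n + h - 1) % season_len]
                for h in range(1, n_forecast + 1)]
    return fitted, forecast


if __name__ == "__main__":
    # Hypothetical throughput samples (Mbits/s) with a repeating 4-sample pattern.
    samples = [10.0, 20.0, 30.0, 20.0] * 6
    _, predicted = holt_winters_additive(samples, season_len=4,
                                         alpha=0.5, beta=0.5, gamma=0.5, n_forecast=4)
    print(predicted)
```

A measurement that falls far from the one-step-ahead prediction is a natural candidate for an event, which is one way forecasting can feed problem identification.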
Alerting
• Have false positives down to a reasonable level, so sending alerts
• Experimental
• Typically a few per week
• Currently by email to network admins
  – Adding pointers to extra information to assist admin in further diagnosing the problem, including:
    • Traceroutes, monitoring host parameters, time series for RTT, pathchirp, thrulay etc.
• Plan to add on-demand measurements (excited about perfSONAR)
In progress
• Integrate IEPM-BW and PingER measurements with MonALISA to provide additional access
• Working to make traceanal a callable module
  – Integrating with AMP
• When comfortable with forecasting, event detection will generalize
• Looking at ARMA/ARIMA for forecasting
Passive - Netflow