
EGEE Asia Pacific Regional Operation Center - Min-Hong Tsai, ASGC



  1. Enabling Grids for E-sciencE: EGEE Asia Pacific Regional Operation Center
     Min-Hong Tsai, ASGC
     ISGC 2007, March 29, Taipei
     http://www.eu-egee.org/
     http://www.twgrid.org/aproc/
     EGEE-II INFSO-RI-031688

  2. Agenda
     • APROC Introduction
     • Status
     • Joining EGEE

  3. APROC Introduction I
     • APROC Mission
       – Provide deployment support facilitating Grid expansion
       – Maximize the availability of Grid services
     • Supports EGEE sites in Asia Pacific since April 2005
       – 20 production sites, 8 countries
       – 9 sites joined EGEE since last ISGC: recently HKU, KISTI
       – 3 sites in certification process
         ▪ Philippines: Advanced Science and Technology Institute
         ▪ Korea: KONKUK
         ▪ Mongolia: (MAS IPT) Mongolian Academy of Sciences

  4. APROC Services
     • Site Deployment Support
       – Registration
       – Installation
       – Certification
     • Operations Support
       – Monitoring, troubleshooting
       – Problem tracking
       – Software updates and security coordination
       – Regional VO services: VOMS and LFC
     • ASGCCA CA Service
       – Provides certificates for AP EGEE/LCG sites without a domestic CA
     • EGEE Operations
       – CIC-on-duty: EGEE global operations
       – Monitoring tool development: GStat and GGUS Search
       – TPM: Front line user support (Q4 2006)
       – OSCT: Incident Response duty (Dec 2006)

  5. APROC Usage
     • New active VOs: Belle and TWGrid
     • This year: 200 KSI2K years
     • Last year: 41 KSI2K years

  6. APROC Availability
     • Daily snapshots of SAM results of region
       – Availability increased to 70-80% range from 60-70% a half year ago
     • CT mostly replica management failure
       – Sensitive to Information System access/performance
       – Request that data management clients can failover to secondary BDII
     • Network Issues
       – Often the root cause of CT, JL and JS
       – Network congested site set up local top-level BDII
         ▪ Increase default update timeout and breathe time
     [Chart: share of sites per status (SD, CT, JL, JS, ER, OK), 0-100%, 2005-04 to 2007-01; annotations on the chart: JS from LHC OPN, Remove SSH, Hardware Failure, Slow BDII, upgrade, 2.4, 2.6, 2.7, 3.0]
     [Chart: reliability and availability, 0-100, 2005-04 to 2007-02]
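The failover request on the availability slide can be illustrated with a short sketch. This is only an assumption about how a data management client might behave, not the behaviour of any existing EGEE client: the function name `query_with_failover`, the host names and the `fake_query` stand-in are all hypothetical. The idea is to try each top-level BDII in order and move to the next one when a query times out or the connection fails.

```python
from typing import Callable, List, Sequence, Tuple


class AllEndpointsFailed(Exception):
    """Raised when none of the configured endpoints answered."""


def query_with_failover(
    endpoints: Sequence[str],
    query: Callable[[str, float], List[str]],
    timeout: float = 15.0,
) -> Tuple[str, List[str]]:
    """Try each endpoint in order and return (endpoint_used, result).

    `query` is any callable that contacts one endpoint and raises OSError
    (which includes TimeoutError) when the endpoint cannot be reached.
    """
    failures = []
    for endpoint in endpoints:
        try:
            return endpoint, query(endpoint, timeout)
        except OSError as exc:
            # Remember why this endpoint failed, then fall through to the next.
            failures.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(failures))


# Hypothetical stand-in for a real information system query: the primary
# never answers, the secondary does.
def fake_query(endpoint: str, timeout: float) -> List[str]:
    if endpoint.startswith("bdii-primary"):
        raise TimeoutError(f"no answer within {timeout:.0f}s")
    return [f"entry served by {endpoint}"]


# Placeholder host names, not real services.
BDII_ENDPOINTS = ["bdii-primary.example.org", "bdii-secondary.example.org"]
used, entries = query_with_failover(BDII_ENDPOINTS, fake_query, timeout=5.0)
print(used, entries)
```

Keeping the transport behind the `query` callable means the ordering and error handling can be exercised without a network; a real client would pass a function that performs the actual lookup.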

  7. Monitoring and Notification
     • Planned integration of Asset DB
     • Nagios plugins developed
       ▪ CE
       ▪ LFC
       ▪ VOMS
       ▪ Storage
       ▪ IT services
       ▪ OS
     • Notification via Email
       – SMS transmission device currently being tested

  8. Nagios Regional Monitoring
     • Tests run at faster frequency
       – 5-10 minutes
       – Faster response to faults
     • Add customized plugins
       – Run low level tests for faster isolation of problems
       – Tests may not be available in global monitoring tools yet
       – Ability to run tests on the target host via NRPE
     • Management Interface
       – Acknowledgement
       – On demand execution of tests
       – Historical availability
       – Test dependencies
     http://lists.grid.sinica.edu.tw/apwiki/Nagios_monitoring_-_APROC_sites
     http://lists.grid.sinica.edu.tw/apwiki/Nagios_Plugins_Description
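As an illustration of the kind of low level test a customized plugin could run, here is a minimal sketch of a Nagios-style check. It is not one of the APROC plugins; the names `check_tcp`, `classify` and `main`, the thresholds and the example host are assumptions made for this sketch. It relies only on the standard Nagios plugin convention: one line of status output and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```python
import socket
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed, warn, crit):
    """Map a connect time in seconds to a Nagios status code."""
    if elapsed >= crit:
        return CRITICAL
    if elapsed >= warn:
        return WARNING
    return OK


def check_tcp(host, port, warn=1.0, crit=5.0, connect=socket.create_connection):
    """Open a TCP connection to host:port and report how long it took."""
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout=crit)
    except OSError as exc:
        return CRITICAL, f"TCP CRITICAL - {host}:{port} unreachable ({exc})"
    conn.close()
    elapsed = time.monotonic() - start
    status = classify(elapsed, warn, crit)
    # Text after the '|' is performance data, which Nagios can graph.
    return status, (
        f"TCP {LABELS[status]} - {host}:{port} answered in "
        f"{elapsed:.3f}s|time={elapsed:.3f}s"
    )


def main(argv):
    """Entry point: argv is [host, port]; returns the plugin exit code."""
    if len(argv) != 2:
        print("TCP UNKNOWN - usage: check_tcp_sketch HOST PORT")
        return UNKNOWN
    try:
        port = int(argv[1])
    except ValueError:
        print(f"TCP UNKNOWN - port must be an integer, got {argv[1]!r}")
        return UNKNOWN
    status, message = check_tcp(argv[0], port)
    print(message)
    return status


# A script wrapper would end with: sys.exit(main(sys.argv[1:]))
```

Because the exit code carries the result, the same script could be run centrally or on the target host through NRPE without changes.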

  9. Plans
     • Increase monitoring coverage
       – Information System
       – Network performance monitoring
         ▪ Available/achievable bandwidth
         ▪ Full mesh monitoring
     • Improve troubleshooting tools
       – http://lists.grid.sinica.edu.tw/apwiki/APROC/Troubleshooting_Guides
       – FAQ system
       – Service diagnostic scripts
     • Integration of ticketing system with GGUS
     • Training
       – EGEE Induction at GridAsia 2007, June 5, 2007, Singapore

  10. Joining EGEE Infrastructure
     • Contact APROC
     • If a domestic CA is not available
       – Register as an ASGCCA RA during ISGC
     • Dedicate an administrator with Unix experience
     • Allocate servers
       – 5: UI, CE, WN, DPM, MON
       – 3: CE/WN, MON, DPM
         ▪ UI can be installed in a user account
         ▪ Consider a Virtual Machine for MON
     • Study the user guide and installation manual
     • Send the configuration file to APROC for review before deployment
     • Complete the registration and certification process

  11. Long Term Operations
     • Establish domestic CA if none exists
     • Increase availability and resource levels
     • Establish domestic operations structure
       – Operations procedures
       – Tools: monitoring and notification, ticketing system
       – User and administrator support
     • Training for administrators and users
     • Collaborate with APROC in regional operations
     • Q: Need for regional experimental Grid?

  12. Issues in Asia Pacific
     • No regional projects to promote collaboration in EGEE
     • Network bandwidth
       – Low capacity: regional and last mile
       – Usage based billing
     • Need for training
       – Training for trainers
       – Application training
       – E-Learning material
     • However, EGEE already provides
       – Middleware development and integration
       – Operations structure, coordination and support
       – Close to 200 user communities

  13. Summary
     • APROC provides EGEE operations support services to Asia Pacific
     • EGEE in the region has grown to 20 sites, with utilization of 200 KSI2K years
     • We have also improved availability, but there is still significant room for improvement
     • We look forward to more sites joining EGEE in the region and the possibility of further collaboration
       – Applications
       – Operations
     • We welcome feedback on what we can improve

  14. Thank You for Your Attention!
     • Questions?
       – roc@lists.grid.sinica.edu.tw
       – http://www.twgrid.org/aproc/
     • Thanks to efforts from:
       – T1/APROC Team
         ▪ Jason Shih, Dave Wei
         ▪ Felix Lee, Joanna Huang
         ▪ Aries Hong, Hung-Che Jen
         ▪ Jinny Chien, Shu-Ting Liao
         ▪ Yi-Ping Wu, Min Tsai
