
GlideinWMS Marco Mambelli Stakeholders Meeting May 11, 2018



  1. GlideinWMS Marco Mambelli Stakeholders Meeting May 11, 2018

  2. Overview
     • Releases since last stakeholders meeting
     • Upcoming releases
     • Current focus
     • GlideinWMS roadmap
     • Reference slides
       – GlideinWMS Architecture
       – Quick Facts
       – Releases since last stakeholders meeting
     Marco Mambelli | GlideinWMS - Stakeholders Meeting 05/11/2018

  3. Releases Since Last Stakeholders Meeting
     • v3_2_22 released on April 10
       – Bug Fix: Incorrect behavior of Singularity
       – Bug Fix: proxy-renewal-script updates and bug fixes
       – Bug Fix: Protection against malformed Frontend messages and hardening of forked processes
     • v3_2_22_1 and v3_2_22_2 followed shortly after, on April 11 and 17, to adapt to the new Singularity 2.4.6 requirement and because I made only a partial fix while rushing v3_2_22_1
       – Fixes to the proxy-renewal-script (OSG contributed) were also added
     • v3_3_3 (Development series) released on April 17
       – Includes all features and bug fixes released in v3_2_22_2
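The releases above are written both as tags (v3_2_22_2) and as dotted versions (3.2.22.1). A minimal sketch of how the two spellings could be normalized and ordered is shown here; `parse_release_tag` is a hypothetical helper written for illustration and is not part of GlideinWMS.

```python
def parse_release_tag(tag):
    """Turn 'v3_2_22_2' or '3.2.22.2' into a tuple of ints, e.g. (3, 2, 22, 2).

    Hypothetical helper: assumes a tag is an optional leading 'v' followed by
    integers separated by underscores or dots.
    """
    cleaned = tag.strip().lstrip("vV")
    parts = cleaned.replace("_", ".").split(".")
    return tuple(int(part) for part in parts)


# Release tags mentioned on this slide, deliberately out of order.
tags = ["v3_3_3", "v3_2_22", "v3_2_22_2", "v3_2_22_1"]

# Tuples compare element by element, so sorting by the parsed tuple
# gives release order.
ordered = sorted(tags, key=parse_release_tag)
print(ordered)
```

Comparing tuples rather than strings avoids the usual pitfall where "3.2.9" would sort after "3.2.22" alphabetically.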

  4. Next Planned Release
     • v3_4 planned for May 24
       – Merging of production and development branches (v3.2 and v3.3); this will bring Google CE support and the policy plugin to the production version
       – Code modernization to Python 2.7 (and 2.6) standards
       – Increase the number and coverage of the unit tests
     • 10k lines of code changed
     • Doubled unit test coverage
     • More than doubled the number of tests
     [Chart: Tickets per release. Series: Features, Bug fix, Other, Total. Releases: 3.4, 3.2.22.2, 3.2.21. Vertical axis from 0 to 30.]

  5. Next Planned Release (cont.)
     • v3_4 planned for May 24
       – Glidein lifetime no longer based on the length of the proxy
       – Internal support of condor_switchboard (discontinued by HTCondor)
       – New option to kill glideins when job requests decrease
       – Estimate in advance the cores provided to glideins that discover cores automatically
       – Add entry monitoring breakdown for metasites
       – Review Factory and Frontend tools, especially glidein_off and manual_glidein_submit.py
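The option to kill glideins when job requests decrease can be pictured with a small sketch. This is not the GlideinWMS algorithm: the function name, its parameters, and the policy of removing not-yet-started glideins first are all assumptions made for illustration.

```python
import math


def glideins_to_remove(requested_cores, cores_per_glidein, pending, running):
    """Return (pending_to_remove, running_to_remove) for the current demand.

    Hypothetical policy: keep just enough glideins to cover the requested
    cores, and remove pending (not yet started) glideins before running ones,
    since removing a pending glidein cannot interrupt any job.
    """
    if cores_per_glidein <= 0:
        raise ValueError("cores_per_glidein must be positive")
    needed = math.ceil(requested_cores / cores_per_glidein)
    excess = max(0, pending + running - needed)
    remove_pending = min(pending, excess)
    remove_running = excess - remove_pending
    return remove_pending, remove_running


# Demand dropped to 16 cores, with 8-core glideins: 2 glideins are enough,
# so all 5 pending glideins can go and the 2 running ones stay.
print(glideins_to_remove(16, 8, pending=5, running=2))
```

A real implementation would also have to decide which running glideins are safe to stop; the sketch only counts them.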

  6. GlideinWMS: Current Focus
     • Improve stability
       – More automated testing & CI (pylint, pythoscope, futurize, unittest …) is an ongoing focus
       – Developer’s test infrastructure to connect to Factory ITB services for scale testing
       – External contributions should be production ready
     • Minimize wastage of resources from over-provisioning
       – Consider site topology
       – AUTO estimate
       – Actively follow the requests and adapt as the request goes down
       – Solution addressed in phases
         • First phase of the solution is available in v3.2.21, next in 3.4
         • Consider “transactional provisioning”
     • Containerization
       – Singularity support changes
     • Security
       – Adapt to sites with tighter security restrictions
     • Support for shorter proxy lifetime
       – Impacts how we determine the lifetime of a glidein
       – Successful test w/ FIFE

  7. GlideinWMS Roadmap
     • Medium term (2018 – mid 2019)
       – Keep up with the scalability requirements
         • Investigate and incorporate new technologies like pandas dataframes, numpy, etc.
       – Outsource GlideinWMS functionality to HTCondor
         • Work with the HTCondor team to provide some of the frontend functionality natively through HTCondor
           – Leaner & modular Frontend
         • Adapt to changes/introduction of Acquisition Engine by HTCondor
           – Dependent on the work that will be done in HTCondor in future
         • Very thin GlideinWMS factory
       – Support for new HPC sites with stricter policies (e.g. no outbound connection except gateways, MFA)
         • Depends on support from HTCondor. Discussion with the HTCondor team next week.
       – Monitoring Modernization
         • Retire GlideinWMS monitoring pages
         • Move to a Grafana/Graphite/Elasticsearch-based solution

  8. GlideinWMS Roadmap
     • Long term (> mid-2019)
       – Moving to Decision Engine (DE)
         • Replace frontend with the Decision Engine
       – Make Glidein as a service capable of talking to multiple WMS middleware/frameworks

  9. Questions/Comments

  10. Reference Slides

  11. GlideinWMS
      NOTE:
      • Frontend can talk to multiple factories
      • Factory can serve multiple frontends
      [Figure: GlideinWMS architecture diagram. Labels: condor submit; HTCondor Schedulers; HTCondor Central Manager; VO Frontend; GlideinWMS Factory; HTCondor-G; Grid Site; HTCondor CE; Super Computers (via BOSCO); Clouds (AWS/OpenStack/OpenNebula); Glidein; HTCondor Startd; Pull Job; Job; Virtual Machine; WN/VM. Year labels on the diagram: 2006, 2012, 2014, 2014.]

  12. GlideinWMS: Quick Facts
      • GlideinWMS is an open-source product (http://tinyurl.com/glideinWMS)
      • Heavy reliance on HTCondor (UW Madison) and we work closely with them
      • Effort:

        Role                   Resources                                  Effort (FTE)
        Project Mgmt/Lead      Parag Mhashilkar (0.15 USCMS)              0.15
        Development & Support  Parag Mhashilkar (0.20 SCD)                2.45
                               Marco Mambelli (1 SCD)
                               Dennis Box (0.75 SCD)
                               Marco Mascheroni (0.5 CMS - Contractor)
        TOTAL                                                             2.60

        Table: Current Resources & Roles
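The effort figures in the table above can be cross-checked with a few lines of arithmetic. The sketch reads the 2.45 FTE figure as the combined Development and Support effort, which is an interpretation of the table layout; the variable names are illustrative only.

```python
from decimal import Decimal

# Per-person fractions as listed in the Resources column.
project_lead = [Decimal("0.15")]              # Parag Mhashilkar (USCMS)
development_and_support = [
    Decimal("0.20"),                          # Parag Mhashilkar (SCD)
    Decimal("1"),                             # Marco Mambelli (SCD)
    Decimal("0.75"),                          # Dennis Box (SCD)
    Decimal("0.5"),                           # Marco Mascheroni (CMS - Contractor)
]

lead_fte = sum(project_lead)
dev_support_fte = sum(development_and_support)
total_fte = lead_fte + dev_support_fte

# Decimal keeps the sums exact, so they can be compared to the table values.
print(lead_fte, dev_support_fte, total_fte)
```

Under this reading the individual fractions reproduce both the 2.45 subtotal and the 2.60 total shown in the table.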

  13. Quick Facts: Releases & Support Structure
      • Releases
        – Issues tracked in redmine issue tracker
          • https://cdcvs.fnal.gov/redmine/projects/glideinwms/issues
          • Categorized and prioritized based on impact, urgency and requester
        – Issues are now associated with respective stakeholders
          • Issues are assigned based on developer’s expertise and other workload
          • Roadmap for upcoming releases available in redmine (See reference slides)
        – SCM
          • All releases are version controlled and tagged
          • http://glideinwms.fnal.gov/doc.prd/download.html
        – Release notes & history
          • http://glideinwms.fnal.gov/doc.prd/history.html
      • Support
        – Entire development team is responsible for support

  14. Quick Facts: Project Status & Communication Channels
      • Project meeting: Wednesdays 10 – 11 am
        – Technical discussions & status updates
        – Regular stakeholder participation
        – Contact Parag Mhashilkar if you need an invite for this meeting
      • Quarterly Stakeholders Meeting
      • Project Management
        – Project Status reported monthly at CS Project status meetings

        Area of Interest        Mailing Lists
        Support                 glideinwms-support@fnal.gov
        Stakeholders            glideinwms-stakeholders@fnal.gov
        Release Announcements   glideinwms-support@fnal.gov
                                cms-dct-wms@fnal.gov
                                glideinwms-stakeholders@fnal.gov
        Future Release plans    See next slide
        Discussions             glideinwms-discuss@fnal.gov
        Code commits            glideinwms-commit@fnal.gov
        Twitter                 Tag: @glideinwms

  15. Tracking Releases in Redmine
      1. Visit the redmine issues tab for GlideinWMS or the URL (default tabs not too useful)
      2. Click custom query for stakeholder or version roadmap
