FJPPL Computing Workshop: Operational experience with second machine room at CC-IN2P3

  1. Centre de Calcul de l’Institut National de Physique Nucléaire et de Physique des Particules. FJPPL Computing Workshop: Operational experience with second machine room at CC-IN2P3. Xavier Canehan

  2. Introduction
     • 2 computing rooms at CC-IN2P3 since 2011
     • Critical design choices led to substantial advantages
     • Adaptability remains mandatory
     • Monitoring and testing, even the building
     • Drawbacks
     FJPPL Computing Workshop – Operational Experience – Xavier Canehan, 10/03/2015

  3. 2 computing rooms: square feet and high power consumption

  4. Vil-2 initial objectives
     • +10 year perspective
     • Hot water
     • Modernity
     • Modularity
     • Ease of deployment
     • Multi-Tier architecture

  5. High-quality results of initial conception
     • Modern computing room (details shown during visit)
     • Multi-tier by design
     • 3-phase deployment
       ◦ first one dedicated to the computing farm
       ◦ relying upon regular budget
     Initial plan for 2011 - 2019 (target):
       Year     2011      2015       2019
       Racks    50        125        240
       Power    0.6 MW    1.5 MW     3.6 MW
       Tier     Tier II   Tier III   Tier III-IV
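
The phase targets on this slide imply an average power density per rack. The short Python sketch below only divides the stated power by the stated rack count for each phase; the name `PHASES` and the helper `kw_per_rack` are illustrative and not part of the original talk.

```python
# Phase targets as stated on the slide: year -> (racks, IT power in kW).
PHASES = {
    2011: (50, 600),
    2015: (125, 1500),
    2019: (240, 3600),
}

def kw_per_rack(racks: int, power_kw: float) -> float:
    """Average power per rack implied by a phase target."""
    if racks <= 0:
        raise ValueError("rack count must be positive")
    return power_kw / racks

for year, (racks, power_kw) in sorted(PHASES.items()):
    print(f"{year}: {kw_per_rack(racks, power_kw):.1f} kW per rack on average")
```

Under these figures the first two phases work out to 12 kW per rack and the last to 15 kW per rack, as an average only; individual racks may differ.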

  6. First phase: Tier II over 2 lines
     • 28 InRow Cooling units, 18 – 20 kW each
     • One 2 MVA UPS chain of 4 × 500 kVA UPS
     • 2 transformers of 1600 kVA
     • 3 chilling units for 2.4 MW, only one distribution circuit
     • Backup through a 24 m³ water tank
     • 2 power lines: dedicated main up to 9 MW, 2 MW reservation on backup line
     • ⅓ floor space used at each level

  7. Site mean power usage, PUE advantage to Vil-2
     • Site power consumption around 1.1 MW
     [Bar chart: kW IT and kW Total for Vil-1 and Vil-2]
     Mean power usage, IT against total (kW):
       Room    kW IT   kW Total     PUE
       Vil-1   320     720 (-130)   1.84
       Vil-2   300     440          1.46
     • Best PUE in Vil-2
     • Moving from Vil-1 to Vil-2 gains ~20% of power cost
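
PUE is total room power divided by IT power. The sketch below recomputes the two values from the slide's figures. It assumes that the "(-130)" next to the Vil-1 total is subtracted before dividing, since (720 - 130) / 320 matches the stated 1.84; the slide does not say what those 130 kW are, and `adjust_kw` is a hypothetical field name.

```python
def pue(total_kw: float, it_kw: float) -> float:
    """Power Usage Effectiveness: total room power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_kw / it_kw

# Figures from the slide; "adjust_kw" is a hypothetical name for the "(-130)" note.
ROOMS = {
    "Vil-1": {"it_kw": 320, "total_kw": 720, "adjust_kw": -130},
    "Vil-2": {"it_kw": 300, "total_kw": 440, "adjust_kw": 0},
}

def room_pue(name: str) -> float:
    """PUE of one room, applying the assumed adjustment to the total."""
    room = ROOMS[name]
    return pue(room["total_kw"] + room["adjust_kw"], room["it_kw"])

for name in ROOMS:
    print(f"{name}: PUE = {room_pue(name):.3f}")
```

For Vil-2 the ratio 440 / 300 is about 1.467, which the slide reports as 1.46.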

  8. Entry ticket cost vs fully functional Vil-1
     • PUE is not linear: it evolves in steps and its intercept is not zero
       ◦ Besides investment costs
       ◦ Operational cost of electrical infrastructure must be taken into account
     • e.g. 1 UPS consumes up to 3 kW
     • Other investment costs
       ◦ Water-cooled racks value
       ◦ PDU and rack power protection costs
     • Vil-1 is fully redundant and handles hygrometry
     • Vil-2 initial target was deliberately limited
     No point in ditching Vil-1

  9. Moving grounds: adaptation remains mandatory

  10. Have to evolve
      • Environment changes
      • Needs and perspectives clarify and evolve
      • IT technology is volatile
      • Infrastructure pace stays slower
      • Monitoring everything

  11. Environment – IT densification effects upon racks
      • Floor space is no longer a problem
      • 14 Dell PowerEdge C6200 per rack will
        ◦ approach InRow Cooling unit capacity (18 kW)
        ◦ need 45 sockets, 3-phase PDU
        ◦ (dedicated PDU development)
      • Partially filled racks to limit constraints
      [Diagram labels: IT densification; kW/m² increase; power sockets # per rack; at rack limits]

  12. Environment – Power costs evolution
      • 5% to 7% annual power cost increase
      • Looking for IT efficiency
      • Fine-tuned power contract helps to minimize costs
      [Chart series: annual power consumption; annual power cost, without tax; kWh cost, in € cents]
      Seek the most efficient IT hardware in the most efficient room
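
To give a feel for what a 5% to 7% annual increase means over a few years, here is a minimal compounding sketch. It assumes the rate applies to the yearly power bill and compounds once per year, which the slide does not state; the base value of 100 is an arbitrary index, not a real cost, and `projected_cost` is an illustrative helper.

```python
def projected_cost(base: float, annual_rate: float, years: int) -> float:
    """Cost after compounding a fixed annual increase for a number of years."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base * (1.0 + annual_rate) ** years

BASE_INDEX = 100.0  # arbitrary index for the starting year, not a real cost

for rate in (0.05, 0.07):
    after_four = projected_cost(BASE_INDEX, rate, 4)
    print(f"{rate:.0%} per year over 4 years: index {after_four:.1f}")
```

With these assumptions, four years of increases raise the bill by roughly 22% to 31% at constant consumption.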

  13. Power efficiency, ASHRAE recommendations
      • Current hardware tolerates 35 °C all year long
      • Increasing room temperature lowers overall power consumption
        ◦ 25 °C server entry setpoint
        ◦ -40% fan activity for cooling units
        ◦ 11 kW gain
      • Less noise: 95-110 dB down to 85-95 dB
      • Hot corridor temperature also increased: 36 °C-40 °C
      What will the next setpoint be?

  14. Planning to cope with scientific needs in the foreseeable future
      What if all new hardware goes into Vil-2?
      [Charts: estimated number of racks and estimated IT power (kW), 2014 to 2019]
      • Estimations from LHC and LSST/Euclid/CTA figures
      • Data modified with current densification factor
      Need Vil-2 adaptation to host storage systems

  15. Initial plans revision
      • Previous power and cooling distribution systems implied evolution by pairs of hot corridors
        Infrastructure cost for each new lane: ~1.5 M€
      • Minimizing costs by reusing existing infrastructure hardware
        Adapting the existing, least used Tier II lane; 2-phase plan, less than 350 k€ per year
      • 2017 aim:
        ◦ 80 racks, 1 MW IT
        ◦ 1 lane Tier II and 1 lane Tier III

  16. Introducing Tier III – Phase 1, power redundancy
      [Floor plan; labels: Hot Aisle A/B, Hot Aisle C/D, TIER II ext., TIER III, used, 2015 extension]
      Details to be seen during visit

  17. Drawbacks and limits: from details to major drawbacks

  18. Infrastructure limits are easily forgotten
      • Coping with interdependent limits
        ◦ Cooling capacity or cooling redundancy
        ◦ Power capacity and power redundancy
        ◦ Per rack, group of racks, aisle, distribution line
      • Multi-tier ability adds an order of complexity
        ◦ Even more if you mix tiers in a single line
      • Strict deployment plans needed
      Monitoring is mandatory

  19. Dealing with a cooling failure
      • Stopping InRow cooling units for 20 minutes → +15 °C
      • Raises corridor temperature to around 49 °C
      • Raises front temperature to 43 °C
      Need an efficient shutdown system
      Our water tank provides a 20 min delay
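
The +15 °C in 20 minutes figure gives a mean heating rate, which can be turned into a rough time budget before a temperature limit is reached. The sketch below assumes the rise is linear, which the slide does not claim. The 28 °C starting point (43 °C minus 15 °C) and the 35 °C limit (the hardware tolerance quoted on slide 13) are illustrative inputs, and `minutes_until` is a hypothetical helper.

```python
RISE_C = 15.0        # temperature rise reported on the slide
DURATION_MIN = 20.0  # duration of the cooling stop reported on the slide

def mean_rise_rate(rise_c: float, duration_min: float) -> float:
    """Mean heating rate in degrees Celsius per minute."""
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    return rise_c / duration_min

def minutes_until(limit_c: float, start_c: float, rate_c_per_min: float) -> float:
    """Minutes before limit_c is reached, assuming a linear rise (an assumption)."""
    if rate_c_per_min <= 0:
        raise ValueError("rate must be positive")
    return max(0.0, (limit_c - start_c) / rate_c_per_min)

rate = mean_rise_rate(RISE_C, DURATION_MIN)
# Illustrative inputs: 28 C start (43 C - 15 C), 35 C limit from slide 13.
budget = minutes_until(35.0, 28.0, rate)
print(f"mean rate: {rate:.2f} C/min, time to 35 C from 28 C: {budget:.1f} min")
```

Under these assumptions the front of the racks would pass 35 °C well before the 20 minutes are over, which is consistent with the slide's call for an efficient shutdown system.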

  20. Smart versus dumb shutdown systems
      • Smart shutdown relies upon IPMI
        ◦ detects a slow continuous slope
        ◦ or a fast temperature change
      • IPMI needs the network
      • If network switches shut down before servers, IPMI is useless
      • Need a dumb backup power cut system
      We won’t repeat a bad experience with a water leak: -20 °C on the roof, +65 °C in the hot corridor
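
The two triggers named on this slide, a slow continuous slope and a fast temperature change, can be sketched as a pure function over temperature samples. This is not the CC-IN2P3 implementation: the thresholds, the window length and the function name `shutdown_trigger` are all hypothetical, and no IPMI call is made; the samples are assumed to have been collected already.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (time in minutes, temperature in Celsius)

def shutdown_trigger(
    samples: List[Sample],
    fast_rate: float = 3.0,     # hypothetical threshold, C per minute
    slow_rate: float = 0.2,     # hypothetical threshold, C per minute
    slow_window: float = 10.0,  # hypothetical window, minutes
) -> Optional[str]:
    """Return "fast", "slow" or None for a time-ordered list of samples."""
    # Fast change: any pair of consecutive samples rising at fast_rate or more.
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 > t0 and (c1 - c0) / (t1 - t0) >= fast_rate:
            return "fast"
    if not samples:
        return None
    # Slow continuous slope: within the last slow_window minutes the temperature
    # rises at every step and the mean slope reaches slow_rate.
    last_time = samples[-1][0]
    recent = [s for s in samples if s[0] >= last_time - slow_window]
    if len(recent) >= 3:
        always_rising = all(b[1] > a[1] for a, b in zip(recent, recent[1:]))
        span = recent[-1][0] - recent[0][0]
        if always_rising and span > 0:
            if (recent[-1][1] - recent[0][1]) / span >= slow_rate:
                return "slow"
    return None

steady = [(float(i), 25.0) for i in range(12)]
drifting = [(float(i), 25.0 + 0.3 * i) for i in range(12)]
spike = [(0.0, 25.0), (1.0, 25.1), (2.0, 30.0)]
print(shutdown_trigger(steady), shutdown_trigger(drifting), shutdown_trigger(spike))
```

Such logic only helps while the samples keep arriving, which is the slide's point: if the network goes down first, a dumb power cut has to take over.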

  21. Cooling technology: hot water choice
      Improving global campus Energy Reuse Efficiency?
      • Land procurement
        ◦ Agreement to provide hot water on campus
      • Hot water need
        ◦ Very efficient chillers, allowing reuse of hot water
        ◦ Silent hardware
      • Fixed technology
        ◦ Campus is late
        ◦ 3 years spent

  22. No humans: no window, no faucet?

  23. Cooling technology: improving efficiency
      • Won’t ever be able to use direct Free Cooling
      • But new cooling technologies are available
      [Diagram labels: new cooling technologies; change IT procurement]

  24. Very valuable modularity outcomes
      • Ceiling rails
      • Preset pipes
      • Movable separation wall between used and free space
      • Roof as technical level

  25. Questions? Thank you!
