Physics Computing at CERN
Helge Meinhard, CERN IT Department
OpenLab Student Lecture, 21 July 2011
Location (1)
- Building 513 (opposite restaurant no. 2)
Building 513 (1)
- Large building with 2700 m² of floor space for computing equipment
- Capacity for 2.9 MW electricity and 2.9 MW air and water cooling
- Chillers and transformers
Building 513 (2) – Ongoing Work
Reasons for the upgrade:
- Cooling
  - Insufficient cooling for the critical UPS room
  - CC not cooled when running on UPS without diesel
  - Insufficient cooling when running on diesel: pumps and ventilation units running, but no chiller and insufficient stored cold water
- Power
  - Insufficient critical power available
  - No redundancy for critical UPS (> 240 kW) – currently running at 340 kW
  - No redundancy for physics UPS (> 2.7 MW) – aiming to run at 2.9 MW by end of year
- Other
  - Limited fire protection in B513
  - Critical areas and physics areas strongly coupled: they share the same locations, cooling infrastructure and fire risks
Building 513 (3) – Ongoing Work
Scope of the upgrade:
- Dedicated cooling infrastructure for critical equipment (decoupled from physics)
- New building for the cooling system (construction underway in front of the building)
- New dedicated room for critical equipment, new electrical rooms and critical ventilation systems in the 'Barn'
- Critical equipment that cannot be moved to the new rooms (networking area and telecoms rooms) to get new dedicated cooling
- Increase in critical UPS power to 600 kW (with a new critical UPS room) and overall power to 3.5 MW
- Restore N+1 redundancy for all UPS systems
Building 513 (4) – Ongoing Work
Location (2)
- Building 613: small machine room for tape libraries (about 200 m from building 513)
- Hosting centre about 15 km from CERN: 35 m², about 100 kW, critical equipment
Computing Service Categories
Two coarse-grained computing categories:
- Computing infrastructure and administrative computing
- Physics data flow and data processing
Task overview
- Communication tools: mail, Web, Twiki, GSM, …
- Productivity tools: office software, software development, compilers, visualization tools, engineering software, …
- Computing capacity: CPU processing, data repositories, personal storage, software repositories, metadata repositories, …
Needs underlying infrastructure:
- Network and telecom equipment
- Processing, storage and database computing equipment
- Management and monitoring software
- Maintenance and operations
- Authentication and security
CERN CC currently (June 2011)
Data Centre Operations (Tier 0):
- 24x7 operator support and system administration services to support 24x7 operation of all IT services
- Hardware installation & retirement: ~7,000 hardware movements/year; ~1,800 disk failures/year
- Management and automation framework for large-scale Linux clusters
Infrastructure Services
- Software environment and productivity tools
- User registration and authentication: 22'000 registered users
- Web services: 10'000 web sites
- Mail: 2 million emails/day (99% spam), 18'000 mail boxes
- Tool accessibility: Windows, Office, CadCam, …
- Home directories (DFS, AFS): ~400 TB, backup service, ~2 billion files
- PC management: software and patch installations
- Infrastructure needed: > 400 servers
Network Overview
- Central, high-speed network backbone connecting the experiments (e.g. ATLAS), all CERN buildings, the computer centre processing clusters and the world-wide Grid centres
- 12'000 active users
Monitoring
- Large-scale monitoring: surveillance of all nodes in the computer centre
- Hundreds of parameters in various time intervals, from minutes to hours, per node and service
- Database storage and interactive visualisation
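A minimal sketch of the kind of per-node, per-metric time-series record such a monitoring system collects and aggregates. The node name, metric names and schema below are illustrative assumptions, not CERN's actual monitoring framework.

```python
# Sketch: one monitoring sample per node and metric, taken at intervals
# from minutes to hours, then aggregated for visualisation.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class MetricSample:
    node: str          # hypothetical node name, e.g. "lxbatch0421"
    metric: str        # hypothetical metric name, e.g. "cpu_load"
    value: float
    timestamp: datetime

# A few synthetic samples, one per 5-minute interval
samples = [
    MetricSample("lxbatch0421", "cpu_load", 0.80 + 0.01 * i,
                 datetime(2011, 7, 21, 12, 0) + timedelta(minutes=5 * i))
    for i in range(12)
]

# Simple aggregation of the kind an interactive dashboard would show
avg = sum(s.value for s in samples) / len(samples)
print(f"{samples[0].node} {samples[0].metric}: average over last hour = {avg:.2f}")
```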
Bookkeeping: Database Services
More than 125 ORACLE database instances on > 400 service nodes, total ~100 TB:
- LHC machine parameters
- Human resource information
- Financial bookkeeping
- Material bookkeeping and material flow control
- LHC and detector construction details
- Bookkeeping of physics events for the experiments
- Metadata for the physics events (e.g. detector conditions)
- Management of data processing
- Highly compressed and filtered event data
- …
HEP analyses
- Statistical quantities over many collisions: histograms; one event doesn't prove anything
- Comparison of statistics from real data with expectations from simulations
  - Simulations based on known models
  - Statistically significant deviations show that the known models are not sufficient
- Need more simulated data than real data
  - In order to cover various models
  - In order to be dominated by the statistical error of the real data, not of the simulation
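A minimal sketch of the comparison principle described above: fill a binned distribution from "real" data, compare it bin by bin with the simulated expectation, and flag statistically significant deviations. All numbers are hypothetical.

```python
# Sketch (hypothetical numbers): compare observed counts per histogram bin
# with the expectation from simulation and flag deviations beyond 3 sigma.
import numpy as np

rng = np.random.default_rng(42)

# Simulated expectation per bin (e.g. events per invariant-mass bin)
expected = np.array([1000.0, 800.0, 600.0, 450.0, 300.0, 200.0])

# "Observed" data: Poisson fluctuations around the expectation
observed = rng.poisson(expected)

# Per-bin deviation in units of the statistical error (sqrt(N) for counts)
significance = (observed - expected) / np.sqrt(expected)

for i, (obs, exp, sig) in enumerate(zip(observed, expected, significance)):
    flag = "  <-- model insufficient?" if abs(sig) > 3 else ""
    print(f"bin {i}: observed={obs:5d} expected={exp:7.1f} deviation={sig:+.2f} sigma{flag}")
```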
Data Handling and Computation for Physics Analyses
(Diagram, credit les.robertson@cern.ch: detector → event filter (selection & reconstruction) → raw data → reconstruction → processed data (event summary data) → batch physics analysis → analysis objects (extracted by physics topic) → interactive physics analysis; event simulation provides simulated data; reprocessing runs again over the raw data)
Data Flow – online
- Detector: 150 million electronics channels, 1 PByte/s
- Level 1 filter and selection: fast-response electronics, FPGAs, embedded processors, very close to the detector → 150 GBytes/s
- High-level filter and selection: O(1000) servers for processing, Gbit Ethernet network → 0.6 GBytes/s
- N x 10 Gbit links to the CERN computer centre
- Limits: essentially the budget and the downstream data flow pressure
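A back-of-the-envelope sketch of the reduction factors implied by the rates quoted above; only the three slide numbers are used, everything else is arithmetic.

```python
# Reduction factors along the online data flow chain.
detector_rate = 1e15    # bytes/s off the front-end electronics (1 PByte/s)
level1_rate   = 150e9   # bytes/s after Level 1 filter and selection
hlt_rate      = 0.6e9   # bytes/s after the high-level filter, to the computer centre

print(f"Level 1 reduction factor:    ~{detector_rate / level1_rate:,.0f}x")
print(f"High-level filter reduction: ~{level1_rate / hlt_rate:,.0f}x")
print(f"Overall reduction:           ~{detector_rate / hlt_rate:,.0f}x")
```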
Data Flow – offline
- LHC: 1000 million events/s
- 4 detectors, filter and first selection: 1-2 GB/s
- Store on disk and tape, export copies, create sub-samples: 10 GB/s, 3 GB/s
- World-wide analysis
- Physics: explanation of nature, e.g. the Z resonance cross section:

\sigma_{f\bar{f}}(s) \approx \sigma^0_{f\bar{f}} \,
  \frac{s\,\Gamma_Z^2}{(s - m_Z^2)^2 + s^2\,\Gamma_Z^2/m_Z^2}
\quad\text{with}\quad
\sigma^0_{f\bar{f}} = \frac{12\pi}{m_Z^2}\,\frac{\Gamma_{ee}\,\Gamma_{f\bar{f}}}{\Gamma_Z^2}
\quad\text{and}\quad
\Gamma_{f\bar{f}} = \frac{G_F\, m_Z^3}{6\sqrt{2}\,\pi}\,\bigl(v_f^2 + a_f^2\bigr)\, N_{\mathrm{col}}
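A sketch evaluating the Breit-Wigner formula above around the Z pole. The numerical inputs (m_Z, widths, the GeV^-2 to nb conversion) are standard values supplied here for illustration; they are not taken from the slide.

```python
# Evaluate the Z-resonance cross section formula as a function of sqrt(s).
import math

m_Z      = 91.1876      # GeV, Z boson mass (illustrative PDG-like value)
Gamma_Z  = 2.4952       # GeV, total Z width
Gamma_ee = 0.0840       # GeV, partial width Z -> e+e-
Gamma_ff = 0.0840       # GeV, partial width Z -> f fbar (here f = mu)

GEV2_TO_NB = 389379.0   # conversion: 1 GeV^-2 = 389379 nb

def sigma_ff(sqrt_s):
    """Cross section sigma_ff(s) in nb, following the slide's formula."""
    s = sqrt_s ** 2
    sigma0 = 12 * math.pi / m_Z**2 * (Gamma_ee * Gamma_ff) / Gamma_Z**2
    bw = s * Gamma_Z**2 / ((s - m_Z**2)**2 + s**2 * Gamma_Z**2 / m_Z**2)
    return sigma0 * bw * GEV2_TO_NB

for e in (88.0, 90.0, 91.2, 92.0, 94.0):
    print(f"sqrt(s) = {e:5.1f} GeV  ->  sigma = {sigma_ff(e):6.3f} nb")
# The peak value near sqrt(s) = m_Z comes out at about 2 nb for f = mu.
```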
SI Prefixes
Source: wikipedia.org
Data Volumes at CERN
- Each year: 15 Petabytes (Tower of CDs: which height?)
- Stored cumulatively over LHC running: 50 PB
- Only real data and derivatives; simulated data not included (total of simulated data even larger)
Compare with (numbers from mid 2010):
- Library of Congress: 200 TB
- E-mail (w/o spam): 30 PB (30 trillion mails at 1 kB each)
- Photos: 1 EB (500 billion photos at 2 MB each, 50 PB on Facebook)
- Web: 1 EB
- Telephone calls: 50 EB
- … growing exponentially …
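A quick answer to the "Tower of CDs: which height?" question above. The CD capacity (~700 MB) and thickness (~1.2 mm) are my assumptions, not figures from the slide.

```python
# How tall is one year of CERN data (15 PB) stacked on CDs?
data_per_year   = 15e15   # bytes (15 PB of real data per year)
cd_capacity     = 700e6   # bytes per CD (assumed)
cd_thickness_mm = 1.2     # mm per CD in a stack (assumed)

n_cds     = data_per_year / cd_capacity
height_km = n_cds * cd_thickness_mm / 1e6

print(f"~{n_cds:,.0f} CDs per year, a stack roughly {height_km:.0f} km high")
```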
Physical and Logical Connectivity
Complexity / scale, hardware and the software that manages it at each level:
- Components (CPU, disk, memory, mainboard): operating system, device drivers
- CPU server, disk server: network, interconnects
- Cluster, local fabric: resource management software
- World-wide cluster: wide area network, grid and cloud management software
Computing Building Blocks
- Commodity market components: not cheap, but cost effective!
- Simple components, but many of them
- CPU server or worker node: dual CPU, quad core, 16 or 24 GB memory
- Disk server = CPU server + RAID controller + 24 SATA disks
- Tape server = CPU server + fibre channel connection + tape drive
- Market trends more important than technology trends
- Always watch TCO: Total Cost of Ownership
Hardware Management
- Almost 12'000 servers installed in the centre
- Assume 3-4 years lifetime for the equipment
- Key factors: power efficiency, performance, reliability
- Demands by experiments require investments of ~15 MCHF/year for new PC hardware and infrastructure
- Infrastructure and operation setup needed for
  - ~3'500 nodes installed per year
  - ~3'500 nodes removed per year
- Installation in racks, cabling, automatic installation, Linux software environment
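A small sketch of the renewal arithmetic behind these figures: with roughly 12'000 servers and a 3-4 year lifetime, on the order of 3'000-4'000 nodes must be installed, and as many retired, every year.

```python
# Annual node turnover implied by fleet size and assumed lifetime.
installed_servers = 12_000
for lifetime_years in (3, 4):
    per_year = installed_servers / lifetime_years
    print(f"lifetime {lifetime_years} years -> ~{per_year:,.0f} nodes replaced per year")
```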
Functional Units
- Detectors: data import and export
- Disk storage
- Tape storage: 'active' archive and backup
- Event processing capacity: CPU servers
- Meta-data storage: databases