Third Annual Workshop, Indian Institute of Technology Guwahati (IITG)
Regional WLCG connectivity in India and the LHC Open Network Environment

Brij Kishor Jashal
Tata Institute of Fundamental Research, Mumbai
Email: brij.jashal@tifr.res.in, brij.Kishor.jashal@cern.ch
17 Dec 2014
Agenda
• LHC data challenges and evolution of the LHC data access model
• LHC Open Network Environment (LHCONE)
• India-CMS site infrastructure
• Network at T2_IN_TIFR
• Expectations from networks for LHC Run 2
• WLCG regional connectivity in India by NKN
Complexity of LHC experiments
• 1 billion collisions per second (for each experiment). Each collision generates particles that often decay in complex ways into even more particles.
• Electronic circuits record the passage of each particle through a detector as a series of electronic signals and send the data to the CERN Data Centre (DC) for digital reconstruction.
• 1 million gigabytes (1 PB) per second of raw data. The digitized summary is recorded as a "collision event".
• Physicists must sift through the data produced to determine if the collisions have thrown up any interesting physics.
Source: cern.ch
(Figure) Source: Bill Johnston, ESnet
Evolution of computing models (1)
• When the computing models for the four LHC experiments were developed in the late nineties, networking was still a scarce resource, and almost all the models reflect this, although in different ways.
• Instead of relying on the ability to provide the data when needed for analysis, the data is replicated to many places on the grid shortly after it has been produced, so that it is readily available for user analysis.
• This model has proved to work well for the early stages of analysis, but it is limited by the ever-increasing need for disk space as the data volume from the machine grows with time.
Evolution of computing models (2)
• The Bos-Fisk report on Tier 2 (T2) requirements for the Large Hadron Collider (LHC): from hierarchical to distributed.
• The network is the most reliable resource for LHC computing.
• This makes it possible to reconsider the data models and not rely on pre-placement of the data for analysis, running jobs only where the data is. Instead, jobs can pull the data from somewhere else if it is not already available locally.
• Whether to copy the data locally or to access it remotely depends on the data needs.
• If the data is copied locally, the storage turns into a cache that is likely to hold the selection of data that is most popular for analysis at that time (see the sketch below).
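As a rough illustration of this "copy locally or access remotely" decision, the sketch below shows the logic in plain Python. It is a hypothetical example, not part of any WLCG middleware: the cache path, popularity threshold and the remote_open() helper are all invented for illustration.

```python
import os
import shutil

# Hypothetical illustration of the cache-or-remote-access decision described
# above; paths, the threshold and remote_open() are stand-ins, not real tools.
LOCAL_CACHE = "/storage/cache"
POPULARITY_THRESHOLD = 10  # accesses per week, arbitrary example value


def remote_open(logical_name):
    """Placeholder for streaming a file from a remote storage element."""
    return open(f"/remote-mount/{logical_name}", "rb")  # stand-in only


def open_dataset_file(logical_name, weekly_accesses):
    local_path = os.path.join(LOCAL_CACHE, logical_name)

    # 1. Use the local replica if the cache already holds it.
    if os.path.exists(local_path):
        return open(local_path, "rb")

    # 2. Otherwise read remotely; copy into the local cache only if the
    #    file is popular enough to justify the disk space.
    if weekly_accesses >= POPULARITY_THRESHOLD:
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with remote_open(logical_name) as src, open(local_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return open(local_path, "rb")

    return remote_open(logical_name)
```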
Managing large-scale science traffic in a shared infrastructure
• The traffic from T0 to T1 is well served by the LHCOPN.
• The traffic from the Tier 1 data centres to the Tier 2 sites (mostly universities), where the data analysis is done, is now large enough that it must be managed separately from the general R&E traffic.
• In aggregate, the Tier 1 to Tier 2 traffic equals the Tier 0 to Tier 1 traffic (there are about 170 Tier 2 sites).
• Managing this across all possible Tier 1 to Tier 2 combinations on the general infrastructure is impractical, so a special infrastructure was required: the LHC Open Network Environment (LHCONE) was designed for this purpose.
• LHCONE is an overlay network supported by:
  - over 15 national and international RENs;
  - several Open Exchange Points, including NetherLight, StarLight, MANLAN and others;
  - trans-Atlantic connectivity provided by ACE, GEANT, NORDUNET and USLHCNET.
• Over 50 end sites are connected to LHCONE: 45 Tier 2s and 10 Tier 1s.
https://twiki.cern.ch/twiki/bin/view/LHCONE/LhcOneVRF#Connected_sites
(Figure) Source: Bill Johnston, ESnet
LHCONE L3VPN architecture
TierX sites connect to national or continental VRFs.
What the LHCONE L3VPN is:
• a Layer 3 (routed) Virtual Private Network;
• a dedicated worldwide backbone connecting Tier 1s and Tier 2s (and Tier 3s) at high bandwidth;
• reserved for LHC data transfers and analysis.
The TierX site needs to have:
• public IP addresses;
• a public Autonomous System (AS) number;
• a BGP-capable router.
LHCONE Acceptable Use Policy (AUP):
• Use of LHCONE is restricted to WLCG-related traffic.
• IP addresses announced to LHCONE:
  - should be assigned only to WLCG servers;
  - cannot be assigned to generic campus devices (desktop and portable computers, wireless devices, printers, VoIP phones, ...).
Routing setup:
• A BGP peering is established between the TierX and the VRF border routers.
• The TierX announces only the IP subnets used for WLCG servers.
• The TierX accepts all the prefixes announced by the LHCONE VRF router.
• The TierX must ensure traffic symmetry: it injects only packets sourced from the announced subnets (see the sketch below).
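The two rules above (announce only WLCG server subnets, and inject only packets sourced from those subnets) can be illustrated with a short sketch using Python's standard ipaddress module. The subnets shown are documentation/test ranges, not TIFR's real allocations; a production site would enforce these rules in router prefix filters, not in Python.

```python
import ipaddress

# Example prefixes only -- documentation/test ranges, not the subnets
# actually announced by any LHCONE site.
ANNOUNCED_TO_LHCONE = [
    ipaddress.ip_network("192.0.2.0/25"),     # WLCG storage servers (example)
    ipaddress.ip_network("198.51.100.0/26"),  # WLCG worker/service nodes (example)
]

CAMPUS_GENERAL = ipaddress.ip_network("203.0.113.0/24")  # desktops, Wi-Fi, printers


def allowed_on_lhcone(src_ip: str) -> bool:
    """Traffic-symmetry rule: only packets sourced from announced
    subnets may be injected into the LHCONE VRF."""
    ip = ipaddress.ip_address(src_ip)
    return any(ip in net for net in ANNOUNCED_TO_LHCONE)


def announcement_is_clean() -> bool:
    """AUP rule: announced prefixes must not overlap the generic
    campus address space."""
    return not any(net.overlaps(CAMPUS_GENERAL) for net in ANNOUNCED_TO_LHCONE)


if __name__ == "__main__":
    print(allowed_on_lhcone("192.0.2.10"))    # True  -> may use LHCONE
    print(allowed_on_lhcone("203.0.113.25"))  # False -> general R&E path
    print(announcement_is_clean())            # True
```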
Symmetric paths must be ensured
To achieve symmetry, any one of the following techniques can be used:
• policy-based routing (source-destination routing);
• physically separated networks;
• virtually separated networks (VRF);
• Science DMZ.
Policy-based routing (source-destination routing): if a single border router is used to connect to both the general Internet and LHCONE, source-destination routing must be used (a sketch follows below).
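A minimal sketch of the idea behind policy-based (source-destination) routing: the egress is selected from the packet's source address, not only its destination, so replies to traffic received over LHCONE also leave via LHCONE. The policy table and next hops below are hypothetical; a real deployment would express this as router PBR configuration rather than Python.

```python
import ipaddress

# Hypothetical policy table: traffic sourced from the WLCG subnet is steered
# to the LHCONE peering, everything else follows the default (general R&E /
# Internet) path.  Addresses and next hops are example values only.
POLICY_ROUTES = [
    (ipaddress.ip_network("192.0.2.0/25"), "lhcone-peer 10.0.0.1"),
]
DEFAULT_NEXT_HOP = "internet-gw 10.0.0.254"


def next_hop_for(src_ip: str) -> str:
    """Pick the next hop based on the *source* address of the packet."""
    ip = ipaddress.ip_address(src_ip)
    for subnet, hop in POLICY_ROUTES:
        if ip in subnet:
            return hop
    return DEFAULT_NEXT_HOP


print(next_hop_for("192.0.2.42"))   # -> lhcone-peer 10.0.0.1
print(next_hop_for("203.0.113.7"))  # -> internet-gw 10.0.0.254
```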
Virtually separated networks (VRF)
• Traffic separation is achieved with virtual routing instances on the same physical box.
• VRF = Virtual Routing and Forwarding (a virtual routing instance).
Physically separated networks
• Different routers can be used for generic and LCG hosts.
T2_IN_TIFR resources
Computing
• Total number of physical cores: 1024; the aggregate HEP-SPEC06 score (the standard CPU performance benchmark for HEP code) is 7218.12.
Storage
• Total storage capacity of the 28 DPM disk nodes aggregates to more than 1 PB (1020 TB).
• Regional XRootD federation redirector for India (see the sketch below).
Networking
• Dedicated point-to-point link to LHCONE, upgraded to 4 Gbps guaranteed with best effort up to 10 Gbps.
• Two 2.5 Gbps links: one to Europe and the other to Singapore through TEIN4.
• Internal network upgraded with a new chassis switch with a total switching throughput of 3.5 Tbps; storage nodes upgraded with 10 Gbps links.
• T2_IN_TIFR has been part of LHCONE via CERNlight since its inception.
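Since TIFR hosts the regional XRootD federation redirector for India, remote reads are typically routed through it: the client asks the redirector, which locates a storage element holding the file and redirects the client there. The sketch below drives the standard xrdcp client from Python; the redirector hostname and logical file name are placeholders (the real endpoints are not given on the slide), and the xrootd client tools are assumed to be installed.

```python
import subprocess

# Placeholder endpoints -- purely illustrative, not the production values.
REDIRECTOR = "xrootd-redirector.example.in:1094"
LOGICAL_FILE = "/store/user/example/analysis_sample.root"


def fetch_via_federation(lfn: str, dest: str = "/tmp/sample.root") -> None:
    """Copy a file through the XRootD federation using xrdcp.

    The redirector resolves the logical file name to a storage element
    that actually holds the data and redirects the transfer there.
    """
    url = f"root://{REDIRECTOR}/{lfn}"
    # -f: overwrite the destination if it already exists
    subprocess.run(["xrdcp", "-f", url, dest], check=True)


if __name__ == "__main__":
    fetch_via_federation(LOGICAL_FILE)
```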
Network at T2_IN_TIFR
Last week GEANT upgraded the backbone link between the CERNlight router (where the Indian link terminates at CERN) and the GEANT PoP to 10 Gbps.
WLCG site availability and reliability report, India-CMS TIFR (Dec 2013 to Sep 2014)

Month    Availability  Reliability  CPU usage in month (HEP-SPEC06-hrs)
Dec-13   81%           81%            250,432
Jan-14   98%           100%           601,448
Feb-14   74%           76%            323,452
Mar-14   96%           100%         1,186,232
Apr-14   93%           100%         1,203,460
May-14   93%           93%          1,198,572
Jun-14   85%           85%            944,016
Jul-14   98%           98%          1,086,608
Aug-14   96%           96%            871,916
Sep-14   98%           100%           731,964

Source: https://espace2013.cern.ch/WLCG-document-repository/ReliabilityAvailability/Forms/AllItems.aspx?RootFolder=%2fWLCG-document-repository%2fReliabilityAvailability%2f2014&FolderCTID=0x0120000B00526BA2DC7C4AAEB1B0E517F001F8
Total cumulative data transfers for the last five months: 700 TB.
(Charts: production instance downloads, debug instance downloads, production instance uploads, debug instance uploads.)
Country-wise WLCG traffic to India (FTS + PhEDEx only)
Total volume in the last year: 557 TB.
Approximate per-country volumes (TB): USA 194, Italy 156, Germany 68, UK 57, France 21, Spain 18, Russia 17, Netherlands 8, Switzerland 5, Portugal 5, Others 5, Belgium 3, Taiwan 2, South Korea 1, Austria 0.972, Hungary 0.824, Russian Federation 0.102, Brazil 0.025, Estonia 0.01, Finland 0.008, Ukraine 0.004, China 0.000829.
Period: 2013-12-01 00:00 UTC to 2014-12-15 00:00 UTC.
Country-wise WLCG traffic from India (FTS + PhEDEx only)
Total volume: 408 TB.
(Pie chart of per-country shares: two countries account for roughly 41% and 39% of the total, with the remaining destinations, including Austria, Belgium, Brazil, China, Estonia, France, Germany, Hungary, India, Italy, South Korea, Spain, Switzerland, Taiwan, UK, USA and others, each at a few per cent or less.)
Period: 2013-12-01 00:00 UTC to 2014-12-15 00:00 UTC.