1  JGW-G1706749-v2
KAGRA data management and analysis
N. Kanda, on behalf of the KAGRA collaboration
The 3rd KAGRA International Workshop (KIW3), May 21-22, 2017
Academia Sinica (NTU campus), Taipei
2  KAGRA data related subgroups
- CAL : Calibration
- DGS : Digital System
- DMG : Data Management
- DAS : Data Analysis
[Diagram: Interferometer --> Data --> storage / computing environment / CPUs / software --> Results]
3  DMG (Data Management subgroup)
Targets / Tasks: to manage and operate the 'KAGRA data tier'
- Data Transfer
- Data Archive
- Data Distribution (mirror sites and end users)
+ development of software for these operations
Members
- Leader: N. Kanda
- Sub-leader: K. Oohara
- Effort members: S. Haino, K. Hayama, Y. Inoue, Y. Itoh, M. Kaneyama, G. Kang, C-Y Lin, O. Miyakawa, A. Miyamoto, S. Miyoki, S. Oh, K. Sakai, Y. Sasaki, E. J. Son, H. Tagoshi, H. Takahashi, K. Tanaka, S. Ueki, T. Yamamoto, T. Yokozawa, H. Yuzurihara
4  Overview of KAGRA data flow
[Diagram: overall KAGRA data transfer, from faster (upstream) to later (downstream)]
- Kamioka detector (tunnel), Tier-0 --> Kashiwa, Tier-0 + archive (raw + proc. data): raw data, ~20 MB/s
- Tier-1 mirrors: Academia SINICA (Taiwan) and KISTI (Korea), archive: raw data, ~20 MB/s, via GRID (or alternative)
- Osaka City U., Tier-0.5 for low latency: proc. data + partial raw data, ~1 MB/s
- RESCEU: proc. data, ~1 MB/s; Nagaoka Tech., Tier-2: alert data (option: raw data without permanent storage)
- Niigata U. and KAGRA end-user sites, Tier-3: partial raw & proc. data sets
- Transfer by socket (KAGRA DMG software) and GRID
- Data sharing with overseas GW experiments / other observations: low-latency h(t) (less amount) <-> bulk of data (larger amount)
- Event alerts to follow-ups / counterparts: low-latency h(t), alerts in GCN format
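As a rough consistency check of the rates above (a back-of-the-envelope sketch of my own, not taken from the slides), the implied daily data volumes are:

```python
# Daily data volume implied by the transfer rates on this slide.
# The rates (~20 MB/s raw, ~1 MB/s proc.) are from the diagram; the rest is arithmetic.

SECONDS_PER_DAY = 86_400

def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Convert a sustained transfer rate in MB/s to TB per day (decimal units)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1e6

print(f"raw data  (~20 MB/s): {daily_volume_tb(20):.2f} TB/day")   # ~1.73 TB/day
print(f"proc data (~1 MB/s) : {daily_volume_tb(1):.3f} TB/day")    # ~0.086 TB/day
```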
5  KAGRA data
All data sets are in the 'frame' format.
RAW: data files derived from DGS's frame writer
- Tier-0: primary archive, fully and permanently kept at Kashiwa
- Tier-1: mirror of Tier-0 at Academia SINICA (Taiwan) and KISTI (Korea). KISTI holds the iKAGRA data now; it will be extended to the 'full' data set in the near future.
- Tier-0.5: low-latency transfer to OCU (Osaka); same as Tier-0, but the full set will not be kept permanently.
Proc.: processed data set consisting of the calibrated strain h(t) and some related channels for GW event analysis
- at Tier-0, 1, 0.5
- at Tier-2: RESCEU (Tokyo), Niigata, + more in the future
———
iKAGRA data
- 2016/3/25-31: lock duration 101.93 hours
- 2016/4/11-25: lock duration 296.18 hours
- ~7.5 TB of raw data, 756 GB of proc. data
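Since all data sets are distributed as frame (.gwf) files, a minimal sketch of reading one channel is shown below. The use of gwpy is an assumption (the slides do not name a reader library), and the file name and channel name are hypothetical placeholders.

```python
# Minimal sketch of reading one channel from a KAGRA frame (.gwf) file.
# gwpy is an assumption here, and both the file name and the channel name
# below are hypothetical placeholders, not actual KAGRA names.
from gwpy.timeseries import TimeSeries

strain = TimeSeries.read(
    "K-K1_proc-1145000000-32.gwf",   # hypothetical proc. frame file
    "K1:CAL-CS_PROC_DARM_STRAIN",    # hypothetical calibrated strain channel
)
print(strain.sample_rate, strain.duration)
```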
6  Tier-0 archive
A bucket brigade of the data: KAGRA tunnel site --> surface building at Kamioka --> ICRR, U. Tokyo (Kashiwa campus), over KAGRA's VPN connection and SINET.
- Surface building at Kamioka: 2 sets of frame writer --> data server; 200 TiB Lustre file system; mid-spool of the data and hub of the data transfer
- ICRR (Kashiwa): 100 TiB for iKAGRA, 3 PB for bKAGRA
[Diagram: iKAGRA data system overview (drawn by N. Kanda, last update 2014/8/19). Components: IFO / DGS frontend / environmental monitor, data concentrator (EPICS layer), frame writers k1fw0 / k1fw1, primary data servers k1dm0 (hyades-0) and k1dm1 (hyades-1) with 20 TB each, Lustre servers (MDS: crab-mds-01/02, OSS: crab-oss-01/02) with MDT/OST disk arrays on InfiniBand, NDS (algol-01), DetChar monitor hosts in the control room, and a 4.5 km dedicated optical fiber between the tunnel and the surface building. VPN gateways gwave_kamioka / gwave_kashiwa link to the ICRR interoperable computer system at Kashiwa: login / job-management servers (taurus-01/02, perseus-01/02), data-transfer server (aldebaran), and calculation servers (pleiades-01..04).]
7  Data transfer
We developed the transfer software; we wrote the code entirely ourselves (no black box), using standard Linux tools as much as possible.
Speed and stability are good for the bKAGRA requirement:
- bKAGRA requirement: >20 MB/s
- Performance: >40 MB/s between Kamioka and Kashiwa/Osaka
—> Performance test for 1-sec data chunks from Kamioka to Osaka (by K. Sakai)
[Plot: histogram of the transfer latency; the most frequent value (mode) and the bin width are quoted in the figure]
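To illustrate the kind of socket-based, chunked transfer described above, here is a minimal sketch. It is not the actual KAGRA DMG code; the host, port, and file name are hypothetical.

```python
# Minimal sketch of a socket-based file push, in the spirit of the per-second
# frame transfer described on this slide. NOT the actual KAGRA DMG software;
# host, port, and file name are hypothetical placeholders.
import socket
import pathlib

def push_frame(path: str, host: str, port: int, chunk: int = 1 << 20) -> None:
    """Send one frame file over a plain TCP socket in 1 MiB chunks."""
    data = pathlib.Path(path)
    with socket.create_connection((host, port)) as sock, data.open("rb") as f:
        # A real system would also send the file name, size, and a checksum
        # so the receiver can verify the transfer.
        while buf := f.read(chunk):
            sock.sendall(buf)

# push_frame("K-K1_R-1145000000-1.gwf", "receiver.example.org", 5000)
```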
8  To Osaka (over KAGRA's VPN)
Linux cluster(s) at Osaka City University, for the development of KAGRA analysis and for the low-latency search.
[Chart: cluster CPU cores (for calculation) and storage (/data, /home) per fiscal year, 2012-2016, growing to 760 cores and 304 TB; the iKAGRA run is marked]
9  Tier-1 Mirroring
Academia SINICA (Taiwan) started continuous mirroring in the second half of iKAGRA, using GRID computing techniques with checksum verification.
[Diagram (by S. Haino): KAGRA DMG Tier-0 archive at ICRR (Kashiwa) --> GRID server and prototype PC (Tier-1) at ASGC (Taiwan), via GRID transfer and SSH access over the VPN (Kamioka-Kashiwa)]
KISTI (Korea) archived the iKAGRA data sets. We will establish automatic mirroring soon.
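The checksum verification step mentioned above could look roughly like the sketch below; the choice of MD5 and the file paths are assumptions, not the actual DMG procedure.

```python
# Minimal sketch of mirror checksum verification: compare a hash of the Tier-0
# copy against the Tier-1 mirror copy. MD5 and the paths are assumptions.
import hashlib
import pathlib

def file_md5(path: pathlib.Path, chunk: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    h = hashlib.md5()
    with path.open("rb") as f:
        while buf := f.read(chunk):
            h.update(buf)
    return h.hexdigest()

def verify_mirror(tier0_file: str, tier1_file: str) -> bool:
    """True if the mirrored copy matches the archive copy."""
    return file_md5(pathlib.Path(tier0_file)) == file_md5(pathlib.Path(tier1_file))
```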
10  KAGRA data systems (iKAGRA & new main storage)
[Diagram: the iKAGRA data storage and the new main storage (System-A), connected over the VPN]
11  New petabyte-class system for the bKAGRA era
Working since March 2017
- 2.4 PiB (HDD) for observational data storage, GPFS file system
- 12.8 TFLOPS
- Storage: DDN SFA7700X + SS8460
- Servers: HP ProLiant DL180 G9, HP ProLiant DL20 Gen9, HP ProLiant XL170 Gen9
- Internal network: InfiniBand FDR
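As a rough capacity estimate (my own, not stated on the slides), assuming the ~20 MB/s raw-data rate quoted earlier were sustained continuously:

```python
# How long 2.4 PiB would last at a continuous ~20 MB/s raw-data rate.
# Both numbers come from the slides; the continuous-writing assumption is mine.
capacity_bytes = 2.4 * 1024**5          # 2.4 PiB
rate_bytes_per_s = 20e6                  # ~20 MB/s raw data rate
seconds = capacity_bytes / rate_bytes_per_s
print(f"{seconds / 86400:.0f} days  (~{seconds / 86400 / 365:.1f} years)")
# -> about 1560 days, i.e. roughly 4 years of continuous raw data
```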
12  Extend the data transfer to the new system (extension in 2017)
[Diagram: Kamioka --(VPN router)--> the internet (SINET) --> Kashiwa campus firewall --> KAGRA private network (working): gateway servers, data receiver / data sender, login servers]
- Current: all data are pushed into the ~100 TiB spool of the iKAGRA data transfer and storage system (iKAGRA-era mass storage, disk array)
- Extension: h(t) is pushed on to the new system, and partial data (h(t), latest raw) are pulled by the ICRR shared computer
- New system: KAGRA Main Data Storage (System A), 2.4 PiB data storage, with compute servers on a high-speed internal network
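A minimal sketch of the "pull" step described above, using a standard Linux tool (rsync) in the spirit of the DMG's preference for standard tools; the host name and paths are hypothetical, and this is not the actual KAGRA tooling.

```python
# Illustrative sketch of a downstream host pulling the latest partial data set
# from a gateway spool with rsync over ssh. Host and paths are hypothetical.
import subprocess

def pull_partial_data(remote: str, local_dir: str) -> None:
    """Pull new/changed files from a remote spool directory via rsync."""
    subprocess.run(
        ["rsync", "-av", "--partial", remote, local_dir],
        check=True,
    )

# pull_partial_data("gateway.example.org:/spool/hoft/", "/data/kagra/hoft/")
```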
13  Summary
Current storage:
  Site | Capacity | Main Usage
  Kamioka (surface) | 200 TiB | spool, on-site analysis
  Kashiwa | 100 TiB + 2.4 PiB | iKAGRA data storage
  Osaka City Univ. | 304 TiB | CBC, Burst, low-latency search
  RESCEU | 80 TiB | CW
  Niigata Univ. | 77 TiB | misc. analysis
  Academia SINICA (Taiwan Group) | 220 TiB | Tier-1 mirroring, etc.
  KISTI-GSDC (Korea Group) | 150 TB (800 TB in 2018 and 2019) | Tier-1, detector characterization
  (total) | ~3.5 PiB |
We also cooperate with:
  Site | Capacity | Main Usage
  ICRR computer for cooperative researches | - | proc. data, event analysis
  KEK computer center | - | (not yet decided)
14  Addendum
- Kashiwa: "M31" is the name of System-A's login servers
- Osaka
- Tokyo, RESCEU: "KANBAI" (寒梅)
15  Data Analysis
Targets:
- Search for gravitational waves in KAGRA data
- Extract the science of GWs
- Development of tools for these (= software, computers, etc.)
Data analysis is one of the channels of cooperation with other observations:
- LIGO (including LIGO-India), Virgo, GEO
- J-GEM (Japanese collaboration for Gravitational-wave Electro-Magnetic follow-up)
- Neutrino observations, etc.
We have been cooperating with these partners for the past several years.