ONSEN Lab Tests and Development
Thomas Geßler, Simon Reiter, Klemens Lautenbach, Jens Sören Lange, Wolfgang Kühn, Dennis Getzkow
II. Physikalisches Institut, Justus-Liebig-Universität Gießen
21st International Workshop on DEPFET Detectors and Applications, Ringberg Castle, 28–31 May 2017
Funded by European Union, grant n.644294 JENNIFER (Japan and Europe Network for Neutrino and Intensity Frontier Experimental Research), an MSCA-RISE project
Debugging of Data Corruption
Selector Data Flow: “ONSEN Trigger Mismatch” Sources
[Figure: Selector data flow. Inputs: DHC data (6.25 Gbps optical) and merged ROIs (MGT or LVDS). Blocks: PXD parser, ROI parser, Writer, Reader, Memory, Addr. FIFO, Pixel filter, Reformatter. Output: filtered data (GbE). The illustration shows address entries Addr(40) to Addr(42) and Addr(11) to Addr(15), indexed by PXD and HLT event numbers.]
◮ PXD parser: Sensitive to event/framing errors in PXD data → Internal event synchronization goes out of sync
◮ Pixel filter: Produces framing errors if inputs are out of sync; state machine reset not properly implemented → Cold start necessary
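The loss of event synchronization described above can be illustrated with a small sketch. It assumes a strongly simplified model that is not taken from the ONSEN firmware: the writer queues one FIFO entry per stored PXD event, the reader takes one entry per HLT trigger, and event numbers increase monotonically (no wrap-around). The function names `pair_blindly` and `pair_checked` are hypothetical.

```python
from collections import deque

def pair_blindly(pxd_events, hlt_triggers):
    """Pop one FIFO entry per trigger without comparing event numbers."""
    fifo = deque(pxd_events)
    pairs = []
    for hlt in hlt_triggers:
        if not fifo:
            break
        pairs.append((hlt, fifo.popleft()))
    return pairs

def pair_checked(pxd_events, hlt_triggers):
    """Compare event numbers before pairing and report missing events."""
    fifo = deque(pxd_events)
    pairs, missing = [], []
    for hlt in hlt_triggers:
        # Discard stale entries that belong to earlier events.
        while fifo and fifo[0] < hlt:
            fifo.popleft()
        if fifo and fifo[0] == hlt:
            pairs.append((hlt, fifo.popleft()))
        else:
            missing.append(hlt)
    return pairs, missing

# PXD event 42 was lost to a framing error; HLT still sends all triggers.
pxd = [40, 41, 43, 44]
hlt = [40, 41, 42, 43, 44]

blind = pair_blindly(pxd, hlt)
mismatched = [(h, p) for h, p in blind if h != p]
checked, missing = pair_checked(pxd, hlt)
```

In this toy model a single lost event shifts every later pairing in the blind variant, while the checked variant loses only the affected event. Whether the real parser can recover this way depends on details not covered in the slides.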
Test Setup in Gießen (Dennis Getzkow, Firmware Tests, April / May 2017)
[Figure: Data Reduction - ONSEN test setup. An ATCA shelf holds Carrier Boards with the RoI MERGER and two SELECTOR nodes. Each node is connected by optical links to a Fork v2 module (corrupts frames on purpose). The Fork v2 modules receive their data via TCP from a PC that provides the “HLT” / “DATCON”, “DHC 1”, and “DHC 2” streams.]
Problematic Conditions
◮ Invalid CRC in HLT frame
  ◮ ONSEN Merger discarded this data but also the beginning of the next (valid) HLT frame
  ◮ ONSEN Merger blocked any incoming HLT data after invalid CRC
  ◮ Coldstart needed
◮ Fragmented first DHH data (cut in middle of ZSD)
  ◮ Internal buffer management did not end the event properly
  ◮ Resulted in “event fusion” of the first DHH data (DHE data of second event started in ZSD of first event)
  ◮ Corrupted “only” two events
  ◮ No coldstart needed
◮ Double DHC Start frame at the beginning of run
  ◮ Additional / faulty DHC Start was not discarded properly in ONSEN
  ◮ Led to event mismatch but also to internal backpressure (Selector AMC)
  ◮ Softreset needed to get rid of the backpressure, but event mismatch was still occurring
  ◮ Coldstart needed
◮ ONSEN firmware update: these three conditions don’t cause trouble anymore
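The first condition (an invalid CRC taking the next valid frame with it) suggests the behaviour a fixed parser should show: drop exactly the corrupted frame and continue. The sketch below illustrates this under loud assumptions: the frame layout (2-byte big-endian length, payload, CRC-32 trailer) is hypothetical and is not the HLT or DHC frame format, and frame boundaries are assumed to survive the corruption. `make_frame` and `parse_stream` are illustrative names.

```python
import struct
import zlib

def make_frame(payload: bytes) -> bytes:
    """Build a hypothetical frame: length header, payload, CRC-32 trailer."""
    header = struct.pack(">H", len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + struct.pack(">I", crc)

def parse_stream(stream: bytes):
    """Return (good payloads, number of dropped frames)."""
    good, dropped = [], 0
    pos = 0
    while pos + 2 <= len(stream):
        (length,) = struct.unpack_from(">H", stream, pos)
        end = pos + 2 + length + 4
        if end > len(stream):
            break  # truncated frame at end of stream
        body = stream[pos:pos + 2 + length]
        (crc,) = struct.unpack_from(">I", stream, pos + 2 + length)
        if zlib.crc32(body) == crc:
            good.append(body[2:])
        else:
            dropped += 1  # discard only this frame, keep parsing
        pos = end
    return good, dropped

frames = [make_frame(b"event-1"), make_frame(b"event-2"), make_frame(b"event-3")]
stream = bytearray(b"".join(frames))
# Flip one payload byte of the second frame to invalidate its CRC.
stream[len(frames[0]) + 2] ^= 0xFF
payloads, n_dropped = parse_stream(bytes(stream))
```

Here only the second frame is lost; the frame after the corrupted one is still delivered, which is the behaviour the original Merger firmware did not show.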
ONSEN Emulator
◮ C++ program by S. Reiter, PXD data reduction in software
◮ Loads test data from file (PXD/HLT/DATCON) (requires 0xBE12DA7A header)
◮ Similar memory management as ONSEN
◮ Processing time example: 1000 events, 40 % PXD occupancy = 780 MB pixel data
  ◮ ONSEN: < 2 seconds after sending HLT (1 Selector node)
  ◮ Emulator (Intel i7 @ 3.4 GHz, 16 GB RAM):
    ◮ 11 min, 50 sec with 1 thread
    ◮ 2 min, 40 sec with 8 threads
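The quoted processing times can be turned into speedup and throughput figures with simple arithmetic. The sketch below uses only the numbers from the slide; treating the ONSEN value of 2 seconds as an upper bound on the processing time (and so a lower bound on throughput) is an interpretation, and the variable names are illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a 'min, sec' duration to seconds."""
    return 60 * minutes + seconds

DATA_MB = 780                      # pixel data for 1000 events
N_THREADS = 8

t_single = to_seconds(11, 50)      # emulator, 1 thread
t_multi = to_seconds(2, 40)        # emulator, 8 threads

speedup = t_single / t_multi               # gain from multithreading
efficiency = speedup / N_THREADS           # fraction of ideal scaling
throughput_single = DATA_MB / t_single     # MB/s with 1 thread
throughput_multi = DATA_MB / t_multi       # MB/s with 8 threads
onsen_throughput_min = DATA_MB / 2         # MB/s, lower bound for "< 2 seconds"
```

Under these assumptions, eight threads give a speedup of about 4.4 (roughly 55 % of ideal scaling), and one Selector node processes the same data at least 80 times faster than the multithreaded emulator.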
Test Results and Progress
◮ Several framing and event error cases were reproduced with test data, identified, and fixed
◮ Verified by testing with corrupt test data → patched PXD parser correctly sanitizes PXD input
◮ Correct ONSEN data processing verified with software emulator
◮ Next steps:
  ◮ Replace with rewritten PXD parser → increases robustness and adds better error analysis
  ◮ Do the same thing for DATCON parser
  ◮ Revise Pixel filter input state machine, fix reset
Debugging of ONSEN Internal Links
ONSEN Internal Links (for ROI Forwarding)
[Figure: One Merger (M) forwards ROIs over the ATCA backplane to groups of Selectors (S); the groups are labeled by the events they receive, from “Events 1, 5, …” to “Events 4, 8, …”.]
◮ 3.125 Gbps MGT ATCA backplane links (fabric channels)
◮ 4 × 600 Mbps LVDS Carrier-AMC links
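The figure labels (“Events 1, 5, …” up to “Events 4, 8, …”) are consistent with a round-robin distribution of events over four Selector groups. The sketch below shows that rule; it is inferred from the labels only, the actual ONSEN load-balancing logic may differ, and `selector_group` and `distribute` are hypothetical names.

```python
from collections import defaultdict

def selector_group(event_number, n_groups=4):
    """Return the 0-based group index for a 1-based event number."""
    return (event_number - 1) % n_groups

def distribute(event_numbers, n_groups=4):
    """Group event numbers by the Selector group that would receive them."""
    groups = defaultdict(list)
    for n in event_numbers:
        groups[selector_group(n, n_groups)].append(n)
    return dict(groups)

assignment = distribute(range(1, 9))
```

With eight consecutive events, the first group receives events 1 and 5 and the last group receives events 4 and 8, matching the labels in the figure.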
MGT-Links on ATCA Backplane
[Figure: ATCA shelf slot map, slots 1 to 14, with board positions and backplane link labels (S, P, E, D, H, M).]
◮ For Belle II, ONSEN will be scaled from 2 to 9 ATCA boards
◮ First scaling tests showed troubling results: With “Merger” firmware sending to multiple boards, all backplane links become unstable
◮ Debugged in detail by S. Reiter in Gießen
◮ Problem: Crosstalk between Ethernet IO and one MGT power supply
◮ Solved by avoiding that link → use different ATCA slots
◮ Additionally, Aurora reset logic had to be revised
LVDS-Links Between Carrier and AMCs
[Figure: Carrier FPGA and AMC FPGA connected by serial data lines (4 x 600 Mbps per AMC) with IDELAY elements on both sides; the system clock (100 MHz) and serial clock (300 MHz) are generated with DCM and PLL and distributed through a fanout chip.]
◮ Connection Carrier FPGA ↔ AMC FPGA uses serial (LVDS) links
◮ Serial clock is distributed from Carrier to AMCs
◮ Clock/data phase shift is compensated by delay, determined by tuning
◮ Problem 1: Strong delay difference between Carrier/AMC combinations
◮ Problem 2: Small temperature drift of the delay
◮ Solved by implementation of online self-calibration mechanism
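A common way to calibrate such an input delay is to scan all delay taps with a known test pattern and pick the center of the widest error-free window. The sketch below shows that generic technique; it is not the ONSEN self-calibration logic, which the slides do not describe. The tap count of 64 and the `link_ok` callback (returns True if the pattern is received correctly at a given tap) are assumptions.

```python
def longest_good_window(results):
    """Return (start, length) of the longest run of True values."""
    best_start, best_len = 0, 0
    start = None
    # A trailing False acts as sentinel and closes a run that reaches the end.
    for i, ok in enumerate(list(results) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    return best_start, best_len

def calibrate(link_ok, n_taps=64):
    """Scan all taps and return the center of the widest good window."""
    results = [link_ok(tap) for tap in range(n_taps)]
    start, length = longest_good_window(results)
    if length == 0:
        return None  # no working delay setting found
    return start + (length - 1) // 2

# Toy link models: the valid window differs per board pair and drifts.
tap_board_a = calibrate(lambda tap: 20 <= tap <= 30)
tap_board_b = calibrate(lambda tap: 45 <= tap <= 59)
tap_after_drift = calibrate(lambda tap: 24 <= tap <= 34)
```

Running the scan per link covers the large differences between Carrier/AMC combinations, and repeating it during operation follows the temperature drift. Both points are design considerations of this sketch, not documented properties of the firmware.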
Link Tests After Fixes
◮ Forward HLT packets from 1 Merger in 1 Merger-Carrier to 4 Selector-Carriers with 2 Selectors each
◮ Selector output recorded and verified (i.e., compared to expected output from emulator)
◮ 60-hour test with low rate (∼10 Hz): no link or data errors
◮ 30-minute test at 2 kHz: no link or data errors
◮ Short test with 30 kHz and DATCON data: no link or data errors
◮ Next: Full-scale test (8 Selector-Carriers) when all boards return from PERSY and repair at IHEP
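The test durations and rates translate into event counts, which in turn bound the error rate that the tests can exclude. The sketch below takes the quoted rates as exact (the slide gives the low rate only approximately) and uses the standard Poisson upper limit for zero observed errors, -ln(1 - CL) / N. This statistical reading is an illustration added here, not a claim from the slides.

```python
import math

def n_events(duration_s, rate_hz):
    """Number of events sent in a test of given duration and rate."""
    return duration_s * rate_hz

def error_rate_upper_limit(n, confidence=0.95):
    """Upper limit on the per-event error probability for zero errors in n events."""
    return -math.log(1.0 - confidence) / n

events_long = n_events(60 * 3600, 10)     # 60-hour test at about 10 Hz
events_fast = n_events(30 * 60, 2000)     # 30-minute test at 2 kHz

limit_long = error_rate_upper_limit(events_long)
limit_fast = error_rate_upper_limit(events_fast)
```

Under these assumptions, both tests exclude per-event error probabilities above roughly one in a million at 95 % confidence; the long test adds information about slow effects such as temperature drift that the short high-rate tests cannot provide.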
Phase 2 Readiness
ONSEN (Phase 2) Shelf in Tsukuba B3: Damage
◮ Chassis of the ONSEN Prototype ATCA Shelf (planned for Phase 2 ONSEN) was damaged/warped during shipment to KEK
ONSEN 19-Inch Rack in E-Hut
◮ The 19-inch racks foreseen for ONSEN in the E-hut don’t have enough clearance to accept the deformed shelf
ONSEN Phase 2 Preparation: Outlook
◮ KEK ONSEN shelf must be replaced
  ◮ Buy a new shelf or
  ◮ Send replacement from Gießen (2-slot shelf with RTM slots sufficient)
  ◮ DAQ group offered to buy a shelf for R&D and lend it to ONSEN during Phase 2 → will be discussed at B2GM
◮ Two ONSEN experts will be at KEK for one month from September, more from January 2018
◮ Support for ONSEN team at KEK during Phase 2 and VXD cosmic data taking provided by JENNIFER
Compute Node Upgrade for PANDA (Design and Production by the IHEP Beijing Trig Lab)
Compute Node Upgrade: Carrier Board
◮ First stage: upgrade CNCB (but remain compatible with current xFP)
◮ FPGA: Change to Xilinx UltraScale architecture

                Virtex-4 FX60     Virtex-5 FX70T    Kintex UltraScale 060
                (CNCB)            (xFP)             (Upgrade)
Registers       50k               44k               663k
LUTs            50k × 4-input     44k × 6-input     332k × 6-input
BRAM            4 Mb              5 Mb              38 Mb
MGT             16 × 6.5 Gbps     16 × 6.5 Gbps     32 × 16.3 Gbps
DSP Slices      128               128               2760
CPU             PPC405            PPC440            -

◮ No more hard-core CPU → Slow control on MicroBlaze or light-weight option like IPbus
Compute Node Upgrade: Carrier Board
◮ RAM: 2 GiB DDR2 SODIMM → 16 GiB DDR4 (8 chips)
◮ Configuration: Flash/CPLD (slave serial) → automatic from NOR Flash (master BPI)
◮ GbE switch: 4 AMCs, 1 switch FPGA, 1 uplink to ATCA Base Interface
◮ 16.3 Gbps MGTs
  ◮ 4 links to each AMC card (currently: 4 × 600 Mbps LVDS)
  ◮ 14 links to ATCA backplane
  ◮ 1 link to RTM (10G Ethernet)
  ◮ Programmable MGT clock
◮ Keep:
  ◮ JTAG chain/AMC decoupling
  ◮ I2C buses, sensors
  ◮ IPMC connector