The HL-LHC CMS Level-1 Track Trigger
Luis Ardila
INSTITUTE FOR DATA PROCESSING AND ELECTRONICS (IPE)
www.kit.edu
KIT – The Research University in the Helmholtz Association

A TRACKER BUILT FOR TRACKING
pT discrimination is provided by the use of special modules:
- Pairs of closely spaced silicon sensors, separated by 1-4 mm
- Signals from each sensor are correlated
- Only hit pairs compatible with pT > ~2 GeV/c ("stubs") are forwarded off-detector
- Factor ~10 data reduction

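The pT discrimination described above can be illustrated with a little geometry. This is a back-of-envelope sketch, not the front-end logic: it assumes a 3.8 T solenoid field, a track coming from the beamline, and a flat barrel module at a hypothetical radius of 0.6 m with 1.8 mm sensor spacing. None of these values appear on the slide, and the function names are illustrative.

```python
import math

B_FIELD_T = 3.8  # assumed solenoid field in tesla (not stated on the slide)


def track_radius_m(pt_gev, b_tesla=B_FIELD_T):
    """Radius of curvature of a unit-charge track: R = pT / (0.2998 * B)."""
    return pt_gev / (0.2998 * b_tesla)


def stub_bend_mm(pt_gev, r_m, sep_mm, b_tesla=B_FIELD_T):
    """Approximate r-phi offset between the hits in the two sensors.

    At radius r the track makes an angle alpha with the radial direction,
    where sin(alpha) = r / (2R). Over a sensor gap sep_mm the hit shifts
    by roughly sep_mm * tan(alpha).
    """
    sin_alpha = r_m / (2.0 * track_radius_m(pt_gev, b_tesla))
    if sin_alpha >= 1.0:
        raise ValueError("track does not reach this radius")
    return sep_mm * math.tan(math.asin(sin_alpha))


def passes_stub_cut(offset_mm, r_m, sep_mm, pt_min_gev=2.0):
    """Accept a hit pair if its offset is no larger than that of a pt_min track."""
    return abs(offset_mm) <= stub_bend_mm(pt_min_gev, r_m, sep_mm)


# Hypothetical module: radius 0.6 m, sensor spacing 1.8 mm.
R_M, SEP_MM = 0.6, 1.8
high_pt_offset = stub_bend_mm(10.0, R_M, SEP_MM)  # stiff track, small offset
low_pt_offset = stub_bend_mm(1.0, R_M, SEP_MM)    # soft track, large offset
```

With these assumed numbers the 10 GeV/c pair falls inside the acceptance window and the 1 GeV/c pair falls outside it, which is the mechanism behind the data reduction quoted on the slide.
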
TRACKER → TRIGGER DATA FLOW
The L1 hardware trigger reduces the event rate from 40 MHz to <750 kHz using calorimeter, muon and tracker primitives.
- TK primitives are all tracks (pT > 2-3 GeV/c) from the Outer Tracker
- L1-Accept triggers all front-end buffers to read out to DAQ → HLT farm

FE L1 latency buffers are limited to 12.5 μs:
  Transmission of stubs to BE electronics            1 μs
  Correlation of trigger primitives (inc. tracks)    3.5 μs
  Broadcast of L1-Accept to FE buffers               1 μs
  Safety margin                                      3 μs
→ Track finding from stubs must be performed in 4 μs

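The 4 μs figure follows from subtracting the fixed steps from the front-end buffer depth. A minimal check using only the numbers on the slide (the dictionary keys and function name are illustrative):

```python
# All values in microseconds, as quoted on the slide.
FE_BUFFER_DEPTH_US = 12.5

FIXED_STEPS_US = {
    "stub transmission to back-end": 1.0,
    "correlation of trigger primitives": 3.5,
    "L1-Accept broadcast to front-end": 1.0,
    "safety margin": 3.0,
}


def track_finding_budget_us(buffer_depth=FE_BUFFER_DEPTH_US, steps=FIXED_STEPS_US):
    """Latency left for track finding once the fixed steps are accounted for."""
    return buffer_depth - sum(steps.values())


budget_us = track_finding_budget_us()
```
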
TRACK FINDER ARCHITECTURE
Outer Tracker cabled into nonants
- Each DTC nonant: z+, z- (24 DTCs)
Use of time-multiplexing to increase parallelization
- Time-multiplexing directs data from multiple sources to a single processing node
- 1 event per processing node
- N time slices per M regions, e.g. 6 time slices x 24 regions
- Processors are independent entities → simplifies commissioning and operation
- Spare nodes available for redundancy
Two stages of data processing
- DAQ, Trigger and Control (DTC) layer
- Track Finding Processor (TFP) layer
All-FPGA processing system
- ATCA form factor; CMS standard dual-star backplane
[Diagram: DTC nonant 1 and DTC nonant 2 (z+, z-, 24 DTCs each) feeding a time-multiplexed processing slice of TFPs; 216 DTC boards, 144 TFP boards]

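The time-multiplexing idea can be sketched as a routing rule. The round-robin assignment and the node numbering below are assumptions made for illustration; the slide only states that there are N time slices per M regions and one event per processing node.

```python
BUNCH_CROSSING_RATE_MHZ = 40.0  # collision rate quoted earlier in the talk


def node_count(n_slices, m_regions):
    """Total number of processing nodes for N time slices x M regions."""
    return n_slices * m_regions


def target_node(event_number, region, n_slices, m_regions):
    """Pick the node that receives this region's data for a given event.

    Assumed scheme: consecutive events go to consecutive time slices
    (round-robin), and each region owns its own block of n_slices nodes.
    """
    if not 0 <= region < m_regions:
        raise ValueError("region out of range")
    return region * n_slices + event_number % n_slices


def node_event_period_ns(n_slices):
    """Time between two events arriving at the same node."""
    return n_slices * 1e3 / BUNCH_CROSSING_RATE_MHZ


# Example from the slide: 6 time slices x 24 regions.
N_SLICES, M_REGIONS = 6, 24
all_nodes = {
    target_node(evt, reg, N_SLICES, M_REGIONS)
    for evt in range(N_SLICES)
    for reg in range(M_REGIONS)
}
```

Under this scheme the 6 x 24 example uses 144 nodes, matching the TFP board count in the diagram, and each node sees a new event every 150 ns instead of every 25 ns.
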
TRACK FINDER ARCHITECTURE – DTC
DTC card must handle:
- <=72 modules (5G/10G lpGBT opto-links)
- Control/Readout for each module
- Direct L1 stream to central DAQ (16G/25G)
- Direct stub stream to 36 TFPs (16G/25G)
Stub pre-processing includes:
- Local → Global look up, position calibration
- Sort and pre-duplication
- Time-multiplexing
→ 216 DTC boards, 18 shelves, 1 rack/nonant
[Board diagram labels, shown twice: KU15P (A1760) with 44 GTH and 32 GTY; 3 x 12 @5.12/10.24 Gbps, 24 RX; 6 x 4 @25.6 Gbps; 36 RX/TX @2.56 Gbps; 3 x 12 TX; PCIe x 1 @5 Gbps; DAQ x 2 @25.6 Gbps]

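The board, shelf and rack counts quoted for the DTC layer are mutually consistent. A short check, using only numbers from the slides plus the fact that a nonant is one ninth of the detector (the variable names are illustrative):

```python
NONANTS = 9            # one nonant = one ninth of the detector
DTCS_PER_NONANT = 24   # "z+, z- (24 DTCs)" per nonant
DTC_SHELVES = 18       # quoted on the slide
RACKS = NONANTS        # "1 rack/nonant"

dtc_boards = NONANTS * DTCS_PER_NONANT        # total DTC boards
boards_per_shelf = dtc_boards // DTC_SHELVES  # DTC boards in each ATCA shelf
shelves_per_rack = DTC_SHELVES // RACKS       # shelves in each rack
```
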
TRACK FINDER ARCHITECTURE – TFP
TFP card must handle:
- Up to 72 DTCs (16G/25G optical links)
- Track finding from stubs
- Track fitting
- Transmission to L1 Correlator Trigger
High-bandwidth processing card:
- ~1 Tb/s processing bandwidth
- Rate to L1 Correlator much lower, < 30 Gb/s
→ 144 TF boards, 12-18 shelves
[Board diagram labels, shown twice: VUxP / KU115 (D1517) with 64 GTH; 72 DTC RX (6 x 12) @16/25 Gbps; 2 TX (1 x 12) @16/25 Gbps; PCIe x 1 @5 Gbps]

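The "~1 Tb/s" figure is consistent with the link counts on the slide. The estimate below multiplies raw line rates, which is an assumption: the usable payload after encoding overhead would be lower. Function and constant names are illustrative.

```python
DTC_LINKS_PER_TFP = 72           # "Up to 72 DTCs"
LINE_RATES_GBPS = (16.0, 25.0)   # the two link-speed options on the slide
L1_CORRELATOR_LIMIT_GBPS = 30.0  # "< 30 Gb/s" output rate


def input_bandwidth_tbps(n_links, line_rate_gbps):
    """Aggregate raw input bandwidth of one TFP in Tb/s."""
    return n_links * line_rate_gbps / 1000.0


bw_16g = input_bandwidth_tbps(DTC_LINKS_PER_TFP, LINE_RATES_GBPS[0])
bw_25g = input_bandwidth_tbps(DTC_LINKS_PER_TFP, LINE_RATES_GBPS[1])

# Lower bound on the input-to-output bandwidth ratio at 16 Gb/s per link.
min_reduction = bw_16g * 1000.0 / L1_CORRELATOR_LIMIT_GBPS
```

At 16 Gb/s per link the raw input is about 1.15 Tb/s, and at 25 Gb/s about 1.8 Tb/s, so the card reduces its input by a factor of roughly 40 or more before sending tracks to the L1 Correlator.
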
TRACK FINDING ALGORITHMS
(Check poster on Tuesday)
Two main algorithms for reconstructing tracks, plus a number of hybrids, variations and options.

TRACKLET APPROACH
- Combinatorial approach using pairs of stubs as seeds
- Extrapolation to other layers → hit matching
- Linearized χ2 fit on candidates
- Uses full-resolution stubs at earliest stage of processing
- N time-slices x M regions → 6 x 24, 9 x 18

HOUGH TRANSFORM + KALMAN FILTER APPROACH
- Uses a Hough Transform to detect coarse candidates
- Candidates are filtered and fitted in a single subsequent step using a Kalman Filter
- Combinatorial problem pushed to later stages of processing
- N time-slices x M regions → 18 x 9

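The coarse-candidate step of the Hough Transform approach can be sketched in a few lines. This is a toy software model under stated assumptions, not the firmware: it assumes a 3.8 T field, tracks from the beamline, the small-angle relation phi = phi0 - K * r * (q/pT) with an arbitrary sign convention, and illustrative bin counts, layer radii and layer threshold. All names are hypothetical.

```python
from collections import defaultdict

B_FIELD_T = 3.8               # assumed field in tesla
K = 0.5 * 0.2998 * B_FIELD_T  # bend constant, rad per (m * GeV^-1)


def hough_candidates(stubs, n_qpt=32, n_phi=64, pt_min=2.0,
                     phi_range=(-0.5, 0.5), min_layers=5):
    """Return [(q/pT, phi0, n_layers)] for cells hit in >= min_layers layers.

    stubs: iterable of (layer_id, r_m, phi_rad).
    """
    qpt_max = 1.0 / pt_min
    qpt_step = 2.0 * qpt_max / n_qpt
    phi_lo, phi_hi = phi_range
    phi_step = (phi_hi - phi_lo) / n_phi

    cells = defaultdict(set)  # (i, j) -> set of layers with a stub in the cell
    for layer, r, phi in stubs:
        for i in range(n_qpt):
            qpt = -qpt_max + (i + 0.5) * qpt_step
            phi0 = phi + K * r * qpt  # each stub is a line in (q/pT, phi0)
            j = int((phi0 - phi_lo) // phi_step)
            if 0 <= j < n_phi:
                cells[(i, j)].add(layer)

    found = []
    for (i, j), layers in cells.items():
        if len(layers) >= min_layers:
            found.append((-qpt_max + (i + 0.5) * qpt_step,
                          phi_lo + (j + 0.5) * phi_step,
                          len(layers)))
    return sorted(found, key=lambda c: -c[2])


# Toy input: one 5 GeV/c track crossing six layers at illustrative radii.
TRUE_QPT, TRUE_PHI0 = 0.2, 0.1
RADII_M = [0.25, 0.35, 0.5, 0.7, 0.9, 1.1]
toy_stubs = [(layer, r, TRUE_PHI0 - K * r * TRUE_QPT)
             for layer, r in enumerate(RADII_M)]
candidates = hough_candidates(toy_stubs)
```

Stubs from the same track produce lines that cross in one cell, so the cell with stubs from enough distinct layers gives a coarse (q/pT, phi0) estimate. In the approach on the slide, such candidates are then cleaned and fitted by the Kalman Filter.
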
HARDWARE DEMONSTRATORS
Both demonstrators were built in hardware and verified using emulation software, to validate the algorithms and measure latency. Both were tested with samples from PU 0 → 200.

CTP7-based demonstrator
- 4 CTP7 boards with Virtex-7 FPGA
  - 3 CTP7 cover 3 Φ sectors
  - 1 CTP7 emulates the DTC
- 1 AMC13 card for clock and synchronization
- 240 MHz internal fabric speed
- Measured latency of 3.33 μs, in agreement with latency model

MP7-based demonstrator
- One per time multiplexing and detector nonant
- Each box (in the diagram) is one MP7 board with Virtex-7 FPGA
- Can compare hardware output directly with software
- 240 MHz internal fabric speed
- Latency verified to be 3.5 μs

HYBRID ALGORITHMS
Efforts have started to merge the two approaches
- Working on defining a reference algorithm

R&D
Bristol University, Imperial College, Ioannina, INFN, KIT, RAL, SACLAY, TIFR

ATCA infrastructure
- Systematic thermal studies about air cross-section and impact on opto-lifetime
- Backplane signal integrity → important for DAQ/timing
Use of interposer technology
- Flexibility (e.g. FPGA)
- Mitigate losses/costs due to yield issues
- Modularity; separate complex and simpler parts of the board design
On-board computing and control variety
- Standard on-board PC (COM Express mini)
- Zynq SoC
- IPMC only
PCB design practices, stackup and material
- Build up relationship with manufacturers
[Photo labels: COM Express; Samtec Firefly x12 RX/TX pairs; Samtec Z-RAY interposer, 133 x 30 mm; clock test daughtercard; CERN-IPMC; FPGA daughtercards: KU115, KU15P, VU9P]

THERMAL SIMULATION AND TESTS
Simulation setup
- PCB imported from PADS
- Placed in a 33 mm deep tunnel
- 4 m/s airflow from bottom (20 °C) to top
Placed components
- KU15P (50 W), θJB doubled to take interposer into account
- Firefly banks 25 G (30 W) and 16 G (12 W)
- Total power 205.4 W
Test setup
- Two heat-pads, 45 mm x 45 mm and 12 mm x 70 mm
- Just one mockup board is present; it will soon be put in between two additional boards
- ~11 W for 6x block of 16 Gbps optics
- ~10 W for 6x block of 25 Gbps optics

Test1 (°C)
- 4x fan-block speed = 50%
- Exhaust temp ~17 °C (~amb)
- Power on FPGA heaters = 86 W
- Power on optics heaters = 41 W
  X0FTop     = 53.7    X1FTop     = 60.7
  X0FBottom  = 50.1    X1FBottom  = 59.1
  X0OFTop    = 35.8    X1ORTop    = 50.8
  X0OFBottom = 28.2    X1ORBottom = 49.7
  X0ORTop    = 37.2    X1OFTop    = 43.1
  X0ORBottom = 31.1    X1OFBottom = 41.7

SUMMARY
L1 track trigger at HL-LHC is necessary but also challenging
- pT modules provide first layer of efficient data reduction
Two all-FPGA approaches: Tracklet and TMTT
- Use high-performance FPGAs
- Highly parallelized tracking algorithms
- Data organization → pattern recognition → track fitting → duplicate removal
- Both have demonstrated feasibility and good performance
Efforts have started to merge the two approaches
- Working on defining a reference algorithm
- Common infrastructure R&D
  - ATCA thermal simulations and tests
  - Slow-control and shelf manager concept
  - High-speed optical link test