
Thoughts on alternate DUNE DAQ design. Georgia Karagiorgi, DUNE DAQ Meeting, Oct. 16, 2017. (Presentation transcript.)


  1. Thoughts on alternate DUNE DAQ design. Georgia Karagiorgi. DUNE DAQ Meeting, Oct. 16, 2017

  2. Introduction
  What is presented in this talk is a conceptual DAQ design & architecture for the DUNE FD. I will advocate that it should be explored further, and evaluated against DUNE DAQ requirements, as something that would be advantageous to move toward.
  Concept: online (real-time) image processing and data selection.
  • Event ID “on the fly”, minimizing offline processing and reconstruction needs; could potentially require much more minimal processing in the front-end DAQ.
  • The design being explored leverages advancements in Deep Neural Networks and their applications, and is informed and motivated by the Columbia Nevis Labs experience with the MicroBooNE readout/DAQ system:
    – Deep Learning development and results by the MicroBooNE & DUNE collaborations
    – Recent work done in collaboration with L. Carloni’s group (Columbia Comp. Sci.)
  • For this talk, the design is explored from the perspective of the single-phase detector; the expectation is that the same design is equivalently applicable to dual-phase as well.

  3. DUNE DAQ System Parameters: Data Rates
  Consider a single 10 kton module, with no data reduction:
  • Continuous readout rate: 150 APA x 2,560 ch x 2 MHz x 1.5 B = 1.1 TB/s
  • Single, localized event size (~size of an APA x 1 drift): 2,560 ch x 2 MHz x 1.5 B x 2.25 ms = 17.3 MB
  • Single, extended event size (all APAs x 2.4 drifts): 150 APA x 2,560 ch x 2 MHz x 1.5 B x 5.4 ms = 6.22 GB
  [Diagram: a localized event spans one APA and one drift; an extended event spans all APAs.]
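The arithmetic on this slide can be checked with a few lines (all inputs taken from the slide):

```python
# Back-of-envelope check of the slide's raw (no-reduction) data rates.
APAS = 150            # APAs per 10 kton module
CH_PER_APA = 2560     # channels per APA
FS = 2.0e6            # sampling rate, 2 MHz
BYTES = 1.5           # bytes per sample

# Continuous readout rate for the whole module (matches the slide's ~1.1 TB/s)
rate_Bps = APAS * CH_PER_APA * FS * BYTES
print(f"continuous: {rate_Bps / 1e12:.2f} TB/s")

# Localized event: one APA over one 2.25 ms drift window (~17.3 MB)
localized_B = CH_PER_APA * FS * BYTES * 2.25e-3
print(f"localized: {localized_B / 1e6:.1f} MB")

# Extended event: all APAs over 5.4 ms, i.e. 2.4 drifts (~6.22 GB)
extended_B = APAS * CH_PER_APA * FS * BYTES * 5.4e-3
print(f"extended: {extended_B / 1e9:.2f} GB")
```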

  4. DUNE DAQ System Parameters: Data Rates
  Consider a single 10 kton module, with data reduction (e.g. a factor of 500-1800, depending on noise and radiological backgrounds; see docdb-4481):
  • Continuous readout rate: 150 APA x 2,560 ch x 2 MHz x 1.5 B / (500-1800) = 0.6-2.2 GB/s
  • Single, localized event size (~size of an APA x 1 drift): 2,560 ch x 2 MHz x 1.5 B x 2.25 ms / (500-1800) = 10-35 kB
  • Single, extended event size (all APAs x 2.4 drifts): 150 APA x 2,560 ch x 2 MHz x 1.5 B x 5.4 ms / (500-1800) = 3.5-12 MB
  [Diagram: localized event vs. extended event, as on the previous slide.]
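The reduced rates follow by dividing the raw quantities by the endpoints of the assumed 500-1800 reduction range (numbers from the slide):

```python
# Reduced data rates for the assumed reduction factor range of 500-1800.
APAS, CH, FS, BYTES = 150, 2560, 2.0e6, 1.5
raw_rate = APAS * CH * FS * BYTES          # ~1.15e12 B/s, no reduction

for factor in (500, 1800):
    rate = raw_rate / factor                                 # continuous
    localized = CH * FS * BYTES * 2.25e-3 / factor           # one APA, one drift
    extended = APAS * CH * FS * BYTES * 5.4e-3 / factor      # all APAs, 2.4 drifts
    print(f"x{factor}: {rate/1e9:.1f} GB/s, "
          f"{localized/1e3:.0f} kB localized, {extended/1e6:.1f} MB extended")
```

The printed endpoints reproduce the slide's 0.6-2.2 GB/s, 10-35 kB, and 3.5-12 MB ranges.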

  5. DUNE DAQ System Parameters: Data Rates
  (Same parameters and reduced rates as the previous slide.)
  Note: a system which rests on noise assumptions and assumed data reduction factors is risk-prone.

  6. DUNE DAQ: Rethinking our challenge
  • DUNE is a 3D imaging device; the raw data format is ideally suited for deep-learning-based image processing techniques.
  • Promising performance for powerful image processing and classification.
  • E.g. performance with offline (GPU) training and inference: DUNE SP FD simulations [J. Hewes]; a VGG16 CNN trained to isolate n-nbar oscillation events from atmospheric neutrino backgrounds (more in Jeremy’s thesis) shows excellent separation between atmospheric-nu and n-nbar signal images.
  • MicroBooNE: CNNs successful in identification and differentiation among different particle types [JINST 12, P03011 (2017)].
  [Table: detection accuracy (%) and most frequent mis-ID (%) per particle type.]

  7. New DAQ philosophy
  Real-time image processing utilizing Deep Neural Networks (ideally on FPGA):
  • Minimize disk buffering needs (more on the back-end)
  • Minimize reconstruction needs
  • Minimize reliance on noise rates (trainability)
  Concerns:
  1. Speed of inference (per “image”): can we keep up with rates if we want to process every drift window (necessary for SN)?
  2. Reliability of inference: we already know MC-only training is deficient; how can we train reliably? Rare-event searches often have no “control” data samples.
  3. Changing detector conditions and the need for retraining: what features are DNNs most sensitive to? What retraining frequency? What resources does this require?
  4. How do we practically (re)train on data in real time?
  5. Cost, technology lifecycle, power consumption, lifetime, …
  Studies are needed to address the above concerns and demonstrate the feasibility of a DNN-based readout & DAQ scheme early on.

  8. A “stab” at a conceptual DAQ design
  [Block diagram; raw data flows left to right through three stages: data pre-processing, data pre-selection/selection, and disk writing.]
  • Cold electronics data → noise filtering (e.g. coherent noise removal) → data regrouping (e.g. by wire plane, by APA, by volume)
  • “Traditional” FPGA: active “frame” selection (e.g. minimum integrated charge in channel AND/OR external/beam trigger) → ROI finding (cropping in channel and time space)
  • DNN FPGA: ROI image classification (classes A/B/C/D/…), per plane or all 3 planes; full-frame classification (SN), per plane or all 3 planes
  • Reco. classes for localized events: A (high-E nu; external/beam/photodetector triggers), B (e.g. CRM), C (e.g. p->Knu), D (e.g. n-nbar), …; class X (SN) for extended events
  • Topology-targeted signal processing per class (per “slice”, e.g. 1 wire plane, 1 APA; extended events by APA) → disk writing
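The slide's block diagram describes a staged pipeline; a minimal software sketch of that flow is below (stage names follow the diagram, but all thresholds, the common-mode noise filter, and the placeholder classifier are my own illustrative assumptions, not a proposed implementation):

```python
# Minimal sketch of the slide's pipeline: noise filtering -> active-frame
# selection -> ROI finding -> DNN classification -> per-class processing.

def noise_filter(samples):
    """Stand-in for coherent noise removal: subtract the common-mode mean."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def active_frame(samples, min_charge=10.0, external_trigger=False):
    """Frame selection: minimum integrated charge AND/OR external trigger."""
    return sum(abs(s) for s in samples) >= min_charge or external_trigger

def find_roi(samples, threshold=2.0):
    """Crop the frame to the region above threshold (channel/time space)."""
    hits = [i for i, s in enumerate(samples) if abs(s) > threshold]
    return (min(hits), max(hits) + 1) if hits else None

def classify(roi_image):
    """Placeholder for the DNN stage: returns a reco class A/B/C/D/... or X."""
    return "A"  # a real system would run FPGA inference here

def process_frame(samples):
    filtered = noise_filter(samples)
    if not active_frame(filtered):
        return None                       # frame dropped
    roi = find_roi(filtered)
    if roi is None:
        return None
    label = classify(filtered[roi[0]:roi[1]])
    return {"class": label, "roi": roi}   # handed to class-specific reco + disk

result = process_frame([0.1, 0.2, 5.0, 6.0, 0.1, 0.0, 0.1, 0.2])
print(result)
```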

  9. A “stab” at a conceptual DAQ design (cont.)
  [Same block diagram as the previous slide.]
  Raw data (channel, ADC, TDC) flows from left to right, organized serially, in “frames”. A frame is O(1) drift and, e.g., 1 APA wire plane (a well-defined boundary). Every single frame is processed down to this stage (at least); then frames can optionally be dropped.

  10. A “stab” at a conceptual DAQ design (cont.)
  [Same block diagram as the previous slides.]
  What is this layer?
  • A layer developed and optimized for application of DNNs for both image selection and classification.
  • Can be a combination of FPGA and GPU devices: GPU for acceleration of training; FPGA for acceleration of inference.
  • During normal operations, DNNs implemented in FPGA select and classify frames/ROIs of interest.
  • The GPU allows semi-offline (re)training and adjusting to changing detector conditions.
  • After this layer, images are already classified; specialized, topology-targeted reconstruction can be applied separately on each event class.
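The FPGA/GPU division of labor described on this slide can be sketched as a small control loop (the class label, buffering scheme, and retrain interval below are illustrative assumptions, not a proposed implementation):

```python
# Sketch of the slide's FPGA/GPU split: the "FPGA" path classifies every
# frame online, while a "GPU" path periodically retrains on a buffered
# sample to adjust to changing detector conditions.

class DNNLayer:
    def __init__(self, retrain_every=1000):
        self.model_version = 0
        self.buffer = []                 # frames kept for semi-offline retraining
        self.retrain_every = retrain_every

    def infer(self, frame):
        """FPGA role: fast online selection/classification of one frame."""
        self.buffer.append(frame)
        if len(self.buffer) >= self.retrain_every:
            self.retrain()
        return "A"                       # placeholder class label

    def retrain(self):
        """GPU role: semi-offline retraining on the buffered data."""
        self.model_version += 1          # a real system would update DNN weights
        self.buffer.clear()

layer = DNNLayer(retrain_every=3)
labels = [layer.infer(f) for f in range(7)]
print(labels, layer.model_version)       # 7 frames classified, model retrained twice
```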
