Home design with BNL
Immediate plans for DUNE-UK hardware effort
Giles Barr
Oxford: UK DUNE DAQ project kickoff meeting 17-18 October 2019
Home design with BNL
• Summary:
  • Collaborating on a Versal design with BNL is very interesting indeed
  • BNL is less interested in a non-Versal design
  • The UK still has some open questions about Versal (such as cost, and when technical info and reference designs will become available)
• Plan:
  • We will proceed with system design studies of both a Zynq and a Versal design in parallel until more Versal info is available
  • To get serious commitment and detailed technical info from potential board manufacturers and from our studies, we will proceed with the Zynq design in 'full detail', i.e. sufficient to actually build the board, and will stop this when the Versal open questions are cleared up. [*]
[*] The item in blue is not to indicate a preference for non-Versal; on the contrary, the advantage is that we don't spend much extra effort by elevating the feasibility study to an actual design on the Zynq (where we have sufficient technical details, reference designs etc.). We thereby end up in a reduced-risk situation by being in possession of a practical design, and we have firmer knowledge of UK board manufacturer capability, power distribution, and other things that are similar between Zynq and Versal.
Home design with BNL (2)
• Overall goal is to end up with a board that can do all the things we need, designed in such a way that UltraScale+ firmware can be ported to it easily.
  • Versal will do this
  • (and Zynq if it continues to be a backup solution)
  • COTS boards will do this
• Use COTS UltraScale+ boards to develop firmware and software vigorously in parallel with hardware development.
  • We should have a bit of brainstorming to see how to do this (either now, or after the DAQ sprint)
Does the board 'do all the things we need'?
• Must present this question in upstream-DAQ meetings to gain consensus that what we intend to build does indeed do what we need.
• Will attempt to practice those arguments here today:
• Requirements:
  • Board capable of copying all data into host memory with FELIX firmware and software on the PCIe bus.
  • Board capable of providing hit-finding etc. in firmware.
  • DRAM memory and SSD buffering on board for trigger latency functions.
  • Maximum flexibility at design start (now) to tune things later:
    • Size of FPGA, number of boards, memory bandwidth, whether the memory feature is included at all, etc.
Essential Requirements at Conceptual Level
1. Online trigger and data selection
2. Pre-trigger RING buffer up to e.g. 10 s
3. Post-trigger 100 s FIFO buffer
4. DAQ up time >99% + minimize maintenance cost over 20 years
5. Reserve engineering margins for running additional and unknown processes in the future (new ideas).
6. Project restrictions: Some of the R&D and construction (hardware) should be in the UK.
7. Compatible with FELIX firmware suite. (desirable)
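Requirements 2 and 3 can be pictured with a toy software model. The sketch below is only an illustration of the ring-buffer-plus-FIFO idea, not the planned firmware: the class name `TriggerBuffer`, its methods, and the use of sample counts in place of seconds are all assumptions made for the example.

```python
from collections import deque


class TriggerBuffer:
    """Toy model: pre-trigger ring buffer feeding a post-trigger FIFO.

    Hypothetical illustration only. Depths are counted in samples,
    not seconds, and nothing here reflects the real DAQ design.
    """

    def __init__(self, pre_depth, post_depth):
        # Ring buffer: once pre_depth samples are held, the oldest is dropped.
        self.ring = deque(maxlen=pre_depth)
        # FIFO: samples retained around a trigger until they are read out.
        self.fifo = deque()
        self.post_depth = post_depth
        self.post_remaining = 0

    def push(self, sample, triggered=False):
        if triggered:
            # Keep the pre-trigger history and open the post-trigger window.
            self.fifo.extend(self.ring)
            self.ring.clear()
            self.post_remaining = self.post_depth
        if self.post_remaining > 0:
            # Inside the post-trigger window: sample goes straight to the FIFO.
            self.fifo.append(sample)
            self.post_remaining -= 1
        else:
            # No trigger active: sample only lives in the ring buffer.
            self.ring.append(sample)

    def read_out(self):
        """Drain the FIFO in arrival order."""
        out = list(self.fifo)
        self.fifo.clear()
        return out


buf = TriggerBuffer(pre_depth=3, post_depth=2)
for s in range(5):
    buf.push(s)               # samples 0..4, only the last 3 are retained
buf.push(5, triggered=True)   # trigger arrives with sample 5
buf.push(6)
buf.push(7)                   # outside the post-trigger window
selected = buf.read_out()
```

In this run `selected` holds the three pre-trigger samples followed by the two post-trigger samples; sample 7 stays in the ring buffer.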
The discussion focuses on these five approaches
A. ZU19: preferred next board choice in Oxford [Want to change this to ‘in UK’ after more discussion].
B. ZU15 (or 6 or 9): temporarily deprecated, but included for comparison.
C. KU15P: like ZU19 but no processor.
D. Versal: like ZU19, but one generation more modern.
E. Stop hardware development now: commercial boards will be available (e.g. Bittware VU9P)
Layout considerations
Take the FPGA-only raw cost from the UK proposal and see how many chips can be purchased; maximize SystemLogicCells (SysLog) per APA and BRAM per APA for this cost.
We have scaled our quantity/education discount from the quote for the UK grant request 2018; the latest devaluations of the £ are not included.
Arranged in tables below: major columns are how many FPGAs per APA.
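The comparison described on this slide amounts to simple arithmetic: for a fixed FPGA budget per APA, buy as many chips of each type as the budget allows and total the logic and memory. A minimal sketch follows. The function name, the chip names, and every price and resource figure are placeholders invented for the example; they are not the quoted prices or datasheet values.

```python
def resources_per_apa(budget_per_apa, catalogue):
    """For each chip type, buy as many as the budget allows.

    catalogue maps chip name -> (unit price, logic cells, BRAM in Mb).
    Returns rows of (name, chips per APA, total logic cells, total BRAM),
    ranked by total logic cells per APA, largest first.
    """
    rows = []
    for name, (price, logic_cells, bram_mb) in catalogue.items():
        n_chips = int(budget_per_apa // price)
        rows.append((name, n_chips, n_chips * logic_cells, n_chips * bram_mb))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Placeholder numbers only (assumed for illustration).
CATALOGUE = {
    "chip_small": (1000.0, 500_000, 20.0),
    "chip_large": (2500.0, 1_000_000, 50.0),
}

ranking = resources_per_apa(5000.0, CATALOGUE)
```

With these placeholder values the smaller chip gives more logic cells per APA while both give the same BRAM, which is the kind of trade-off the tables are meant to expose.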
Layout considerations
BRAM+URAM (previous slide was BRAM only)
Layout considerations
This slide is to show off that we did a lot of configurations, different sockets etc.
Architecture considerations
Objective is to find combinations of Xilinx chips giving a lot of flexibility, and to explore the cost optimum.
• Now (2019): We must choose the socket type now, and make sure there are enough links and bandwidth to handle the sensible configurations.
• When we order bulk (2022): Choose the Xilinx chip that fits into our socket. This gives a handle on the amount of logic and memory per link.
• When we turn on the detector (2026): Choose how many cards per APA are used, to fine-tune the logic and bandwidth.
Versal is good for 20 links per card, but too expensive for 10, 7, 5 or 4 links. So not much good for flexibility in 2026 unless the price drops.
Zynq is the opposite: good for 10, 7, 5 or 4 links, but under-resourced for 20 links per card. Zynq is overall cheaper (it is always the case that older is cheaper, until it goes out of production, which will probably be sooner). Simpler card to design.
Kintex is the same as Zynq but with no processor.
All these options are good, so we pick from a number of good ones.
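The "logic and memory per link" handle can be made concrete with a short calculation over the link counts named on this slide. This is a sketch under stated assumptions: the function name is hypothetical, and the chip resources passed in at the bottom are placeholder values, not figures for any real Zynq, Kintex or Versal part.

```python
# Links-per-card configurations named on the slide.
LINKS_PER_CARD = (20, 10, 7, 5, 4)


def resources_per_link(logic_cells, memory_mb, links_per_card=LINKS_PER_CARD):
    """Return {links per card: (logic cells per link, memory per link)}.

    One FPGA per card is assumed, so its resources are shared
    equally between the links that card serves.
    """
    return {n: (logic_cells // n, memory_mb / n) for n in links_per_card}


# Placeholder chip: 1,000,000 logic cells and 40 Mb of memory (assumed).
per_link = resources_per_link(1_000_000, 40.0)
```

Fewer links per card means more logic and memory behind each link, at the price of more cards per APA; that is the knob the 2026 step turns.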
Conclusions
• Some choices on the hardware need making early (compared with firmware and software, where the reverse gear is easier to find)
• It is not really that controversial: we have optimized and found two or three really good choices
• Important to now convince the upstream DAQ group (and data selection group) that we have selected the optimum
• Remaining choice is a very common one in DAQ:
  • Do we go with the new shiny generation, which is more expensive and which ATLAS like, but which leaves us less development time?
  • Or go for the one we know, where we can get a first version quickly? That is best if we want to do long-term reliability tests (this is what attracts me to it), but it will go out of production sooner.
  Pleasingly, COTS options exist.
  Will study both these as more info about Versal emerges; will do the Zynq with the intention of building it, which gives us hardware in pocket well within schedule.
• All these should be straightforward to port a standard FELIX/DUNE-trig suite to.
• Use evaluation boards to develop firmware in parallel with hardware
  • Evaluation boards also serve as a fallback solution
Backup
Architecture considerations
The following four architectures for DUNE have been discussed widely:
Option 1 = FELIX-CPU only, no memory or SSD on HW board.
Option 2 = FELIX with memory and SSD but no processor. Can run the planned IPBus-based firmware and FELIX firmware.
Option 3 = As option 2 but with CPU in FPGA. Allows CPU-assisted firmware architecture. As fallback (not proposing development under option 3) can send PCIe traffic over Ethernet from CPU instead.
Option 4 = As option 3 but no FELIX. Not currently proposing this, but included for cost comparison.

Here is how they match up with the five approaches on previous slide:

Option  A  B  C  D  E
1       Y  N  Y  Y  Y
2       Y  N  Y  Y  Y
3       Y  N  N  Y  N
4       Y  Y  N  Y  N

A (ZU19) and D (Versal) are the most flexible.
C (KU15) and E (VU9P) have no processor so exclude option 3. B is not as flexible.
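The Y/N table on this slide can be written down as data and queried directly. The sketch below copies the table entries as they appear on the slide; the variable and function names are hypothetical.

```python
# Columns: approaches A..E. Rows: architecture options 1..4.
# Entries copied from the Y/N table on the slide.
APPROACHES = "ABCDE"
ROWS = {
    1: "YNYYY",
    2: "YNYYY",
    3: "YNNYN",
    4: "YYNYN",
}


def supports(option):
    """Approaches marked Y for the given architecture option."""
    return [a for a, flag in zip(APPROACHES, ROWS[option]) if flag == "Y"]


def most_flexible():
    """Approaches marked Y for every architecture option."""
    return [
        a
        for i, a in enumerate(APPROACHES)
        if all(row[i] == "Y" for row in ROWS.values())
    ]


option3 = supports(3)
flexible = most_flexible()
```

Both queries return A and D, matching the statement that ZU19 and Versal are the most flexible.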