The Correlator for the Tianlai Experiment
Jie Hao
National ASIC Design Engineering Center, Institute of Automation, Chinese Academy of Sciences
Email: jie.hao@ia.ac.cn
Outline
• System Design
• Hardware Design
• Algorithm Design
• Control Software Design
• Questions and Solutions
System Design
[Block diagram. Labels: AD (250 MSPS, 14 bits); PFB & FFT on FPGA; 504 ports; 8 bits; 192 inputs; 32 bits; SRIO switch array; correlator (DSP array); 10G Ethernet; SATA; trigger; clock; AD/FFT data; control PC; GPS 10M; 10M to 250M]
Data Flow
AD (14 bits) → FFT, 2K points (FPGA, fixed point, 8 bits) → Correlator (DSP, floating point, IEEE 754 std., 32 bits)
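The bit widths on this slide can be illustrated with a small sketch. It assumes, beyond what the slide states, that "8 bits" applies separately to the real and imaginary parts of each FFT output and that the width is reduced by a rounding right shift with saturation. The names `requantize_int8` and `correlate_sample` and the shift value are hypothetical.

```python
def requantize_int8(value: int, shift: int) -> int:
    """Reduce a wide fixed-point value to a signed 8-bit value.

    Assumption: rounding right shift followed by saturation.
    """
    if shift > 0:
        value = (value + (1 << (shift - 1))) >> shift  # round half up
    return max(-128, min(127, value))                  # saturate to the int8 range


def correlate_sample(a, b) -> complex:
    """Multiply one 8-bit complex sample by the conjugate of another in floating point.

    a and b are (real, imag) pairs of small integers.
    """
    za = complex(float(a[0]), float(a[1]))
    zb = complex(float(b[0]), float(b[1]))
    return za * zb.conjugate()
```

For example, `requantize_int8(1000, 4)` keeps the value in range, while a value too large for 8 bits is clipped to 127 rather than wrapping around.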
Tianlai Digital Backend
Hardware Design
• Data sampling (ADC + FFT)
  - ADC based on FMC (FPGA Mezzanine Card) carrier board
  - FFT based on FPGA (Virtex-6) board
  - Rear board
• Data processing (correlator)
  - Based on DSP (TMS320C6678)
• Data switch
  - RapidIO switch
• Data storage
  - 10G Ethernet
  - SATA array
GD2FPGA Data Flow
[Block diagram. Labels: FPGA0, FPGA1, FPGA2; Virtex5 (50T); 550T/475T; 2xFMC (HPC); USB; QSFP; clock; trigger; DDR3 and QDR (two of each); GTX (X4)/20Gbps links; GTX (X16)/80Gbps links; GTX (X4) and GTX (X8) links marked "only for 550T/475T"; LVDS; PCIe x1; 10/100/1000M Ethernet; power group; power; backplane connectors XP1 to XP9]
GD2FPGA board: Data Sampling (ADC + FFT)
[Board photo. Labels: FPGA (Virtex5), FPGA (Virtex6), FPGA (Virtex6)]
Reconfigurable mother board
GD2FPGA Board
Data processing board + FMC carrier board
Features
• 6U standard size
• 2xVirtex-6 FPGA (XC6VLX240T-2FFG1759/XC6VLX315T-2FFG1759 or XC6VLX550T-2FFG1759/XC6VLX475T-2FFG1759)
• 16G DDR3-SDRAM
• 288M QDR
• 2xFMC (HPC) expansion slots
• 1xUSB 2.0
• 2xPCIe
• 10/100/1000M Ethernet
• 2xGTX through front plane (QSFP connector): 40Gbps (RapidIO)
• 8xGTX (X4) through backplane (ZD connector): 160Gbps (XC6VLX240T-2FFG1759/XC6VLX315T-2FFG1759)
• or 12xGTX (X4) through backplane (ZD connector): 240Gbps (XC6VLX550T-2FFG1759/XC6VLX475T-2FFG1759)
• Soft IP (high speed interface)
  - RapidIO
  - 10G Ethernet
Roach2 FPGA Power PC 4XSFP+ GD2FPGA Roach2
Rear board
• Connecting boards through cable (Molex)
• Signal testing
10Gb Ethernet rear board
• 4xBCM8747
• 12xSFP+
FMC card (ADC)
• 8 channels
• 4xADS62P49
• External trigger & clock
• 250 MSPS, 14-bit A/D
• Supports multiple clocks
• Versatile, industry-standard VITA 57.1 FMC
Data processing (correlator)
[Block diagram. Labels: DSP1 to DSP8; SGMII switch; 2x88E1111; MDI; RJ45x2; FPGA S6; SRIO (X4)/20Gbps; PCIe (x1); power group; power; backplane connectors XP1 to XP9]
Data processing board GD8DSP (based on DSP)
GD8DSP Board
Data processing board
Features
• 6U standard size
• 8xTMS320C6678
• 16G DDR3-SDRAM
• 4xRapidIO (X4) through backplane (ZD connector): 80Gbps
• 8xPCIe (X2) through backplane (ZD connector): 80Gbps
• 2560 GMAC (multiply-accumulate)
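The headline figures for this board can be cross-checked with simple arithmetic. The per-device numbers used below (8 cores per TMS320C6678, a 1.25 GHz clock, 32 16-bit MACs per core per cycle, and 5 Gbps per lane for both RapidIO and PCIe) are assumptions based on typical vendor figures; they are not given on the slide, and the function names are illustrative.

```python
DSP_PER_BOARD = 8            # from the slide: 8xTMS320C6678
CORES_PER_DSP = 8            # assumption: cores per TMS320C6678
CLOCK_GHZ = 1.25             # assumption: core clock
MACS_PER_CORE_CYCLE = 32     # assumption: 16-bit fixed-point MACs per core per cycle


def board_gmac() -> float:
    """Peak multiply-accumulate rate of the board in GMAC/s."""
    return DSP_PER_BOARD * CORES_PER_DSP * CLOCK_GHZ * MACS_PER_CORE_CYCLE


def link_gbps(ports: int, lanes_per_port: int, lane_gbps: float = 5.0) -> float:
    """Aggregate raw line rate of a group of serial ports in Gbps."""
    return ports * lanes_per_port * lane_gbps


print(board_gmac())       # peak GMAC/s under the assumptions above
print(link_gbps(4, 4))    # 4xRapidIO (X4)
print(link_gbps(8, 2))    # 8xPCIe (X2)
```

Under these assumptions the three results match the 2560 GMAC, 80Gbps and 80Gbps quoted in the feature list.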
GDSRIOSW board Data switch board Front plane Rear plane
GDSRIOSW board
Features
• 6U standard size
• 4xIDT CPS1848 (5~6.25Gbps/lane) serial RapidIO switches
• Supports RapidIO 1.0, 2.0, 2.1
• 8xQSFP: 160Gbps
• 28xRapidIO (X4) through backplane (ZD connector): 560Gbps
• Supports backplane star topology
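The aggregate rates quoted for the switch board are raw line rates. A rough sketch of how they relate to usable payload follows, assuming each lane runs at 5 Gbps (the lower end of the quoted range), four lanes per QSFP, and the 8b/10b line coding used by Serial RapidIO 1.x/2.x. Protocol overhead above the line coding is ignored, and the function names are illustrative.

```python
def raw_gbps(ports: int, lanes_per_port: int, lane_gbps: float) -> float:
    """Raw aggregate line rate in Gbps."""
    return ports * lanes_per_port * lane_gbps


def payload_gbps(raw: float) -> float:
    """Data rate left after 8b/10b coding (8 data bits per 10 line bits)."""
    return raw * 8 / 10


backplane = raw_gbps(28, 4, 5.0)   # 28xRapidIO (X4) through the backplane
front = raw_gbps(8, 4, 5.0)        # 8xQSFP, assumed to carry 4 lanes each

print(backplane, payload_gbps(backplane))
print(front, payload_gbps(front))
```

The raw values reproduce the 560Gbps and 160Gbps on the slide; the payload values are lower by the 20% coding overhead.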
RapidIO vs. 10G Ethernet
Virtex6’s GTX transceivers: up to 6.6 Gb/s
[Diagram comparing two FPGA links. Labels: RapidIO, four GTX lanes at 5Gbps; 10 GE, four GTX lanes at 2.5Gbps, PHY]
The rear board of GDSRIOSW board GDSRIOSW board & rear board 28xRapidIO(X4)
Switching System (fat tree)
[Diagram: fat-tree topology. Labels: Pod1 ... pod8, 128 servers]
Switching System
[Wiring diagram: switch ports labeled A11 to D48, grouped into Pod 1 to Pod 8]
High Speed Interface Testing
[Test result figures at 1.25Gbps, 2.5Gbps, 3.125Gbps and 5Gbps]
Data storage
• SATA array + 10G Ethernet card
• Parallel processing
• >3.2Gbps
Algorithm Design
• Algorithm on FPGA
• Algorithm on DSP
• Interface program
• Switch schedule
Based on Simulink Development Environment
[Model screenshot. Blocks: ADC, PFB, FFT, Correlator]
Correlation
The Fourier domain cross-correlation is:
Plan A: Full correlation
Plan B: 1D DFT correlation
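The equation from this slide is not preserved in the text. As a minimal sketch of Plan A only, the code below assumes the conventional FX form, in which the product for inputs i and j in frequency channel k is X_i(k) multiplied by the complex conjugate of X_j(k), computed for every pair with i <= j. The function names are illustrative, a plain DFT stands in for the FPGA PFB/FFT stage, and Plan B is not shown.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a short sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def full_correlation(spectra):
    """Cross-multiply all input pairs, channel by channel.

    spectra[i][k] is frequency channel k of input i.
    Returns {(i, j): [X_i(k) * conj(X_j(k)) for each k]} for i <= j.
    """
    products = {}
    for i in range(len(spectra)):
        for j in range(i, len(spectra)):
            products[(i, j)] = [
                a * b.conjugate() for a, b in zip(spectra[i], spectra[j])
            ]
    return products


def n_products(n_inputs: int) -> int:
    """Number of pairs (i, j) with i <= j, auto-correlations included."""
    return n_inputs * (n_inputs + 1) // 2


spectra = [dft([1, 0, 0, 0]), dft([0, 1, 0, 0])]
vis = full_correlation(spectra)
print(len(vis), n_products(192))
```

The pair count grows quadratically with the number of inputs, which is why the multiply-accumulate load dominates a full correlation.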
System Control Software
• System bootload
  - Flash
  - Ethernet
• Control instruction
  - System/algorithm parameters
  - Control word format
• Information display (Eclipse)
  - Original AD data display
  - Board information collection
  - High speed interface information display
  - Board temperature information display
  - Power voltage and current information display
• Remote control
Control and Display Interface System bootload control High speed Interface Information
Control and Display Interface Original AD data display
Control and Display Interface Power voltage and Current information Board temperature information
Questions and Solutions “Noise”
How does it work?
[Diagram: one piece of correlation results (INT) in MSMCSRAM is divided into Block 0, Block 1, ..., Block N-1. In loop 0, loop 1, ..., loop N-1, one block (INT) at a time is held in L1DRAM, converted from int to float, and transferred by EDMA to be accumulated into the matching float block of the accumulations (FLOAT) in DDR.]
How did that happen?
Since the data transfer is done by the EDMA, it is not controlled by the cores. If the data for loop n is flushed in before the results of loop n-1 have been moved out, the final results of loop n-1 can be overwritten by the data of loop n, which produces the “noise”.
Solution
To avoid this, we control the time gap between consecutive loops by inserting a suitable time delay.
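The overwrite can be reproduced with a small deterministic model. This is only an illustration of the ordering problem, not the DSP code: `staging` stands in for the block held in L1DRAM, `ddr` for the float accumulations in DDR, and the `wait_for_drain` flag models a delay long enough for the previous transfer to finish. All names are hypothetical.

```python
def simulate(blocks, wait_for_drain):
    """Model one staging buffer shared by consecutive loops.

    blocks[n] is the integer result produced in loop n.
    Returns the float value accumulated in DDR for each loop.
    """
    ddr = [0.0] * len(blocks)
    staging = None      # current content of the staging buffer
    pending = None      # loop whose result has not been moved out yet

    for n, value in enumerate(blocks):
        if wait_for_drain and pending is not None:
            ddr[pending] += staging       # previous result leaves first
            pending = None
        staging = float(value)            # loop n's data lands in the buffer
        if pending is not None:
            ddr[pending] += staging       # late transfer picks up loop n's data
        pending = n

    ddr[pending] += staging               # move out the last result
    return ddr


print(simulate([1, 2, 3], wait_for_drain=False))
print(simulate([1, 2, 3], wait_for_drain=True))
```

Without the wait, the slot for loop n-1 receives the value of loop n; with it, every slot receives its own value.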
Thank You!