
Review Numbers Formats and Simple Arithmetic FPGA Structure (CLB, - PowerPoint PPT Presentation


  1. Review • Number Formats and Simple Arithmetic • FPGA Structure (CLB, Routing, IO, Clocks) • Pipelining (Resources vs. Speed vs. Latency) • Memories and Waveform Generation • ADC and DAC Applications in DSP • Constraints (Timing and Placement) • More Complex Arithmetic (Series Expansion and the CORDIC algorithm for sin & cos) • DSP Resources (DSP48E2 Block) • Filtering: FIR and IIR Implementations • Serial multi-rate DSP (decimation and interpolation) and applications

  2. Looking Forward • Multi-Rate, Parallel DSP (1 week) • FFTs (2 weeks) • Digital Compensation (1 week) • PLLs • AGCs • Complete DSP Chains & SDRs (2 weeks) • Miscellaneous (1 week) • Pseudo-Random Noise Generators and CRC checks • PWM and PDM (audio systems)

  3. Parallel Processing • In some instances, the timing requirements cannot be met with a serial process even after a DSP function is fully pipelined. • Example: • In desktop computers, video processing requires the values of many pixels to be computed within each refresh period. • Since many of the operations are independent, GPUs are well suited to handle the computational load in a parallel fashion.

  4. Parallel Processing • FPGAs are well suited to handle parallel tasks. • We need to understand what can be computed independently, or how to modify the DSP algorithm to work in a parallel fashion. • As with pipelining, there is a trade-off between resource usage and achievable clock rate. • Common Applications • FFTs • Video Processing • GSPS ADCs and DACs

  5. Parallel Processing • Many FPGAs now have dedicated hardware components to support high-speed data converters that operate at rates exceeding the FPGA fabric clock rate. • Gigabit Transceivers • Serializers • Deserializers • RFSoC Integrated ADCs and DACs • Extreme care must be taken to understand clock rates and data formats. • ODDR processing of the DAC channels on the dev board.

  6. Gigabit Transceiver

  7. Zynq RFSoC • DATA_ADC0[127:0]: 8x 16-bit samples • Up to 16 converters: 128 values per clock cycle. • DATA_ADC0[255:0]: 16x 16-bit samples • Up to 16 converters: 256 values per clock cycle.
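
To make the packing on this slide concrete, here is a minimal Python sketch that splits one 128-bit DATA_ADC0 word into eight 16-bit samples. The helper names `pack_samples` and `unpack_samples` are hypothetical. Two things are assumed rather than taken from the slide: that sample 0 sits in the least-significant 16 bits, and that samples are two's complement. Check both against the RF data converter documentation.

```python
def pack_samples(samples, width=16):
    """Pack signed samples into one bus word (sample 0 in the LSBs: an assumption)."""
    mask = (1 << width) - 1
    word = 0
    for i, sample in enumerate(samples):
        word |= (sample & mask) << (i * width)
    return word


def unpack_samples(word, n_samples=8, width=16):
    """Split a packed bus word into signed samples (two's complement: an assumption)."""
    mask = (1 << width) - 1
    samples = []
    for i in range(n_samples):
        raw = (word >> (i * width)) & mask
        if raw & (1 << (width - 1)):   # sign bit set, so the value is negative
            raw -= 1 << width
        samples.append(raw)
    return samples
```

With `n_samples=16` the same helper handles the 256-bit form of the bus.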

  8. Zynq RFSoC • I/Q mixers, decimation, and interpolation are all implemented in dedicated hardware.

  9. Zynq RFSoC

  10. Serializer • Part of the IO Logic • Data_In: D8 to D1 • Data_Out: OQ • Achieves output data rates of up to 14x the fabric clock rate.

  11. Detailed View • 4-to-1 • Signals • 2 clocks • Global Clock: Slower clock from FPGA fabric. • IO Clock: High-speed Input/Output clock. • SDR and DDR • IO Data • 4 input lines • 1 output line • Enables • Training Data

  12. Detailed View • Structure • Registers • Two Columns • Parallel Load • Global Clock • Shift Regs, Serialized output. • IO Clock • Muxes • Shift data from parallel load to Shift Registers. • Use Training Data. • Width Expansion

  13. Detailed View • Operation • Global Clock • Loads parallel data from D4 to D1. • Strobe • Not used on OSERDESE2. • Selects mux to shift parallel data into shift registers. • I/O Clock • When Strobe is high, loads shift registers with new data. • When Strobe is low, shifts out the serial data. • Train pin selects preset training data rather than D4 to D1.
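
The load-then-shift behavior described above can be sketched as a small behavioral model. This is a simplified Python illustration, not a cycle-accurate model of any Xilinx primitive. The class name `Serializer4to1`, the default training word, and the choice that D1 is shifted out first are all assumptions made for the example.

```python
class Serializer4to1:
    """Simplified 4:1 serializer: parallel load on the global clock, shift on the IO clock."""

    def __init__(self, training_word=(0, 1, 0, 1)):  # hypothetical training pattern
        self.training_word = tuple(training_word)
        self.parallel = (0, 0, 0, 0)   # register loaded by the global clock
        self.shift = [0, 0, 0, 0]      # shift register clocked by the IO clock

    def global_clock(self, d1, d2, d3, d4):
        """Capture the parallel inputs D1..D4."""
        self.parallel = (d1, d2, d3, d4)

    def io_clock(self, strobe, train=False):
        """One IO clock edge; returns the bit presented on the serial output."""
        if strobe:
            # Strobe high: load the shift register (training data if `train` is set).
            source = self.training_word if train else self.parallel
            self.shift = list(source)
        else:
            # Strobe low: shift the next bit toward the output.
            self.shift = self.shift[1:] + [0]
        return self.shift[0]           # D1 first (assumed bit order)


def serialize(words, train=False):
    """Serialize a list of (D1, D2, D3, D4) words, one word per global clock."""
    ser = Serializer4to1()
    out = []
    for word in words:
        ser.global_clock(*word)
        for tick in range(4):          # four IO clocks per global clock
            out.append(ser.io_clock(strobe=(tick == 0), train=train))
    return out
```

For example, `serialize([(1, 0, 1, 1), (0, 0, 1, 0)])` yields the eight bits in load order. Real hardware adds pipeline latency between the parallel load and the first output bit.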

  25. Example

  26. Example: 2.5 GSPS DAC • Think about the clock and data requirements. • The device requires two deinterleaved DDR data paths. • Data Rate (per path): 2.5 GSPS/2 = 1.25 GSPS • Clock Freq (per path): 2.5 GHz/4 = 625 MHz • The FPGA OSERDES IO clock operates in DDR mode. • We will drive each data path with an 8:1 OSERDESE2. • This choice is based on the available FPGA (the host processor in the figure).
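
The rate arithmetic for this example can be checked with a short calculation. The function name `dac_clocking` and its parameters are hypothetical. The sketch assumes the DAC sample rate is split evenly across the data paths and that a DDR interface moves two bits per IO clock cycle.

```python
from fractions import Fraction


def dac_clocking(sample_rate_hz, n_paths=2, serdes_ratio=8, ddr=True):
    """Derive per-path data rate, IO clock, and fabric (global) clock for a serialized DAC."""
    path_rate = Fraction(sample_rate_hz) / n_paths          # samples/s on each data path
    io_clock = path_rate / 2 if ddr else path_rate          # DDR: two bits per clock cycle
    global_clock = path_rate / serdes_ratio                 # one parallel load per N bits
    return {
        "path_rate_hz": float(path_rate),
        "io_clock_hz": float(io_clock),
        "global_clock_hz": float(global_clock),
        "samples_per_global_clock": n_paths * serdes_ratio,
    }


result = dac_clocking(2_500_000_000)
```

For the 2.5 GSPS device this gives 1.25 GSPS per path, a 625 MHz IO clock, and a 156.25 MHz fabric clock, so the fabric must supply 16 new samples on every fabric clock cycle.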

  27. Example: 2.5 GSPS DAC • FPGA Requirements • Global Clock • Not DDR • 1.25 GSPS/8 • 156.25 MHz • Every clock cycle must update 16x 14-bit data samples. • Figure: DB0 (14 bits) is driven by 8:1 SerDes blocks carrying samples D0, D2, ..., D14; DB1 (14 bits) is driven by 8:1 SerDes blocks carrying samples D1, D3, ..., D15; every SerDes is clocked by GCLK and IOCLK. • Code: reg [13:0] D [15:0]; always@(posedge GCLK) begin D[15] <= ?; D[14] <= ?; ... D[0] <= ?; end
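
One way to read the figure is that each output pin carries one bit position of eight consecutive samples. The Python sketch below shows that bit-slicing under stated assumptions: even-indexed samples go to DB0 and odd-indexed samples to DB1, as the figure labels suggest, and the earliest sample is first in each 8-bit word. The function name `serdes_words` is hypothetical.

```python
def serdes_words(samples, width=14):
    """Map 16 samples D[0]..D[15] onto per-pin 8-bit serializer words.

    Returns (db0, db1). Each is a list of `width` tuples, and tuple `b` holds
    bit `b` of the eight samples on that path, earliest sample first.
    """
    if len(samples) != 16:
        raise ValueError("expected 16 samples per global clock")

    even = samples[0::2]   # D0, D2, ..., D14 -> DB0 (per the figure labels)
    odd = samples[1::2]    # D1, D3, ..., D15 -> DB1

    def slice_bits(path_samples):
        # One 8-bit word per data pin: the same bit position across 8 samples.
        return [tuple((s >> bit) & 1 for s in path_samples) for bit in range(width)]

    return slice_bits(even), slice_bits(odd)
```

Each tuple is the parallel word one 8:1 serializer would load on a GCLK edge, so every path needs 14 serializers, one per data bit.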

  28. Clock Gen. and Dist. • Informational Resources • IOSERDES: SelectIO User Guide • BUFR & BUFIO: Clocking User Guide • Instantiation: Libraries Guide

  29. Example: 2.5 GSPS DAC • FPGA Requirements • Global Clock • Not DDR • 1.25 GSPS/8 • 156.25 MHz • Every clock cycle must update 16x 14-bit data samples. • Figure: the same DB0 and DB1 data paths as on slide 27, with clocking added: DCI is driven by an 8:1 SerDes loaded with the fixed pattern 0 1 0 1 0 1 0 1; a BUFR (DIV) supplies GCLK and a BUFIO supplies IOCLK. • Code: reg [13:0] D [15:0]; always@(posedge GCLK) begin D[15] <= ?; D[14] <= ?; ... D[0] <= ?; end
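
The clocking relationships in this figure can be checked numerically. This sketch assumes the DCI serializer runs at the same 1.25 Gb/s bit rate as the data pins and that the BUFR divides the IO clock down to the fabric clock. Both helper names are hypothetical.

```python
def toggle_frequency_hz(pattern, bit_rate_hz):
    """Frequency of the square wave made by repeatedly serializing an alternating pattern."""
    bits = list(pattern)
    rotated = bits[1:] + bits[:1]
    if any(a == b for a, b in zip(bits, rotated)):
        raise ValueError("pattern must alternate on every bit, including the wrap-around")
    return bit_rate_hz / 2.0           # one full clock period spans two bits


def bufr_divide_ratio(io_clock_hz, global_clock_hz):
    """Integer divide ratio needed to derive the global clock from the IO clock."""
    ratio = io_clock_hz / global_clock_hz
    if ratio != int(ratio):
        raise ValueError("clocks are not related by an integer ratio")
    return int(ratio)


dci_hz = toggle_frequency_hz([0, 1] * 4, 1.25e9)
divide = bufr_divide_ratio(625e6, 156.25e6)
```

Under these assumptions the 0 1 0 1 pattern yields a 625 MHz forwarded clock, and the BUFR divide value is 4, which is the 8:1 ratio halved for DDR.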

  30. Waveform Generation • Waveform Generation for a serializer becomes more complicated. • Start Simple: Using the previous example, how would a linear ramp be generated?
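
As one possible answer to the ramp question, the fabric has to produce 16 consecutive ramp values on every GCLK edge and then advance the base by 16. Below is a minimal behavioral sketch in Python, assuming 14-bit unsigned samples that wrap at full scale. The function name `ramp_blocks` and its parameters are hypothetical.

```python
def ramp_blocks(n_clocks, samples_per_clock=16, width=14, step=1, start=0):
    """Return one list of samples per global clock for a wrapping linear ramp."""
    modulus = 1 << width               # 14-bit samples wrap at 16384
    base = start
    blocks = []
    for _ in range(n_clocks):
        # D[k] = base + k*step for k = 0..15, all computed in the same clock cycle.
        blocks.append([(base + k * step) % modulus for k in range(samples_per_clock)])
        # The next clock cycle starts where this block ended.
        base = (base + samples_per_clock * step) % modulus
    return blocks
```

In hardware this corresponds to 16 parallel registers, each initialized with a different offset and each incremented by 16 times the step on every clock, rather than a single counter incremented by one.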
