Ted N. Booth, DesignLinx Hardware Solutions, September 2015: Using Vivado HLS for Video Algorithm Implementation for Demonstration and Validation


  1. Ted N. Booth DesignLinx Hardware Solutions September 2015

  2. Using Vivado HLS for Video Algorithm Implementation for Demonstration and Validation

  3. Agenda • Project Description • HLS Lessons Learned • Summary

  4. Project Description • Create a platform for developing and demonstrating video IP • Faster validation for large IP and/or large images • Better demonstrations for customers • Hardware based on the Virtex-7 VC709 evaluation board and a host PC • Use Ethernet to transfer frames to and from the host PC • Use Video DMAs to move data between the IP and external memory • Use a MicroBlaze processor to initialize and control the system • Use Vivado HLS for IP development • Faster conversion from C/C++ to IP • Support for different implementations with the same code

  5. HLS Architecture Options • Supports different implementations with the same code • Real-Time (II = 1): 3 elements/clock cycle • Minimum Resources (II = 3): 3 elements/3 clock cycles

  6. What is Initiation Interval (II)? Number of clock cycles before the function can accept new input data

       ROW_LOOP : for (row = 0; row < rows; row++) {
         COL_LOOP : for (col = 0; col < cols; col++) {
           #pragma HLS PIPELINE II=? // 1 or 3
           RGB_LOOP : for (rgb = 0; rgb < 3; rgb++) {
             out[row][col][rgb] = in[row][col][rgb] …
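
To put the two II options in numbers, here is a minimal sketch (plain C++, not taken from the presentation) of the usual first-order estimate for a pipelined loop: roughly `(trip_count - 1) * II + depth` clock cycles. It assumes ROW_LOOP and COL_LOOP are flattened into one pipelined loop with RGB_LOOP fully unrolled. The names `pipelined_cycles` and `frame_cycles`, and the pipeline depth of 4 used in the note below, are hypothetical values chosen for illustration.

```cpp
#include <cstdint>

// First-order cycle estimate for a pipelined loop (an assumption, not a
// figure from the slides): one new iteration starts every `ii` cycles and
// the last iteration needs `depth` cycles to drain.
std::uint64_t pipelined_cycles(std::uint64_t trip_count, unsigned ii, unsigned depth) {
    if (trip_count == 0) {
        return 0;  // nothing to process
    }
    return (trip_count - 1) * ii + depth;
}

// Estimate for one frame, assuming the row and column loops are flattened
// into a single pipeline that handles one pixel (3 color elements) per iteration.
std::uint64_t frame_cycles(unsigned rows, unsigned cols, unsigned ii, unsigned depth) {
    return pipelined_cycles(static_cast<std::uint64_t>(rows) * cols, ii, depth);
}
```

Under these assumptions, `frame_cycles(1080, 1920, 1, 4)` gives about 2.07 million cycles and `frame_cycles(1080, 1920, 3, 4)` about 6.22 million, which is the throughput-versus-resources trade-off between the two options. The synthesis report remains the authoritative source for actual latency.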

  7. HLS Lessons Learned: Use optimized HLS libraries • OpenCV – Open Source Computer Vision • HLS Video – HLS video infrastructure • HLS Math – C/C++ math libraries (fixed and float) • HLS IP – Xilinx IP such as FFT and FIR • HLS Linear Algebra – Common functions • HLS DSP – Common DSP functions for SDR

  8. HLS Lessons Learned: Create a library of optimized base modules • Optimize small modules to build larger modules • Use #define and function parameters • MAX_WIDTH and MAX_HEIGHT define hardware resources • Rows and Cols define the size of the image

       #define MAX_WIDTH 1920
       #define MAX_HEIGHT 1080
       void my_image (…, int Rows, int Cols) {
         …
       }

  9. HLS Lessons Learned: Learn to read the synthesis reports • Reports provide estimated utilization, timing, and latency • Reports are hierarchical, so you can inspect lower-level functions • Loops with variable bounds lead to undetermined latency values, shown as "?" in the report • Asserts or TripCount directives can be added to the code to bound loops that have variable bounds
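
As a hedged illustration of the TripCount option, the sketch below bounds two variable-length loops with the Vivado HLS `LOOP_TRIPCOUNT` pragma. The function `count_above` and its arguments are hypothetical and not from the presentation. To my understanding, `LOOP_TRIPCOUNT` only informs the latency report and does not change the generated hardware, whereas asserts can do both. A standard compiler ignores the pragmas, so the code also runs as ordinary C++.

```cpp
#define MAX_WIDTH 1920
#define MAX_HEIGHT 1080

// Hypothetical example: count the pixels in a rows x cols image that
// exceed a threshold. The loop bounds are run-time values, so the report
// would otherwise show "?" for latency.
int count_above(const unsigned char *pixels, int rows, int cols, unsigned char thresh) {
    int count = 0;
ROW_LOOP:
    for (int r = 0; r < rows; r++) {
#pragma HLS LOOP_TRIPCOUNT min=1 max=1080
    COL_LOOP:
        for (int c = 0; c < cols; c++) {
#pragma HLS LOOP_TRIPCOUNT min=1 max=1920
#pragma HLS PIPELINE II=1
            if (pixels[r * cols + c] > thresh) {
                count++;
            }
        }
    }
    return count;
}
```

The `max` values mirror MAX_HEIGHT and MAX_WIDTH, so the reported worst-case latency corresponds to a full 1920x1080 frame.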

  10. HLS Lessons Learned: Using asserts to define loop boundaries • Affects latency calculations and hardware generation

       #define MAX_WIDTH 1920
       #define MAX_HEIGHT 1080
       void my_image (…, int Rows, int Cols) {
         …
         assert(Rows<=MAX_HEIGHT);
         assert(Cols<=MAX_WIDTH);
         ROW_LOOP: for(j=0; j<Rows; j++) {
           COL_LOOP: for(i=0; i<Cols; i++) {
             …
           }
         }
       }

  11. HLS Lessons Learned: Use streaming data in and out of an HLS IP • Use DMAs to move data on/off chip • Let HLS implement AXI4 interfaces

       void my_filter(hls::stream<ap_axiu<16,1,1,1> >& In,
                      hls::stream<ap_axiu<16,1,1,1> >& Out,
                      int Rows, int Cols) {
         // Specify AXI4-Stream connections
         #pragma HLS INTERFACE axis port=In bundle=INPUT_STREAM
         #pragma HLS INTERFACE axis port=Out bundle=OUTPUT_STREAM
         // Group all other ports into an AXI4-Lite interface
         #pragma HLS INTERFACE s_axilite register port=Rows bundle=Ctrl
         #pragma HLS INTERFACE s_axilite register port=Cols bundle=Ctrl
         #pragma HLS INTERFACE s_axilite port=return bundle=Ctrl
         …
       }

  12. HLS Lessons Learned: Specify bit widths for variables • Sizing variables defines limits for HLS synthesis • Support for signed and unsigned data • ap_[u]int<N> for integers • N specifies the number of bits • ap_[u]fixed<W,I,Q,O,N> for fixed-point • W is the total number of bits, I the number of integer bits (which fixes the position of the binary point), Q the quantization mode, O the overflow behavior, and N the number of saturation bits used in wrap modes • HLS aligns the binary point during calculations
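
The effect of the W and I parameters can be sketched without the Vivado HLS headers. The plain C++ function below is an emulation written for illustration, not the `ap_fixed` implementation: it maps a value onto a signed fixed-point grid with W total bits and I integer bits, truncating toward minus infinity (the behavior I understand `AP_TRN` to have) and saturating on overflow (as with `AP_SAT`, used here in place of wrap-around to keep the sketch short). The name `quantize_fixed` is hypothetical.

```cpp
#include <cmath>

// Emulate storing x in a signed fixed-point format with W total bits and
// I integer bits (sign bit included). Assumes 0 < I <= W and W small
// enough that all raw codes are exactly representable in a double.
double quantize_fixed(double x, int W, int I) {
    const double scale = std::ldexp(1.0, W - I);          // 2^(fractional bits)
    const double max_raw = std::ldexp(1.0, W - 1) - 1.0;  // largest raw code
    const double min_raw = -std::ldexp(1.0, W - 1);       // smallest raw code
    double raw = std::floor(x * scale);                   // truncate toward -infinity
    if (raw > max_raw) raw = max_raw;                     // saturate high
    if (raw < min_raw) raw = min_raw;                     // saturate low
    return raw / scale;
}
```

With W = 8 and I = 4 the step size is 1/16, so `quantize_fixed(1.30, 8, 4)` returns 1.25 and `quantize_fixed(100.0, 8, 4)` saturates at 7.9375. Choosing W and I therefore fixes both the resolution and the range that synthesis has to support.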

  13. HLS Lessons Learned: Using "int" vs. "ap_int" for multiplication

       void my_int_mult (int in1,
                         int in2,
                         int &out) {
         out = in1 * in2;
       }

       void my_ap_mult (ap_int<10> in1,
                        ap_int<10> in2,
                        ap_int<20> &out) {
         out = in1 * in2;
       }

  14. HLS Lessons Learned: Using "int" vs. "ap_int" for summation

       void int_sum (int in1, int in2,
                     int in3, int in4,
                     int &out) {
         int temp1,temp2;
         temp1 = in1 + in2;
         temp2 = in3 + in4;
         out = temp1 + temp2;
       }

       void ap_sum (ap_int<8> in1, ap_int<9> in2,
                    ap_int<10> in3, ap_int<11> in4,
                    ap_int<13> &out) {
         int temp1, temp2;
         temp1 = in1 + in2;
         temp2 = in3 + in4;
         out = temp1 + temp2;
       }
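
The output widths in the two ap_int examples (20 bits for the product, 13 bits for the sum) follow the standard bit-growth rules for signed integers: a product needs the sum of the operand widths, and a sum needs one bit more than the wider operand. The helper names below are hypothetical and the sketch is plain C++ that only checks the arithmetic; it does not use the HLS types.

```cpp
// Bits needed to hold the sum of a signed a-bit and a signed b-bit value
// without overflow.
constexpr int sum_width(int a, int b) {
    return (a > b ? a : b) + 1;
}

// Bits needed to hold the product of a signed a-bit and a signed b-bit value.
constexpr int product_width(int a, int b) {
    return a + b;
}

// my_ap_mult: ap_int<10> * ap_int<10> fits in ap_int<20>.
static_assert(product_width(10, 10) == 20, "10x10-bit product needs 20 bits");

// ap_sum: (8-bit + 9-bit) + (10-bit + 11-bit) fits in ap_int<13>.
static_assert(sum_width(sum_width(8, 9), sum_width(10, 11)) == 13,
              "four-input sum needs 13 bits");
```

By comparison, the `int` versions describe 32-bit operands and results regardless of the actual data range, which is the contrast the two slides draw.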

  15. HLS Lessons Learned: Small code changes can affect synthesis

       void my_code1 (…, ap_uint<9> &out) {
         …
         out = 0;
         MY_LOOP1 : for(i=0; i<127; i++) {
           #pragma HLS PIPELINE II=2
           sum = my_array[i] + my_array[i+1];
           if (sum > thresh) { out++; }
         }
         …

       void my_code2 (…, ap_uint<9> &out) {
         …
         ap_uint<9> cnt = 0;
         MY_LOOP2 : for(i=0; i<127; i++) {
           #pragma HLS PIPELINE II=2
           sum = my_array[i] + my_array[i+1];
           if (sum > thresh) { cnt++; }
         }
         out = cnt;
         …

  16. Summary: Achieving optimal results from Vivado HLS often requires tuning of the C/C++ code • Addition of HLS directives • Use of optimized libraries • Re-architecting to use a modular approach • Use of HLS-defined datatypes – ap_int and ap_fixed • Reorganizing code can affect synthesis results

  17. About DesignLinx • Veteran Owned Business Offering FPGA Design & Support Services • On-Demand Onsite FPGA Support • Support for small and large-scale projects, enabling increased bandwidth for your team • Xilinx Certified Alliance Program Member and Fidus Systems Partner • Senior design team with expertise in: • FPGA design • Verilog, VHDL and HLS • IP integration • Video Processing, DSP • DDR3/4, LPDDR2, high-speed transceivers, and more • Embedded hardware/software • SMP, AMP, Linux, vxWorks, FreeRTOS, and more • C, C++ • ASIC-level verification using ModelSim and System Verilog

  18. About Fidus • High-speed, high-complexity design • High-speed communications, high-resolution video, high-performance computing • Original products that enable Xilinx IP and custom services • Xilinx Premier Design Services member • Senior team with expertise in: • Hardware design, including digital, RF, analog, and PCB layout • FPGA design, including IP integration, signal processing, DDR3/4, high-speed transceivers, and more • Embedded software, including Zynq/MPSoC, and MicroBlaze • Signal integrity (board and system level)

  19. Questions? Ted Booth DesignLinx Hardware Solutions tbooth@designlinxhs.com
