SLIDE 1 Embedded vision with FPGA vs CUDA processing. Directions and platform proposal
WASC 2014 20 June 2014
ariasmo@inaoep.mx
Reconfigurable and High Performance Computing Lab INAOE – Puebla, Mexico
SLIDE 2
Content
1. Introduction
2. Previous work on FPGA architectures
3. FPGA cameras
4. Platform proposal: FPGA vs CUDA
5. Long term project
6. Conclusions
SLIDE 3 Reconfigurable and High Performance Computing Laboratory
- Computer Science Department
  - 4 researchers
  - 10+ M.Sc. students
  - 5+ Ph.D. students
- Active since 1998
- Research on:
  - Real-time computer vision
  - Cryptography and ciphers
  - Hardware signal processing
SLIDE 4
Smart camera approach
- High-performance low-level vision computing at the camera
- 3D vision, tracking / surveillance applications
SLIDE 5
Previous work on FPGA architectures
- Edge / corner detection
- Stereo disparity
- Target tracking
- Motion correlation and optical flow
- 3D from optical flow
- SIFT / SURF / LISF feature detection
SLIDE 6 Approach
- Off-the-shelf development boards
- Focus on FPGA architecture; the application can be built in parallel
- Goal: reach video-rate processing (i.e. 30 fps)
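To give a sense of what the 30 fps goal implies, the sketch below computes the sustained pixel rate and the time budget per pixel. The 640x480 resolution and the one-pixel-at-a-time pipeline are assumptions made for illustration; the slides do not state a frame size.

```python
def pixel_rate(width, height, fps):
    """Pixels per second that must be processed to sustain `fps`."""
    return width * height * fps


def ns_per_pixel(width, height, fps):
    """Time budget per pixel, in nanoseconds, if pixels are handled one at a time."""
    return 1e9 / pixel_rate(width, height, fps)


# Hypothetical VGA stream at the 30 fps goal.
print(pixel_rate(640, 480, 30))              # 9216000
print(round(ns_per_pixel(640, 480, 30), 1))  # 108.5
```

Under these assumptions, a pipeline that accepts one pixel per clock cycle needs a clock of only about 9.2 MHz to keep up with the stream.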
SLIDE 7
Edge and corner detection
- Industrial applications
- Basis for other image processing applications
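As a software reference for this kind of low-level operator, the sketch below computes a Sobel gradient magnitude on a small image. The Sobel operator is used here only as a common example; the slides do not say which edge or corner operator the FPGA architecture implements, and the function name is hypothetical.

```python
def sobel_magnitude(img):
    """Return |Gx| + |Gy| per interior pixel of a 2D list of ints (border stays 0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


# Vertical step edge: two dark columns followed by two bright columns.
img = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(img)
print(edges[1])  # [0, 40, 40, 0]
```

Each output pixel depends only on its 3x3 neighborhood, which is why this class of operator maps well onto a pipelined or parallel hardware architecture.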
SLIDE 8 Edge and corner detection architecture
[Figure: block diagram of the edge and corner detection architecture. Labeled blocks: arrays of COR and N processing elements (numbered 1 to 6), RAM 1, RAM 2, address generator, decoder, MUX, POST]
SLIDE 9
Demonstration with RC200
SLIDE 10 Target processor
FPGA implementation
[Figure labels: register (header), pattern window, search window, correlation processor]
Target tracking
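The pattern window / search window arrangement can be sketched in software as exhaustive template matching. The example below uses the sum of absolute differences (SAD) as the similarity measure; this is a stand-in chosen for simplicity, since the slides do not specify the correlation measure used by the correlation processor. All names are hypothetical.

```python
def sad(search, pattern, oy, ox):
    """Sum of absolute differences between the pattern and the search patch at (oy, ox)."""
    return sum(abs(search[oy + y][ox + x] - pattern[y][x])
               for y in range(len(pattern))
               for x in range(len(pattern[0])))


def best_match(search, pattern):
    """Return the (row, col) offset inside the search window with the lowest SAD."""
    ph, pw = len(pattern), len(pattern[0])
    sh, sw = len(search), len(search[0])
    candidates = ((sad(search, pattern, oy, ox), (oy, ox))
                  for oy in range(sh - ph + 1)
                  for ox in range(sw - pw + 1))
    # min() compares scores first; ties fall back to the smallest offset.
    return min(candidates)[1]


pattern = [[9, 8],
           [7, 6]]
search = [[0, 0, 0, 0],
          [0, 0, 9, 8],
          [0, 0, 7, 6],
          [0, 0, 0, 0]]
print(best_match(search, pattern))  # (1, 2)
```

Every candidate offset is scored independently, so the scores can be computed in parallel, which is the property a hardware correlation processor exploits.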
SLIDE 11
Multiple object tracking
SLIDE 12
Performance gain
Algorithm acceleration
25x to 50x compared to a PC
Drawbacks
- Modularity and reuse
- Lack of standards for vision cores
SLIDE 13
Overview of :
- Early concept
- Current approaches
SLIDE 14 Smart camera architecture
Smart camera:
- Imager L, Imager R: one or two (stereo) imagers
- FPGA: reconfigurable processor
  - Soft processor (ctrl)
  - Parallel processor
  - I/O and device interfaces
- Memory: image and intermediate data buffer
- Comm: high-bandwidth channel (Ethernet or USB 2.0)
Host computer: high-level processor (PC or robot CPU)
SLIDE 15
FPGA camera
SLIDE 16
Custom Spartan6 development board
SLIDE 17
FPGA camera – 2012/2013
- USB 3.0
- 5 megapixels
- Spartan-6 device
- FPGA for sensor control and data packaging
- Room in the FPGA for additional processing
SLIDE 18
FPGA camera prototype
SLIDE 19
Current work
- FPGA/ARM platform + camera
- Tegra K1 platform + camera
SLIDE 20
4.1 FPGA based Proposal
- Use of a SoC (System on a Chip)
- FPGA + ARM processor + embedded Linux
- Xilinx Zynq-7000 + support electronics
- Reconfiguration + I/O flexibility
SLIDE 21
SLIDE 22
FPGA platform
SLIDE 23 FPGA Platform :: MicroZed
- Xilinx XC7Z010
- USB 2.0
- Gbit Ethernet
- 1 GB DDR3 SDRAM
- 128 Mb Flash
- Micro SD card
- 100 I/O
- Embedded Linux
SLIDE 24
4.2 CUDA platform
SLIDE 25
CPU vs GPU
[Figure: CPU and GPU block diagrams, each with its own DRAM. Labeled blocks: Control, Cache, ALU]
SLIDE 26
CUDA programming
SLIDE 27 FPGA vs Embedded CUDA
Advantages
FPGA:
- Low power
- High performance
- Small size, possible to migrate to VLSI
CUDA:
- Easy to program
- Speedup
- Floating point
Drawbacks
FPGA:
- ... implement
- Long to learn
CUDA:
- Architecture complexity vs speedup
- ... parallel: core + memory use
SLIDE 28
- Image + feature extraction in the camera
- Form descriptor extraction at the camera level (best for CUDA programming)
- Host computer or cloud for high-level cognitive modeling / big data techniques
- A network of cameras can open new research possibilities
SLIDE 29 PARTIAL CORRESPONDENCE OF FORM
- Object recognition using form
- From a given object model, select a subset of corresponding edge segments.
Object classification in images using form features
SLIDE 30 PARTIAL CORRESPONDENCE OF FORM
CHALLENGE
Part of the contour can be incorrectly connected to the background or to another object, producing a wrong edge to be matched.
SLIDE 31
OCTAR FORM DESCRIPTOR
- Open contour
- Self-contained
- Rotation and translation invariant
SLIDE 32
PARTIAL CORRESPONDENCE OF FORM
FORM DESCRIPTOR
SLIDE 33
PARTIAL CORRESPONDENCE OF FORM
OBJECT LOCATION
correspondence vote for the center of the
SLIDE 34
OBJECT LOCATION
be part of the object
PARTIAL CORRESPONDENCE OF FORM
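The idea of correspondences voting for an object center can be illustrated with a generic Hough-style accumulator. This is only a sketch of the general voting technique, not the method from the slides, whose details are not given here. The assumption is that each matched segment carries a stored offset from its position to the model center; the names and data are hypothetical.

```python
from collections import Counter


def vote_for_center(matches):
    """Each match ((x, y), (dx, dy)) casts one vote for the center (x + dx, y + dy).

    Returns the most voted center and its vote count.
    """
    votes = Counter((x + dx, y + dy) for (x, y), (dx, dy) in matches)
    center, count = votes.most_common(1)[0]
    return center, count


# Hypothetical matches: segment position and its offset to the model center.
matches = [
    ((10, 10), (5, 5)),    # votes for (15, 15)
    ((20, 15), (-5, 0)),   # votes for (15, 15)
    ((15, 25), (0, -10)),  # votes for (15, 15)
    ((40, 40), (3, 3)),    # outlier, votes for (43, 43)
]
print(vote_for_center(matches))  # ((15, 15), 3)
```

Matches whose votes agree reinforce one location, while a wrong correspondence, such as an edge joined to the background, lands in a cell with few votes.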
SLIDE 35
- FPGA-based processing for low-level feature extraction
- Form descriptors and medium-level processing are better suited to a CUDA-based platform
- Potential to combine networks of cameras with embedded vision processing and cloud computing
SLIDE 36
Miguel Arias – Computer Sc. Dept. ariasmo@inaoep.mx
Laboratorio de Cómputo Reconfigurable y de Alto Desempeño