Adapting a SDR environment to GPU architectures
SDR’11 - WinnComm - Europe, 06/22/2011 - 06/24/2011
Pierre-Henri Horrein, Frédéric Pétrot (TIMA), Christine Hennebert
Contents
1 Context and aim
2 Approaches
3 Results
4 Conclusion
1 Context and aim
OpenCL architecture
• Centralized management on the host
• SIMD architecture: the same kernels are applied to large vectors
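The execution model summarized on this slide can be sketched on the host in plain Python. This is only an emulation under stated assumptions: `scale_kernel` and `enqueue_nd_range` are hypothetical names chosen for illustration, not OpenCL API calls, and the loop stands in for work-items that a real device would run in parallel.

```python
def scale_kernel(gid, src, dst, gain):
    """Work done by one work-item: process the element at its global id."""
    dst[gid] = gain * src[gid]


def enqueue_nd_range(kernel, global_size, *args):
    """Hypothetical host-side stand-in for a kernel launch.

    The same kernel is called once per global id. On a real device
    these calls would be spread over the processing elements.
    """
    for gid in range(global_size):
        kernel(gid, *args)


# A small vector of complex samples (illustrative values only).
samples = [complex(i, -i) for i in range(8)]
scaled = [0j] * len(samples)

# The host decides what runs and on how many elements.
enqueue_nd_range(scale_kernel, len(samples), samples, scaled, 0.5)
```

The host keeps control of the launch, while the kernel itself only knows its own index. That is the sense in which management is centralized and the same kernel is applied across a large vector.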
GNURadio
SDR framework
Provides:
• a large set of SDR basic operations
• runtime management of the operations
• I/O integration (Ettus Research, audio, ...)
[Figure: graph of tasks (Task 0 to Task 5) between IQ samples and applications]
Aim
[Figure: a host linked by an open question ("??") to two compute devices (Comp. Dev.), each made of compute units (CU) that contain processing elements (PE)]
2 Approaches
Straightforward approach
• Use the GPU as a single, very efficient CPU
• Per-block optimization
• Efficient for some operations on very large data sets
Mapping to GPU: parallelism
• Use each PE as a small CPU
• Apply an optimized sequential operation to each data set
• Launch the operation on multiple data sets
• Efficient for streaming applications, but requires more memory
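A minimal sketch of this mapping, emulated on the host in Python. Everything here is an assumption made for illustration: `running_sum` is a stand-in for an optimized sequential operation, `launch_on_pes` and `split` are hypothetical helpers, and the loop over `pe_id` replaces PEs that would run concurrently on a GPU.

```python
def running_sum(block):
    """Stand-in sequential operation: each output depends on the previous one."""
    total, out = 0, []
    for x in block:
        total += x
        out.append(total)
    return out


def split(stream, block_size):
    """Cut a sample stream into independent data sets of block_size samples."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def launch_on_pes(op, data_sets):
    """Emulated launch: PE number pe_id runs the whole operation on its own data set."""
    results = [None] * len(data_sets)
    for pe_id, block in enumerate(data_sets):
        results[pe_id] = op(block)
    return results


stream = list(range(1, 13))
blocks = split(stream, 4)            # three data sets, all held at the same time
per_pe = launch_on_pes(running_sum, blocks)
```

All data sets have to be available before the launch, so the buffers grow with the number of data sets processed together. This is consistent with the slide's remark that the approach requires more memory.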
3 Results
Test platform and method
Test platform
• Intel Core i5 760 CPU (4 cores, 2.8 GHz, 8 MB cache)
• 4 GB DDR3 memory
• Linux 2.6.36 kernel
• NVidia GTS 450 GPU, Asus DirectCU card, 1 GB GDDR5 memory
Method
• 3 single operations: FFT, IIR, Mapping
• Sequences of operations
FFT
• Straightforward solution inefficient on the considered vector sizes
• Small gain for the GPU solution
• Data transfer reduces performance
• GPU monitoring:
  • 10% for the straightforward solution
  • 98% for the parallel solution
[Figure: time (ms) versus N, for N from 5 to 13]
IIR
• No optimized algorithm for the straightforward solution
• ∼50% gain for the GPU solution
• High block size requires more memory
[Figure: time (ms) versus N, for N from 5 to 13]
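One way to see why an IIR filter resists per-block optimization is the recursion itself. The sketch below assumes a first-order filter, y[n] = b0·x[n] + a1·y[n-1], with illustrative coefficients. Neither the filter order nor the coefficients come from the slides, and treating each block as an independent signal with a zero initial state is also an assumption.

```python
def iir_first_order(x, b0, a1, y_prev=0.0):
    """First-order IIR: y[n] = b0 * x[n] + a1 * y[n-1].

    y_prev is carried from one iteration to the next, so the samples
    of a single block have to be processed in order.
    """
    y = []
    for sample in x:
        y_prev = b0 * sample + a1 * y_prev
        y.append(y_prev)
    return y


# Two independent data sets (illustrative impulse inputs).
blocks = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

# Each block could be handed to a different PE. The list comprehension
# stands in for that concurrent launch.
filtered = [iir_first_order(block, b0=1.0, a1=0.5) for block in blocks]
```

Inside a block the loop is strictly sequential, while separate blocks do not depend on each other in this sketch, which is where the parallelism would come from.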
Demapping
• No need for high processing power → a GPU core is sufficient
• Very efficient on GPU, even for large data sets
[Figure: time (ms) versus N, for N from 5 to 13]
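To illustrate how little work demapping needs per sample, here is a hedged sketch of a hard-decision demapper. The slides do not name the constellation. QPSK with the bit convention below is an assumption, and `qpsk_hard_demap` is a hypothetical name.

```python
def qpsk_hard_demap(symbol):
    """Hard decision for one symbol.

    Assumed convention: the sign of the in-phase part gives the first
    bit and the sign of the quadrature part gives the second bit.
    """
    return (1 if symbol.real < 0 else 0, 1 if symbol.imag < 0 else 0)


# Noisy received symbols, one per quadrant (illustrative values).
symbols = [0.9 + 1.1j, -1.0 + 0.8j, -0.7 - 1.2j, 1.1 - 0.9j]

# Every symbol is decided independently of the others.
bits = [qpsk_hard_demap(s) for s in symbols]
```

Each decision is two comparisons and uses no neighbouring samples, so one symbol per simple core is a natural fit for this kind of operation.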
Multitasking
• No multitasking on GPU: sequential execution
• Issue on buffer management reduces performance
• 20% gain for 4 tasks for size 1024
[Figure: time (ms) versus N, for N from 5 to 13]
4 Conclusion
Conclusion and perspectives
Contributions
• Study of two possible solutions for GPU integration:
  • an existing solution, with disappointing results
  • a new solution for streaming applications, with promising performance
Perspectives
• Resolve the buffer management issue
• Experiment in a real radio application