HPEC-2007 DUKE MIT-LL
FFTs of Arbitrary Dimensions on GPUs
Xiaobai Sun and Nikos Pitsianis
Duke University
September 19, 2007
At High Performance Embedded Computing 2007, MIT-LL
Overview
• Motivation
  – FFTs of arbitrary dimensions and their applications
  – Graphics processing units (GPUs)
• Basic facts on dimensionality
• FFTs on GPUs
  1. 2D FFT is chosen as the primitive one at API level
  2. 2D FFT performance is conveyed to FFTs of other dimensions
• Experimental results
• Discussion of related issues and works
Motivation: FFT Applications
[Figure: processing block diagram with blocks labeled Motion Comp, Polar Format, NUFFT, Spatially Variant Refocus, Auto-Focus, Remove Keystone]
From S. Bellofiore and H. Schmitt at
Polar Format
[Figure: polar-format geometry with labels 2-D FFT, Range, PRFs, Synthetic Aperture, Samples, Slant Plane, Ground Plane, CRP, Δθ; sample counts 1800 and 2048; ratio labels 1:1 Range and 1.25:1 Azimuth]
Image degradation by traditional interpolation prior to 2-D FFT
  – Loss of resolution
  – Loss of data
From S. Bellofiore and H. Schmitt at
Motivation: GPU Architecture
GPU: Graphics Processing Unit
• Highly parallel multi-processors
• Affordable commodity product
• Initially dedicated to graphics processing and rendering
• Presently capable of co-processing on desktops and laptops
• Increasing programmability and API support
• Image processing & rendering
• GP-GPU
Basic Facts on Dimensionality
1. In mathematics, FFTs are considered dimensionless in the sense that the factorizations can be described in a unified, recursive representation, given scaling factors, some of which are dimension dependent. In computation, trivial scaling may be skipped.
2. In applications, it is often required that phase-frequency information, or spatial and geometric relations, be provided explicitly at input and output. FFT data are not shapeless.
3. In architecture, extra dimensions are induced by the data access patterns that are most efficiently supported. FFT data are transfigured at different memory levels. GPUs support fine-granularity 2D access to memory frames at the API level.
2D FFT as the Primitive
• 2D FFT
• 2D data placement
  – Two complex numbers per pixel vector (4 floating point numbers): one at the front, one at the back
  – Even columns at the front layers, odd columns at the back layers
• 2D array operations throughout
  – making best use of the architectural support for 2D data access at the API level
  – Radix-2, radix-3 and mixed radices
• Direct 2D bit-reversal
  – Up to a certain sub-array size
  – 2D data partitioning in large data arrays
[Figure labels: Re(X), Im(X)]
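The data placement above can be illustrated with a small sketch. It assumes that "front" and "back" refer to the first two and the last two of the four floats in a pixel, and that pixel k of a row holds columns 2k and 2k+1. The slide does not spell out the exact layout, so both the layout and the function names `pack_pixels` and `unpack_pixels` are hypothetical.

```python
def pack_pixels(x):
    """Pack a 2D complex array (list of rows) into 4-float pixels.

    Assumed layout (not confirmed by the slides): pixel (i, k) stores
    column 2k in its first two floats ("front") and column 2k+1 in its
    last two floats ("back"), each as (real, imaginary).
    """
    cols = len(x[0])
    if cols % 2:
        raise ValueError("number of columns must be even")
    return [
        [
            (row[2 * k].real, row[2 * k].imag,
             row[2 * k + 1].real, row[2 * k + 1].imag)
            for k in range(cols // 2)
        ]
        for row in x
    ]


def unpack_pixels(pixels):
    """Inverse of pack_pixels: recover the 2D complex array."""
    out = []
    for pixel_row in pixels:
        row = []
        for re_even, im_even, re_odd, im_odd in pixel_row:
            row.append(complex(re_even, im_even))  # even column
            row.append(complex(re_odd, im_odd))    # odd column
        out.append(row)
    return out
```

Under this assumption a complex array with 2K columns occupies a frame that is only 1K pixels wide, and every pixel fetch delivers two complex values.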
Radix 2, Radix 3 and Mixed Radices
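The content of this slide did not survive extraction. As a rough illustration of what a mixed radix-2 and radix-3 factorization looks like, here is a minimal recursive decimation-in-time sketch on the CPU. It is a textbook formulation and is not the authors' GPU implementation; the function name `fft_mixed` is hypothetical.

```python
import cmath


def fft_mixed(x):
    """Recursive decimation-in-time FFT for lengths of the form 2^a * 3^b."""
    n = len(x)
    if n == 1:
        return list(x)
    # Pick the radix for this stage: 2 if possible, otherwise 3.
    for p in (2, 3):
        if n % p == 0:
            break
    else:
        raise ValueError("length must be of the form 2^a * 3^b")
    m = n // p
    # Transform the p interleaved subsequences of length m.
    subs = [fft_mixed(x[r::p]) for r in range(p)]
    out = [0j] * n
    for k in range(m):
        for q in range(p):
            kk = k + q * m
            # Radix-p butterfly: combine sub-results with twiddle factors.
            out[kk] = sum(
                subs[r][k] * cmath.exp(-2j * cmath.pi * r * kk / n)
                for r in range(p)
            )
    return out
```

For example, a length-12 input passes through two radix-2 stages and one radix-3 stage.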
Direct Two Dimensional Bit Reversal
$X(R_m(i), R_n(j)) \leftrightarrow X(i, j)$
• Not one dimension after another
• No recursion up to a certain frame block size (low bits)
• For large data sizes, block swaps (bit reversal in high bits)
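A minimal CPU sketch of the direct permutation follows, assuming R_m and R_n denote reversal of the m-bit row index and the n-bit column index. Each element is moved once, with both indices reversed together, rather than permuting rows first and columns second. The function names are hypothetical, and the block-swap treatment of large arrays is not shown.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def bit_reverse_2d(x):
    """Apply bit reversal to both indices of a 2^m-by-2^n array in one pass."""
    rows, cols = len(x), len(x[0])
    m = rows.bit_length() - 1
    n = cols.bit_length() - 1
    if (1 << m) != rows or (1 << n) != cols:
        raise ValueError("both dimensions must be powers of two")
    y = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        ri = bit_reverse(i, m)
        for j in range(cols):
            # One move per element: both indices are reversed together.
            y[ri][bit_reverse(j, n)] = x[i][j]
    return y
```

Because bit reversal is its own inverse, applying `bit_reverse_2d` twice returns the original array.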
2D Bit Reversal
[Figure: arrays with side labels 2K; timings shown: 19.6 ms and 30.4 ms]
2D Bit Reversal
[Figure: arrays with side labels 2K and 1K; timings shown: 17.5 ms, 9.8 ms and 14.8 ms]
2D FFT Times
[Figure: time in msec (axis 0 to 400) versus log2 of data volume (18 to 23); series: Arithmetic, Bit Reversal]
Total 2D FFT Times
[Figure: time in msec (axis 0 to 600) versus log2 of data volume (18 to 23); series: Write to GPU, Arithmetic, Bit Reversal, Read from GPU]
FFTs of Other Dimensions
• 1D FFT of size n
  – Add a scaling stage in 2D FFT
• 3D FFT of dimensions (in a simple case)
  – Skip a scaling stage in 2D FFT
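The 1D case can be sketched with the standard Cooley-Tukey index splitting, assuming n = n1 * n2 (the slide's own formulas were lost in extraction, so this factorization is an assumption). The input is viewed as an n1-by-n2 array, transformed along rows, scaled by twiddle factors, then transformed along columns. Without the scaling loop, the same steps compute a plain 2D DFT of the array. The naive `dft` helper stands in for an FFT kernel, and all names are hypothetical.

```python
import cmath


def dft(v):
    """Naive O(n^2) DFT, used here as a stand-in for an FFT kernel."""
    n = len(v)
    return [
        sum(v[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def fft1d_via_2d(x, n1, n2):
    """1D DFT of length n1*n2 as row DFTs, a scaling stage, and column DFTs."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # View the input as an n1-by-n2 array: a[j1][j2] = x[j1 + n1*j2].
    a = [[x[j1 + n1 * j2] for j2 in range(n2)] for j1 in range(n1)]
    # Stage 1: length-n2 transforms along each row.
    b = [dft(row) for row in a]
    # Stage 2: the added scaling stage (twiddle factors w_n^(j1*k2)).
    for j1 in range(n1):
        for k2 in range(n2):
            b[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)
    # Stage 3: length-n1 transforms along each column.
    cols = [dft([b[j1][k2] for j1 in range(n1)]) for k2 in range(n2)]
    # Output ordering: X[n2*k1 + k2] = cols[k2][k1].
    return [cols[k % n2][k // n2] for k in range(n)]
```

Usage: `fft1d_via_2d(x, 3, 4)` and `dft(x)` agree up to rounding for any length-12 input `x`.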
1D, 2D and 3D FFTs
[Figure: compute time in msec (axis 0 to 450) versus log2 of data volume (18 to 23); series: 1-D FFT, 2-D FFT, 3-D FFT]
Other Issues and Works
• Twiddle factors:
  – Pre-calculated, partially calculated, calculated on the fly
  – Numerical behavior
• Data loading and unloading
  – Data placement in main memory
  – A sequence of successive FFTs
• Automated tuning
• Other commodity products
  – IBM Cell
• FPGAs
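To make the twiddle-factor trade-off concrete, here is a small sketch comparing two of the strategies named above: a pre-calculated table, where each factor is evaluated directly, and an on-the-fly recurrence that multiplies repeatedly by the primitive root. The slides do not describe how either was implemented, so the functions below are illustrative assumptions.

```python
import cmath


def twiddles_table(n):
    """Pre-calculated: evaluate each factor exp(-2*pi*i*k/n) directly."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


def twiddles_recurrence(n):
    """On the fly: repeated multiplication by the primitive n-th root."""
    w = cmath.exp(-2j * cmath.pi / n)
    out = [1 + 0j]
    for _ in range(n - 1):
        out.append(out[-1] * w)  # rounding error accumulates with k
    return out


def max_error(n):
    """Largest deviation of the recurrence from the direct evaluation."""
    table = twiddles_table(n)
    rec = twiddles_recurrence(n)
    return max(abs(a - b) for a, b in zip(table, rec))
```

The recurrence avoids storing the table and avoids trigonometric evaluations, but its rounding error grows with the index. In double precision the drift is small for moderate n, and it would be correspondingly larger in single precision.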