Discrete Fourier Transform Convolution Fast Fourier Transform Parallel FFT Parallel Numerical Algorithms Chapter 6 – Matrix Models Section 6.1 – Fast Fourier Transform Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 1 / 34
Discrete Fourier Transform Convolution Fast Fourier Transform Parallel FFT Outline Discrete Fourier Transform 1 Roots of Unity DFT Inverse DFT Convolution 2 Problem Fast Fourier Transform 3 Computing DFT FFT Algorithm Parallel FFT 4 Binary Exchange Parallel FFT Transpose Parallel FFT Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 2 / 34
Discrete Fourier Transform Roots of Unity Convolution DFT Fast Fourier Transform Inverse DFT Parallel FFT Roots of Unity For given integer n , we use notation ω n = cos(2 π/n ) − i sin(2 π/n ) = e − 2 πi/n for primitive n th root of unity, where i = √− 1 n th roots of unity, sometimes called twiddle factors in this context, are then given by ω k n or by ω − k n , k = 0 , . . . , n − 1 For convenience, we will assume that n is power of two, and all logarithms used will be base two We will also index sequences (components of vectors) starting from 0 rather than 1 Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 3 / 34
Discrete Fourier Transform Roots of Unity Convolution DFT Fast Fourier Transform Inverse DFT Parallel FFT Discrete Fourier Transform Discrete Fourier Transform , or DFT , of sequence x = [ x 0 , . . . , x n − 1 ] T is sequence y = [ y 0 , . . . , y n − 1 ] T given by n − 1 � x k ω mk y m = n , m = 0 , 1 , . . . , n − 1 k =0 or y = F n x where entries of DFT matrix F n are given by { F n } mk = ω mk n Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 4 / 34
Discrete Fourier Transform Roots of Unity Convolution DFT Fast Fourier Transform Inverse DFT Parallel FFT Inverse DFT It is easily seen that F − 1 = (1 /n ) F H n n So inverse DFT is given by n − 1 x k = 1 � y m ω − mk k = 0 , 1 , . . . , n − 1 n n m =0 Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 5 / 34
Discrete Fourier Transform Roots of Unity Convolution DFT Fast Fourier Transform Inverse DFT Parallel FFT Example 1 1 1 1 1 1 1 1 ω 1 ω 2 ω 3 1 1 − i − 1 i F 4 = = ω 2 ω 4 ω 6 1 1 − 1 1 − 1 ω 3 ω 6 ω 9 1 1 i − 1 − i 1 1 1 1 1 1 1 1 ω − 1 ω − 2 ω − 3 1 1 i − 1 − i 4 F − 1 = = ω − 2 ω − 4 ω − 6 4 1 1 − 1 1 − 1 ω − 3 ω − 6 ω − 9 1 1 − i − 1 i Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 6 / 34
Discrete Fourier Transform Convolution Problem Fast Fourier Transform Parallel FFT Convolution Convolution takes input a and b and computes c k � ∀ k ∈ [0 , n − 1] c k = a j b k − j j =0 If a and b are coefficients of degree n/ 2 − 1 polynomials n/ 2 − 1 n/ 2 − 1 � � a k x k , b k x k p a ( x ) = p b ( x ) = k =0 k =0 the convolution computes the coefficients c of the product n − 1 � c k x k p c ( x ) = p a ( x ) p b ( x ) = k =0 naive evaluation costs O ( n 2 ) operations Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 7 / 34
Discrete Fourier Transform Convolution Problem Fast Fourier Transform Parallel FFT Convolution and Toeplitz Matrices Convolution can be interpreted as matrix-vector multiplication with a triangular Toeplitz matrix b 0 b 1 b 2 b 3 0 b 0 b 1 b 2 [ c 0 c 1 c 2 c 3 ] = [ a 1 a 2 a 3 a 4 ] 0 0 b 0 b 1 0 0 0 b 0 Toeplitz and Hankel matrices (in the latter, each antidiagonal is defined by a single element) provide a general matrix representation for convolutional operators Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 8 / 34
Discrete Fourier Transform Convolution Problem Fast Fourier Transform Parallel FFT Convolution via Interpolation by DFT The DFT, F n a evaluates polynomial p a at each ω j The values of p c at each ω j are then easily obtained p c ( ω j ) = p a ( ω j ) p b ( ω j ) The inverse DFT, F − 1 n p c ( x ) interpolates the values of the polynomial p c at each ω j producing its coefficients c The overall procedure is described by c = F − 1 n [( F n a ) ⊙ ( F n b )] where ⊙ is an elementwise product ( a and b are padded with trailing zeros) Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 9 / 34
Discrete Fourier Transform Convolution Problem Fast Fourier Transform Parallel FFT Convolution via DFT Lets write out the full expression c k = 1 � � �� � � � ω − ks ω sj ω st n a j n b t n n s j t Rearrange the order of the summations to see what happens to every product of a and b c k = 1 � � � ω ( j + t − k ) s a j b t n n s j t For any u = j + t − k � = 0 , we observe � n ) s = 0 s ( ω u When j + t − k = 0 the products ω ( s + t − j ) k = 1 , so there are n n nonzero terms a j b k − j in the summation Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 10 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT To illustrate, consider computing DFT for n = 4 , 3 � x k ω mk y m = n , m = 0 , . . . , 3 k =0 Writing out equations in full, x 0 ω 0 n + x 1 ω 0 n + x 2 ω 0 n + x 3 ω 0 y 0 = n x 0 ω 0 n + x 1 ω 1 n + x 2 ω 2 n + x 3 ω 3 y 1 = n x 0 ω 0 n + x 1 ω 2 n + x 2 ω 4 n + x 3 ω 6 y 2 = n x 0 ω 0 n + x 1 ω 3 n + x 2 ω 6 n + x 3 ω 9 y 3 = n Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 11 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT Noting that ω 0 n = ω 4 ω 2 n = ω 6 ω 9 n = ω 1 n = 1 , n = − 1 , n and regrouping, we obtain ( x 0 + ω 0 n x 2 ) + ω 0 n ( x 1 + ω 0 y 0 = n x 3 ) ( x 0 − ω 0 n x 2 ) + ω 1 n ( x 1 − ω 0 y 1 = n x 3 ) ( x 0 + ω 0 n x 2 ) + ω 2 n ( x 1 + ω 0 y 2 = n x 3 ) ( x 0 − ω 0 n x 2 ) + ω 3 n ( x 1 − ω 0 y 3 = n x 3 ) DFT can now be computed with only 8 additions and 6 multiplications, instead of expected (4 − 1) ∗ 4 = 12 additions and 4 2 = 16 multiplications Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 12 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT Actually, even fewer multiplications are required for this small case, since ω 0 n = 1 , but we have tried to illustrate how algorithm works in general Main point is that computing DFT of original 4 -point sequence has been reduced to computing DFT of its two 2 -point even and odd subsequences This property holds in general: DFT of n -point sequence can be computed by breaking it into two DFTs of half length, provided n is even Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 13 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT General pattern becomes clearer when viewed in terms of first few Fourier matrices 1 1 1 1 � 1 � 1 1 − i − 1 i F 1 = 1 , F 2 = , F 4 = 1 − 1 1 − 1 1 − 1 1 i − 1 − i Let P 4 be permutation matrix 1 0 0 0 0 0 1 0 P 4 = 0 1 0 0 0 0 0 1 Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 14 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT Let D 2 be diagonal matrix � 1 � 0 D 2 = diag(1 , ω 4 ) = 0 − i Then we have 1 1 1 1 � F 2 � 1 − 1 − i i D 2 F 2 F 4 P 4 = = 1 1 − 1 − 1 − D 2 F 2 F 2 1 − 1 i − i Thus, F 4 can be rearranged so that each block is diagonally scaled version of F 2 Such hierarchical splitting can be carried out at each level, provided number of points is even Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 15 / 34
Discrete Fourier Transform Computing DFT Convolution Fast Fourier Transform FFT Algorithm Parallel FFT Computing DFT In general, P n is permutation that groups even-numbered columns of F n before odd-numbered columns, and � � 1 , ω n , . . . , ω ( n/ 2) − 1 D n/ 2 = diag n To apply F n to sequence of length n , we need merely apply F n/ 2 to its even and odd subsequences and scale results, where necessary, by ± D n/ 2 Resulting recursive divide-and-conquer algorithm for computing DFT is called Fast Fourier Transform , or FFT FFT is particular way of computing DFT efficiently Michael T. Heath and Edgar Solomonik Parallel Numerical Algorithms 16 / 34
Recommend
More recommend