Chapter 8: Fast Convolution Keshab K. Parhi
Chapter 8 Fast Convolution
• Introduction
• Cook-Toom Algorithm and Modified Cook-Toom Algorithm
• Winograd Algorithm and Modified Winograd Algorithm
• Iterated Convolution
• Cyclic Convolution
• Design of Fast Convolution Algorithm by Inspection
Introduction
• Fast Convolution: implementation of a convolution algorithm using fewer multiplication operations by algorithmic strength reduction
• Algorithmic Strength Reduction: the number of strong operations (such as multiplication operations) is reduced at the expense of an increase in the number of weak operations (such as addition operations). These are best suited for implementation using either programmable or dedicated hardware
• Example: Reducing the multiplication complexity in complex number multiplication:
– Assume $(a+jb)(c+jd) = e+jf$. It can be expressed in matrix form, which requires 4 multiplications and 2 additions:
$$\begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} c & -d \\ d & c \end{bmatrix} \cdot \begin{bmatrix} a \\ b \end{bmatrix}$$
– However, the number of multiplications can be reduced to 3 at the expense of 3 extra additions by using:
$$ac - bd = a(c-d) + d(a-b)$$
$$ad + bc = b(c+d) + d(a-b)$$
– Rewriting it in matrix form, its coefficient matrix can be decomposed as the product of a 2×3 matrix (C), a 3×3 matrix (H), and a 3×2 matrix (D):
$$s = \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} c-d & 0 & 0 \\ 0 & c+d & 0 \\ 0 & 0 & d \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & -1 \end{bmatrix} \cdot \begin{bmatrix} a \\ b \end{bmatrix} = C \cdot H \cdot D \cdot x$$
• Where C is a post-addition matrix (requires 2 additions), D is a pre-addition matrix (requires 1 addition), and H is a diagonal matrix (requires 2 additions to get its diagonal elements)
– So, the arithmetic complexity is reduced to 3 multiplications and 3 additions (not including the additions in the H matrix)
• In this chapter we will discuss two well-known approaches to the design of fast short-length convolution algorithms: the Cook-Toom algorithm (based on Lagrange interpolation) and the Winograd algorithm (based on the Chinese remainder theorem)
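The three-multiplication complex product from the example above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `complex_mul_3` and the intermediate names `h0`, `h1`, `h2`, `m0`, `m1`, `m2` are assumptions chosen to mirror the C·H·D·x factorization.

```python
def complex_mul_3(a, b, c, d):
    """Return (e, f) with e + jf = (a + jb)(c + jd), using 3 multiplications."""
    # Diagonal entries of H: 2 additions, which can be precomputed
    # when c and d are fixed coefficients.
    h0 = c - d
    h1 = c + d
    h2 = d

    # Pre-addition stage (matrix D): 1 addition.
    x2 = a - b

    # The only 3 general multiplications (diagonal matrix H).
    m0 = h0 * a
    m1 = h1 * b
    m2 = h2 * x2

    # Post-addition stage (matrix C): 2 additions.
    e = m0 + m2   # equals ac - bd
    f = m1 + m2   # equals ad + bc
    return e, f
```

For example, `complex_mul_3(1, 2, 3, 4)` yields the real and imaginary parts of (1+2j)(3+4j), which can be checked against Python's built-in complex arithmetic.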
Cook-Toom Algorithm
• A linear convolution algorithm for polynomial multiplication based on the Lagrange Interpolation Theorem
• Lagrange Interpolation Theorem: Let $\beta_0, \ldots, \beta_n$ be a set of $n+1$ distinct points, and let $f(\beta_i)$, for $i = 0, 1, \ldots, n$, be given. There is exactly one polynomial $f(p)$ of degree $n$ or less that has value $f(\beta_i)$ when evaluated at $\beta_i$ for $i = 0, 1, \ldots, n$. It is given by:
$$f(p) = \sum_{i=0}^{n} f(\beta_i) \frac{\prod_{j \neq i} (p - \beta_j)}{\prod_{j \neq i} (\beta_i - \beta_j)}$$
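As a small worked instance of the theorem (the numbers are illustrative assumptions, not taken from the slides), take $n = 1$ with points $\beta_0 = 0$, $\beta_1 = 1$ and given values $f(\beta_0) = 3$, $f(\beta_1) = 5$. The formula then reduces to the straight line through the two points:

```latex
f(p) = f(\beta_0)\,\frac{p - \beta_1}{\beta_0 - \beta_1}
     + f(\beta_1)\,\frac{p - \beta_0}{\beta_1 - \beta_0}
     = 3\,\frac{p - 1}{0 - 1} + 5\,\frac{p - 0}{1 - 0}
     = 3(1 - p) + 5p
     = 3 + 2p
```

Substituting back gives $f(0) = 3$ and $f(1) = 5$, as required.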
• The application of the Lagrange interpolation theorem to linear convolution
Consider an N-point sequence $h = \{h_0, h_1, \ldots, h_{N-1}\}$ and an L-point sequence $x = \{x_0, x_1, \ldots, x_{L-1}\}$. The linear convolution of h and x can be expressed in terms of polynomial multiplication as follows: $s(p) = h(p) \cdot x(p)$, where
$$h(p) = h_{N-1} p^{N-1} + \cdots + h_1 p + h_0$$
$$x(p) = x_{L-1} p^{L-1} + \cdots + x_1 p + x_0$$
$$s(p) = s_{L+N-2} p^{L+N-2} + \cdots + s_1 p + s_0$$
The output polynomial $s(p)$ has degree $L+N-2$, so it has $L+N-1$ coefficients.
• (continued)
$s(p)$ can be uniquely determined by its values at $L+N-1$ different points. Let $\{\beta_0, \beta_1, \ldots, \beta_{L+N-2}\}$ be $L+N-1$ different real numbers. If $s(\beta_i)$ for $i = \{0, 1, \ldots, L+N-2\}$ are known, then $s(p)$ can be computed using the Lagrange interpolation theorem as:
$$s(p) = \sum_{i=0}^{L+N-2} s(\beta_i) \frac{\prod_{j \neq i} (p - \beta_j)}{\prod_{j \neq i} (\beta_i - \beta_j)}$$
It can be proved that this equation is the unique solution to compute the linear convolution $s(p)$ given the values of $s(\beta_i)$, for $i = \{0, 1, \ldots, L+N-2\}$.
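The link between linear convolution, polynomial multiplication, and sampling at $L+N-1$ points can be checked numerically. The sketch below is an illustration under assumed inputs: the helper names `linear_convolution` and `evaluate`, the sample sequences, and the chosen points are not from the slides. It computes $s(p)$ directly with $N \cdot L$ multiplications and confirms that $s(\beta_i) = h(\beta_i) \cdot x(\beta_i)$ at each point.

```python
def linear_convolution(h, x):
    """Coefficients of s(p) = h(p) x(p), lowest power first."""
    s = [0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            s[i + j] += hi * xj   # one multiplication per (i, j) pair: N*L in total
    return s


def evaluate(coeffs, p):
    """Evaluate a polynomial (lowest power first) at p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * p + c
    return acc


h = [1, 2, 3]                   # N = 3, so h(p) = 1 + 2p + 3p^2
x = [4, 5]                      # L = 2, so x(p) = 4 + 5p
s = linear_convolution(h, x)    # L + N - 1 = 4 coefficients

betas = [0, 1, -1, 2]           # L + N - 1 = 4 distinct points
samples_match = all(
    evaluate(s, b) == evaluate(h, b) * evaluate(x, b) for b in betas
)
```

Since the four samples agree and a polynomial of degree 3 is fixed by four points, the products $h(\beta_i) x(\beta_i)$ carry all the information needed to recover $s(p)$.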
• Cook-Toom Algorithm (Algorithm Description)
1. Choose $L+N-1$ different real numbers $\beta_0, \beta_1, \cdots, \beta_{L+N-2}$
2. Compute $h(\beta_i)$ and $x(\beta_i)$, for $i = \{0, 1, \cdots, L+N-2\}$
3. Compute $s(\beta_i) = h(\beta_i) \cdot x(\beta_i)$, for $i = \{0, 1, \cdots, L+N-2\}$
4. Compute $s(p)$ by using
$$s(p) = \sum_{i=0}^{L+N-2} s(\beta_i) \frac{\prod_{j \neq i} (p - \beta_j)}{\prod_{j \neq i} (\beta_i - \beta_j)}$$
• Algorithm Complexity
– The goal of the fast-convolution algorithm is to reduce the multiplication complexity. So, if the $\beta_i$'s ($i = 0, 1, \ldots, L+N-2$) are chosen properly, the computation in step 2 involves some additions and multiplications by small constants
– The multiplications are only used in step 3 to compute $s(\beta_i)$. So, only $L+N-1$ multiplications are needed
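The four steps above can be sketched generically in Python. This is a minimal reference sketch under stated assumptions, not an optimized implementation: the names `cook_toom`, `poly_eval`, and `poly_mul_linear` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is assumed so that the interpolation is free of rounding error, and the evaluation and interpolation stages use ordinary multiplications by the $\beta_i$ rather than the addition-only networks a hardware design would derive for well-chosen points.

```python
from fractions import Fraction


def poly_eval(coeffs, point):
    """Evaluate a polynomial (lowest power first) at `point` by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * point + c
    return acc


def poly_mul_linear(poly, root):
    """Multiply poly(p) by (p - root); coefficients are lowest power first."""
    out = [Fraction(0)] * (len(poly) + 1)
    for k, c in enumerate(poly):
        out[k + 1] += c          # contribution of c * p^k * p
        out[k] -= root * c       # contribution of c * p^k * (-root)
    return out


def cook_toom(h, x, betas):
    """Linear convolution of h and x via evaluation at `betas` and interpolation."""
    n_out = len(h) + len(x) - 1
    # Step 1: the caller supplies L + N - 1 distinct points.
    if len(betas) != n_out or len(set(betas)) != n_out:
        raise ValueError("need L + N - 1 distinct interpolation points")
    betas = [Fraction(b) for b in betas]
    h = [Fraction(v) for v in h]
    x = [Fraction(v) for v in x]

    # Step 2: evaluate h(p) and x(p) at every point.
    h_vals = [poly_eval(h, b) for b in betas]
    x_vals = [poly_eval(x, b) for b in betas]

    # Step 3: the L + N - 1 general multiplications.
    s_vals = [hv * xv for hv, xv in zip(h_vals, x_vals)]

    # Step 4: Lagrange interpolation back to the coefficients of s(p).
    s = [Fraction(0)] * n_out
    for i, bi in enumerate(betas):
        basis = [Fraction(1)]    # numerator product over j != i
        denom = Fraction(1)      # denominator product over j != i
        for j, bj in enumerate(betas):
            if j != i:
                basis = poly_mul_linear(basis, bj)
                denom *= bi - bj
        for k, c in enumerate(basis):
            s[k] += s_vals[i] * c / denom
    return s
```

With h = {1, 2}, x = {3, 4} and points {0, 1, -1}, `cook_toom([1, 2], [3, 4], [0, 1, -1])` returns coefficients equal to 3, 10, 8, matching the direct convolution. Only step 3 multiplies data by data; the rest depends on the points alone.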
– By the Cook-Toom algorithm, the number of multiplications is reduced from $O(LN)$ to $L+N-1$ at the expense of an increase in the number of additions
– An adder has much less area and computation time than a multiplier. So, the Cook-Toom algorithm can lead to large savings in hardware (VLSI) complexity and generate computationally efficient implementations
• Example-1: (Example 8.2.1, p.230) Construct a 2×2 convolution algorithm using the Cook-Toom algorithm with $\beta = \{0, 1, -1\}$
– Write the 2×2 convolution in polynomial multiplication form as $s(p) = h(p)x(p)$, where $h(p) = h_0 + h_1 p$, $x(p) = x_0 + x_1 p$, and $s(p) = s_0 + s_1 p + s_2 p^2$
– Direct implementation, which requires 4 multiplications and 1 addition, can be expressed in matrix form as follows:
$$\begin{bmatrix} s_0 \\ s_1 \\ s_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 \\ h_1 & h_0 \\ 0 & h_1 \end{bmatrix} \cdot \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}$$
• Example-1 (continued)
– Next we use the C-T algorithm to get an efficient convolution implementation with a reduced number of multiplications:
$$\beta_0 = 0, \quad h(\beta_0) = h_0, \quad x(\beta_0) = x_0$$
$$\beta_1 = 1, \quad h(\beta_1) = h_0 + h_1, \quad x(\beta_1) = x_0 + x_1$$
$$\beta_2 = -1, \quad h(\beta_2) = h_0 - h_1, \quad x(\beta_2) = x_0 - x_1$$
– Then, $s(\beta_0)$, $s(\beta_1)$, and $s(\beta_2)$ are calculated, by using 3 multiplications, as
$$s(\beta_0) = h(\beta_0) x(\beta_0), \quad s(\beta_1) = h(\beta_1) x(\beta_1), \quad s(\beta_2) = h(\beta_2) x(\beta_2)$$
– From the Lagrange Interpolation theorem, we get:
$$s(p) = s(\beta_0) \frac{(p - \beta_1)(p - \beta_2)}{(\beta_0 - \beta_1)(\beta_0 - \beta_2)} + s(\beta_1) \frac{(p - \beta_0)(p - \beta_2)}{(\beta_1 - \beta_0)(\beta_1 - \beta_2)} + s(\beta_2) \frac{(p - \beta_0)(p - \beta_1)}{(\beta_2 - \beta_0)(\beta_2 - \beta_1)}$$
$$= s(\beta_0) + p \left( \frac{s(\beta_1)}{2} - \frac{s(\beta_2)}{2} \right) + p^2 \left( -s(\beta_0) + \frac{s(\beta_1)}{2} + \frac{s(\beta_2)}{2} \right)$$
$$= s_0 + p s_1 + p^2 s_2$$
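• Example-1 (continued)
– The preceding computation leads to the following matrix form
$$\begin{bmatrix} s_0 \\ s_1 \\ s_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ -1 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} s(\beta_0) \\ s(\beta_1)/2 \\ s(\beta_2)/2 \end{bmatrix}$$
$$= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ -1 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} h(\beta_0) & 0 & 0 \\ 0 & h(\beta_1)/2 & 0 \\ 0 & 0 & h(\beta_2)/2 \end{bmatrix} \cdot \begin{bmatrix} x(\beta_0) \\ x(\beta_1) \\ x(\beta_2) \end{bmatrix}$$
$$= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ -1 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} h_0 & 0 & 0 \\ 0 & (h_0 + h_1)/2 & 0 \\ 0 & 0 & (h_0 - h_1)/2 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & -1 \end{bmatrix} \cdot \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}$$
– The computation is carried out as follows (5 additions, 3 multiplications)
1. $H_0 = h_0, \quad H_1 = \frac{h_0 + h_1}{2}, \quad H_2 = \frac{h_0 - h_1}{2}$ (pre-computed)
2. $X_0 = x_0, \quad X_1 = x_0 + x_1, \quad X_2 = x_0 - x_1$
3. $S_0 = H_0 X_0, \quad S_1 = H_1 X_1, \quad S_2 = H_2 X_2$
4. $s_0 = S_0, \quad s_1 = S_1 - S_2, \quad s_2 = -S_0 + S_1 + S_2$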
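The four-step 2×2 computation above maps directly onto code. The sketch below is illustrative only: the function names `precompute_h` and `cook_toom_2x2` are assumptions, and the filter-side values $H_0, H_1, H_2$ are computed once in a separate function to reflect that they are pre-computed.

```python
def precompute_h(h0, h1):
    """Step 1: filter-side values, computed once and reused for every input."""
    return h0, (h0 + h1) / 2, (h0 - h1) / 2


def cook_toom_2x2(H, x0, x1):
    """Steps 2-4: 2x2 linear convolution with 3 multiplications and 5 additions."""
    H0, H1, H2 = H

    # Step 2: pre-additions on the input (2 additions).
    X0 = x0
    X1 = x0 + x1
    X2 = x0 - x1

    # Step 3: the only 3 multiplications.
    S0 = H0 * X0
    S1 = H1 * X1
    S2 = H2 * X2

    # Step 4: post-additions (1 + 2 = 3 additions).
    s0 = S0
    s1 = S1 - S2
    s2 = -S0 + S1 + S2
    return s0, s1, s2
```

For h = {1, 2} and x = {3, 4}, `cook_toom_2x2(precompute_h(1, 2), 3, 4)` gives values equal to 3, 10, 8, the same as the direct matrix form $s_0 = h_0 x_0$, $s_1 = h_1 x_0 + h_0 x_1$, $s_2 = h_1 x_1$.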