Sine/Cosine using CORDIC Algorithm
Prof. Kris Gaj
Gaurav Doshi, Hiren Shah
Outline
• Introduction
• Basic Idea
• CORDIC Principles
• Hardware Implementation
• FPGA & ASIC Results
• Conclusion
Introduction
• CORDIC (COordinate Rotation DIgital Computer)
• Introduced in 1959 by Jack E. Volder
• Efficient for computing sin, cos, tan, sinh, cosh, tanh
• It is a hardware-efficient algorithm
• Iterative algorithm for circular rotation
• No multiplication
• Delay/hardware cost comparable to division or square rooting
Why CORDIC?
• How to evaluate trigonometric functions?
  - Table lookup
  - Polynomial approximations
  - CORDIC
• Compared to other approaches, CORDIC is a clear winner when:
  - A hardware multiplier is unavailable (e.g. microcontroller)
  - You want to save the gates required to implement one (e.g. FPGA)
Basic Ideas
• Embed elementary function evaluation as a generalized rotation operation.
• Decompose the rotation operation into successive basic rotations.
• Each basic rotation can be realized with shift-and-add arithmetic operations.
CORDIC Principles
• Basic idea: rotate (1, 0) by φ degrees to get (x, y), where x = cos(φ) and y = sin(φ).
  [Figure: unit vector rotated by φ from the X axis, with projections cos φ on X and sin φ on Y]
• Rotation of any vector (x, y) to (x', y'):
  $$x' = x\cos(\phi) - y\sin(\phi)$$
  $$y' = y\cos(\phi) + x\sin(\phi)$$
  [Figure: vector (x, y) rotated by φ to (x', y')]
• Rearrange as:
  $$x' = \cos(\phi)\,[\,x - y\tan(\phi)\,]$$
  $$y' = \cos(\phi)\,[\,y + x\tan(\phi)\,]$$
  Note: $\tan(\phi) = \sin(\phi)/\cos(\phi)$
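As a quick numerical check of the rearrangement above, the following minimal Python sketch compares the direct rotation with the factored form. The function names `rotate` and `rotate_factored` are illustrative only and do not come from the slides.

```python
import math

def rotate(x, y, phi):
    """Direct rotation of (x, y) by phi radians."""
    return (x * math.cos(phi) - y * math.sin(phi),
            y * math.cos(phi) + x * math.sin(phi))

def rotate_factored(x, y, phi):
    """Same rotation with cos(phi) factored out, leaving only tan(phi) inside."""
    c, t = math.cos(phi), math.tan(phi)
    return (c * (x - y * t), c * (y + x * t))

phi = math.radians(30.0)
direct = rotate(1.0, 0.0, phi)            # should equal (cos 30°, sin 30°)
factored = rotate_factored(1.0, 0.0, phi)
print(direct, factored)
```

The factored form matters because, once tan(φ) is restricted to powers of two, the bracketed terms need only a shift and an add.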
Key Idea
The rotation φ can be computed in steps, where each step is of size
Expansion Factor K
• $K = \prod_i 1/\cos(\phi_i)$ depends on the rotation angles φ_1, φ_2, φ_3, ..., φ_n.
• Since the same angles are always rotated, K is a constant that can be pre-computed.
• K = 1.646760258121
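The constant can be reproduced numerically. This is a minimal sketch, assuming the step angles are α_i = tan⁻¹(2⁻ⁱ) as in the angle example elsewhere in these slides, and that K is the accumulated growth from dropping the cos factor in every step. The name `cordic_gain` is hypothetical.

```python
import math

def cordic_gain(n):
    """Product of 1/cos(alpha_i) for alpha_i = atan(2**-i), i = 0..n-1."""
    k = 1.0
    for i in range(n):
        # 1/cos(atan(t)) = sqrt(1 + t*t), with t = 2**-i
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

K = cordic_gain(32)
print(f"{K:.12f}")
```

In practice the starting vector is often pre-scaled by 1/K so that no multiplication is needed after the last iteration.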
Iterative rotations
• d_i: rotation decision (rotation mode)
• z_i is introduced to keep track of the angle that remains to be rotated (z_0 = φ)
• d_i = -1 if z_i < 0, +1 otherwise
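The decision rule above can be sketched in floating-point Python. The update equations used here are the commonly cited rotation-mode recurrences (x ← x − d·y·2⁻ⁱ, y ← y + d·x·2⁻ⁱ, z ← z − d·α_i); they are an assumption, since the slide text itself does not spell them out. The iteration count `N = 16` and the name `cordic_sin_cos` are likewise illustrative.

```python
import math

N = 16  # number of iterations (assumed; not fixed by the slides)
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]   # alpha_i = atan(2^-i)
K = 1.0
for i in range(N):
    K *= math.sqrt(1.0 + 2.0 ** (-2 * i))           # gain of N unscaled steps

def cordic_sin_cos(phi):
    """Rotation-mode CORDIC for phi in radians, |phi| up to about 1.74."""
    x, y, z = 1.0 / K, 0.0, phi      # pre-scale by 1/K to cancel the gain
    for i in range(N):
        d = -1.0 if z < 0 else 1.0   # rotate toward z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return y, x                      # (sin, cos)

s, c = cordic_sin_cos(math.radians(30.0))
print(s, c)
```

The multiplications by `2.0 ** -i` stand in for right shifts; in hardware they cost wiring or a shifter, not a multiplier.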
Example: Rewriting Angles in Terms of α_i
• Find α_i such that tan(α_i) = 2^-i (or, α_i = tan^-1(2^-i))
• Example: φ = 30.0°
  - Start with α_0 = 45.0 (> 30.0)
  - 45.0 - 26.6 = 18.4 (< 30.0)
  - 18.4 + 14.0 = 32.4 (> 30.0)
  - 32.4 - 7.1 = 25.3 (< 30.0)
  - 25.3 + 3.6 = 28.9 (< 30.0)
  - 28.9 + 1.8 = 30.7 (> 30.0)
  - ...
• φ = 30.0 ≈ 45.0 - 26.6 + 14.0 - 7.1 + 3.6 + 1.8 - 0.9 + 0.4 - 0.2 + 0.1 = 30.1
[Figure: X/Y plane showing the 45° and 30° directions]
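The decomposition above can be reproduced with a short Python sketch. It assumes the greedy rule implied by the example: add the next α_i while the running total is at or below the target, subtract it otherwise. The name `decompose` is hypothetical.

```python
import math

def decompose(phi_deg, n):
    """Approximate phi_deg as a signed sum of alpha_i = atan(2**-i) in degrees."""
    total, signs = 0.0, []
    for i in range(n):
        alpha = math.degrees(math.atan(2.0 ** -i))
        d = 1 if total <= phi_deg else -1   # overshoot: step back; else step on
        total += d * alpha
        signs.append(d)
    return signs, total

signs, total = decompose(30.0, 9)
print(signs, round(total, 3))
```

Because the sketch uses unrounded angles, its running totals differ slightly from the one-decimal figures in the example, and very late steps can take a different sign than rounded arithmetic would suggest.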
Sequential/Iterative CORDIC
Sequential CORDIC (cont.)
• Needs the maximum number of clock cycles to calculate the output
• Has the minimum clock period per iteration
• Variable shifters do not map well onto certain FPGAs due to high fan-in
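A rough software model of the sequential datapath, using only integer shifts and adds, might look like the sketch below. The word format (`FRAC = 16` fractional bits), the iteration count, and the names `ATAN_ROM` and `cordic_sequential` are assumptions made for illustration; they are not taken from the authors' Verilog design.

```python
import math

FRAC = 16                 # fractional bits (assumed fixed-point format)
N = 16                    # iterations; one per clock cycle in this model
ONE = 1 << FRAC
# Lookup table of atan(2^-i), scaled to fixed point
ATAN_ROM = [round(math.atan(2.0 ** -i) * ONE) for i in range(N)]
K = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(N))
X0 = round(ONE / K)       # pre-scaled start value, so no final multiply

def cordic_sequential(phi):
    """Return (sin, cos) of phi (radians) using integer shift-and-add only."""
    x, y, z = X0, 0, round(phi * ONE)
    for i in range(N):                 # each pass models one clock cycle
        xs, ys = x >> i, y >> i        # variable shifter: amount changes per cycle
        if z < 0:
            x, y, z = x + ys, y - xs, z + ATAN_ROM[i]
        else:
            x, y, z = x - ys, y + xs, z - ATAN_ROM[i]
    return y / ONE, x / ONE

s, c = cordic_sequential(math.radians(30.0))
print(s, c)
```

The shift amount `i` differs in every cycle, which is why this architecture needs a variable (barrel) shifter, while one adder set is reused for all iterations.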
Parallel/Cascaded CORDIC
Parallel CORDIC (cont.)
• Combinational circuit
• More delay, but processing time is reduced compared to the iterative circuit.
• Shifters are of fixed shift, so they can be implemented in the wiring.
• Constants can be hardwired instead of requiring storage space.
Pipeline Architecture
[Figure: 32-bit pipeline architecture]
Parallel Pipelined CORDIC
• Parallel CORDIC can be pipelined by inserting registers between the adder stages.
• In most FPGA architectures there are already registers present in each logic cell, so pipeline registers have no hardware cost.
• The number of stages after which a pipeline register is inserted can be chosen according to the clock frequency of the system.
• When operating at a greater clock period, power consumption in later stages reduces due to less switching activity in each clock period.
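The effect of the pipeline registers can be illustrated with a small cycle-by-cycle Python model. It is a behavioural sketch only: it assumes one register after every stage, uses floating point instead of a fixed word length, and the names `stage` and `run_pipeline` are hypothetical.

```python
import math

N = 12  # number of stages (assumed for this sketch)
ATAN = [math.atan(2.0 ** -i) for i in range(N)]
K = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(N))

def stage(i, state):
    """One CORDIC stage with a fixed shift of i positions."""
    x, y, z = state
    d = -1.0 if z < 0 else 1.0
    return (x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i, z - d * ATAN[i])

def run_pipeline(angles):
    """Feed one angle per cycle; return (sin, cos) results in input order."""
    regs = [None] * N                      # register after each stage
    results = []
    for phi in list(angles) + [None] * (N - 1):   # extra cycles flush the pipe
        new = [None] * N
        new[0] = None if phi is None else stage(0, (1.0 / K, 0.0, phi))
        for i in range(1, N):
            new[i] = None if regs[i - 1] is None else stage(i, regs[i - 1])
        regs = new                         # clock edge: all registers update
        if regs[N - 1] is not None:
            x, y, _ = regs[N - 1]
            results.append((y, x))
    return results

results = run_pipeline([0.1, 0.5, 1.0])
print(results)
```

In this model the first result appears after N cycles and one further result follows every cycle, so M inputs take M + N − 1 cycles instead of M × N in a purely iterative design.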
Redundant Addition
• The main delay in the critical path of the CORDIC iteration is that of the adder.
• To reduce this delay we can use redundant adders.
• In a signed-digit number system, addition becomes carry-free.
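Example
• r = 10, digit set [0, 9]

      5   7   8   2   4   9
  +   6   2   9   3   8   9
  -------------------------
     11   9  17   5  12  18    position sums, in [0, 18]
     11   9   7   5  12   8    interim sums, in [0, 16]
  0   0   1   0   0   1        transfer digits, in [0, 2]
  -------------------------
     11  10   7   5  13   8    final sum, in [0, 16]

Each position sum above 16 is split into an interim sum and a transfer of 1 into the next higher position (17 = 7 + 10, 18 = 8 + 10), so no transfer propagates further than one position.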
Example
• r = 2, digit set [-1, 0, 1]

          1   0   0  -1    7 in decimal
  +       1   0  -1   0    6 in decimal
  ---------------------
      1   0  -1   0        C_i
      0   0   0   1  -1    U_i
  ---------------------
      1   0  -1   1  -1    13 in decimal
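The radix-2 example can be checked with a short Python sketch of carry-free signed-digit addition. The selection rule used here (choose the transfer digit by looking at the signs of the two digits one position lower) is one standard textbook scheme and is an assumption; the slides do not state which rule the authors' adder uses. The names `sd_add` and `sd_value` are hypothetical.

```python
def sd_value(digits):
    """Integer value of a radix-2 signed-digit number, MSB first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value

def sd_add(a, b):
    """Add two equal-length signed-digit numbers (digits in {-1, 0, 1}, MSB first)."""
    n = len(a)
    a, b = a[::-1], b[::-1]          # work LSB first
    u = [0] * n                      # interim sum digits U_i
    c = [0] * (n + 1)                # transfer digits; c[i + 1] comes from position i
    for i in range(n):
        p = a[i] + b[i]
        # A +1 transfer can arrive from below only if both lower digits are >= 0
        lower_nonneg = i == 0 or (a[i - 1] >= 0 and b[i - 1] >= 0)
        if p == 2:
            c[i + 1], u[i] = 1, 0
        elif p == 1:
            c[i + 1], u[i] = (1, -1) if lower_nonneg else (0, 1)
        elif p == -1:
            c[i + 1], u[i] = (0, -1) if lower_nonneg else (-1, 1)
        elif p == -2:
            c[i + 1], u[i] = -1, 0
    # Each result digit depends only on u[i] and c[i]: no carry chain
    s = [u[i] + c[i] for i in range(n)] + [c[n]]
    return s[::-1]

total = sd_add([1, 0, 0, -1], [1, 0, -1, 0])   # 7 + 6
print(total, sd_value(total))
```

Every result digit depends on at most two neighbouring input positions, so the adder delay is independent of the word length, which is the property that shortens the critical path of a CORDIC stage.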
Language, Platform, Tools
• Language: Verilog HDL
• Platform
  - Xilinx FPGA
  - ASIC: TSMC library
• Tools
  - Aldec Active-HDL, Synplify Pro, Xilinx ISE (Windows platform)
  - Cadence: Verilog-XL & Simvision
  - Synopsys Design Analyzer (Unix platform)
FPGA (3s200ft256) Results

              Sequential   Parallel   Pipeline
LUT                  597        864        867
Gate Count          5833      10371      14536

Path Delay: 9.7, 9.7
ASIC (TSMC) Results

               Sequential   Parallel   Pipeline   Parallel - SD Adder
Area                 9138      39019      62937                356015
Power              554 µW    22.9 mW       3 mW               27.4 mW
Arrival Time          9.7       28.3        9.7                  7.82
Clock Frequency
• FPGA: 85 MHz
• ASIC: 150 MHz