
Systems IEEE 754 Format Shankar Balachandran* Associate Professor, - PowerPoint PPT Presentation

Spring 2015 Week 9 Additional Module Digital Circuits and Systems IEEE 754 Format Shankar Balachandran* Associate Professor, CSE Department Indian Institute of Technology Madras *Currently a Visiting Professor at IIT Bombay


  1. Spring 2015, Week 9: Additional Module. Digital Circuits and Systems: IEEE 754 Format. Shankar Balachandran*, Associate Professor, CSE Department, Indian Institute of Technology Madras (*currently a Visiting Professor at IIT Bombay).

  2. There is no video corresponding to this file.

  3. Floating-Point Number Representation

     Field layout: | S (1 bit) | Exponent E (p bits) | Mantissa M (m bits) |

     - Sign bit: S is the sign of the floating-point number.
     - Exponent: p-bit exponent (E) in excess-B code.
     - Mantissa: m-bit unsigned mantissa (M).
     - Radix: R is the radix for the representation.
     - The actual value of the number represented above is:
       - $F = (-1)^S \times 1.M \times R^{E-B}$ (if normalized)
       - $F = (-1)^S \times M \times R^{E-B}$ (if unnormalized)

     Arithmetic Circuits
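The value formula can be evaluated exactly with rational arithmetic. The following is a minimal sketch, not part of the original slides: the function name `decode` and its parameters are hypothetical, and it assumes the mantissa field M is read as the binary fraction 0.M (so the unnormalized case evaluates 0.M rather than an integer M).

```python
from fractions import Fraction

def decode(sign, exponent, mantissa, m_bits, bias, radix=2, normalized=True):
    """Evaluate F = (-1)^S * (1.M or M) * R^(E - B) as an exact Fraction.

    Assumption: the m_bits-wide mantissa field is interpreted as 0.M.
    """
    frac = Fraction(mantissa, 2 ** m_bits)            # 0.M as a binary fraction
    significand = 1 + frac if normalized else frac    # 1.M when normalized
    scale = Fraction(radix) ** (exponent - bias)      # R^(E - B), may be < 1
    return (-1) ** sign * significand * scale

# Toy format: 8-bit mantissa, excess-4 exponent, radix 2.
print(decode(0, 0b011, 0b11000000, m_bits=8, bias=4))  # 7/8
```

Using `Fraction` avoids any rounding in the check itself, which is convenient when the format under study is narrower than the host machine's floats.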

  4. IEEE Floating-Point Format

     - Single Precision Format (32 bits):

       | S (1 bit, bit 31) | Exponent E (8 bits, bits 30-23) | unsigned Mantissa M (23 bits, bits 22-0) |

       $F = (-1)^S \times 1.M \times 2^{E-127}$

     - Special reserved values:
       - E = 0 with M = 0 represents zero
       - E = 255 with M = 0 represents ±∞
       - E = 255 with M ≠ 0 represents NaN

     - Double Precision Format (64 bits):

       | S (1 bit) | Exponent E (11 bits) | unsigned Mantissa M (52 bits) |

       $F = (-1)^S \times 1.M \times 2^{E-1023}$
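As an illustration of the single-precision layout and the reserved values, here is a small sketch that splits a 32-bit word into its three fields. The helper names `fields`, `classify`, and `to_float` are hypothetical and not from the slides; only Python's standard `struct` module is used.

```python
import struct

def fields(word):
    """Split a 32-bit word into (S, E, M): bit 31, bits 30-23, bits 22-0."""
    return (word >> 31) & 0x1, (word >> 23) & 0xFF, word & 0x7FFFFF

def classify(word):
    """Apply the reserved-value rules listed on the slide."""
    s, e, m = fields(word)
    if e == 0 and m == 0:
        return "zero"
    if e == 255:
        return "infinity" if m == 0 else "nan"
    # Everything else: normalized values, plus E = 0 with M != 0,
    # which the slide does not cover.
    return "other"

def to_float(word):
    """Reinterpret the 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

print(fields(0x43E70000))    # (0, 135, 6750208)
print(to_float(0x43E70000))  # 462.0
print(classify(0x7F800000))  # infinity
```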

  5. Examples

     Results are written as sign bit, exponent in hex, and mantissa in hex (the 23 mantissa bits padded with a trailing 0 to 24 bits).

     - Convert $4.62 \times 10^2$ to IEEE single precision format:
       - $4.62 \times 10^2 = 462 = 111001110.0_2 = 1.11001110_2 \times 2^8$
       - Mantissa = 11001110 (followed by zeros); Exponent = 8 + 127 = 135 = 10000111
       - 0 1000 0111 1100 1110 0000 0000 0000 000 = 0 87 CE0000

     - Convert $-0.456 \times 2^{-3}$ to IEEE single precision format:
       - $-0.456 \times 2^{-3} = -0.0111\,0100\,1011\,1100\,0110\,1010\,0111\,1110\ldots_2 \times 2^{-3} = -1.1101\,0010\,1111\,0001\,1010\,101_2 \times 2^{-5}$ (mantissa rounded to 23 bits)
       - Exponent = -5 + 127 = 122 = 01111010
       - 1 0111 1010 1101 0010 1111 0001 1010 101 = 1 7A D2F1AA

     - Convert 1 81 99999A to decimal representation:
       - Sign = 1 (negative); significand = 1.1001 1001 1001 1001 1001 101
       - Exponent = $81_{16} - 127_{10} = 129 - 127 = 2$
       - Value = $-1.1001\,1001\,1001\,1001\,1001\,101_2 \times 2^2 = -110.0110\,0110\,0110\,0110\,0110\,1_2 \approx -6.4$
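The three conversions can be cross-checked on any machine with IEEE 754 single-precision support. This is a sketch under assumptions: the helpers `float_to_word` and `slide_notation` are hypothetical names, and `slide_notation` reproduces the sign / exponent / mantissa hex grouping used in the examples by left-aligning the 23 mantissa bits into 24 bits.

```python
import struct

def float_to_word(x):
    """Round x to single precision and return its 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def slide_notation(word):
    """Format a 32-bit word as 'S EE MMMMMM' (sign, exponent hex, mantissa hex)."""
    s = word >> 31
    e = (word >> 23) & 0xFF
    m = (word & 0x7FFFFF) << 1   # pad 23 mantissa bits with a trailing 0
    return f"{s} {e:02X} {m:06X}"

print(slide_notation(float_to_word(462.0)))            # 0 87 CE0000
print(slide_notation(float_to_word(-0.456 * 2**-3)))   # 1 7A D2F1AA
print(slide_notation(float_to_word(-6.4)))             # 1 81 99999A
```

Note that `struct.pack(">f", ...)` rounds to the nearest representable single-precision value, which is why the last mantissa bit of the second example comes out as 1 rather than 0.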

  6. Floating-Point Addition

     How to add two floating-point numbers, $(M_a, E_a) + (M_b, E_b)$?

     1. Place the number with the larger exponent in register 1 and the other in register 2.
     2. $d = E_1 - E_2$
     3. Right-shift mantissa $M_2$ by d bits (i.e., left-shift the radix point by d bits).
     4. $M_{SUM} = M_1 + M_2$; $E_{SUM} = E_1$
     5. If $M_{SUM} \ge 2$, renormalize (post-normalization) by dividing by 2 (shifting right) and incrementing $E_{SUM}$.
     6. Rounding may be required to store the result in the same number of bits (precision) as the inputs.
     7. Result = $(M_{SUM}, E_{SUM})$.
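The addition steps can be sketched with integer arithmetic. This is an illustrative sketch only, under stated assumptions: the function name `fp_add` is hypothetical, both operands are positive and normalized with a hidden leading 1, and bits shifted out during alignment or renormalization are simply truncated rather than rounded.

```python
def fp_add(a, b, m_bits=8):
    """Add two positive normalized numbers given as (M, E) pairs.

    M is the m_bits-wide mantissa field (hidden leading 1 not stored),
    E is the biased exponent. Shifted-out bits are truncated (assumption).
    """
    # Step 1: the operand with the larger exponent goes in "register 1".
    (m1, e1), (m2, e2) = sorted([a, b], key=lambda t: t[1], reverse=True)
    hidden = 1 << m_bits
    s1, s2 = hidden | m1, hidden | m2   # restore the hidden bit: 1.M
    d = e1 - e2                         # Step 2: exponent difference
    s2 >>= d                            # Step 3: align the smaller operand
    s_sum, e_sum = s1 + s2, e1          # Step 4: add mantissas
    if s_sum >= 2 * hidden:             # Step 5: M_SUM >= 2, renormalize
        s_sum >>= 1
        e_sum += 1
    # Step 6 (rounding) is reduced to truncation in this sketch.
    return s_sum & (hidden - 1), e_sum  # Step 7: drop the hidden bit

m, e = fp_add((0b11000000, 0b011), (0b10101100, 0b100))
print(f"({m:08b}, {e:03b})")  # (01000110, 101)
```

A hardware implementation would typically keep extra guard bits during the shift so that step 6 can round properly; that refinement is left out here to keep the sketch close to the listed steps.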

  7. Example: Addition

     - Add (11000000, 011) to (10101100, 100). The input numbers are in normalized form with an excess-4 exponent.
     - That is, $(1.11000000 \times 2^{011-100}) + (1.10101100 \times 2^{100-100}) = ?$
     - Since 100 > 011: $M_1 = 1.10101100$, $E_1 = 100$ and $M_2 = 1.11000000$, $E_2 = 011$.
     - $d = E_1 - E_2 = 100 - 011 = 001$
     - Right-shift $M_2$ by 001 (1) bit: $M_2 = 0.11100000$, $E_2 = 100$.
     - $M_{SUM} = 1.10101100 + 0.11100000 = 10.10001100$ and $E_{SUM} = E_1 = 100$.
     - Post-normalize: $M_{SUM} = 1.01000110$ and $E_{SUM} = 101$.
     - Therefore, $(1.11000000 \times 2^{011-100}) + (1.10101100 \times 2^{100-100}) = (1.01000110 \times 2^{101-100})$.
     - Or, (11000000, 011) + (10101100, 100) = (01000110, 101).
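As a sanity check, the same addition can be redone in decimal. The conversions below are worked out here and are not part of the original slide; in this case the bit dropped during post-normalization is 0, so the result is exact.

$$
\begin{aligned}
1.11000000_2 \times 2^{-1} &= 1.75 \times 0.5 = 0.875 \\
1.10101100_2 \times 2^{0} &= 1.671875 \\
0.875 + 1.671875 &= 2.546875 \\
1.01000110_2 \times 2^{1} &= 1.2734375 \times 2 = 2.546875
\end{aligned}
$$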

  8. Floating-Point Multiplication

     How to multiply two floating-point numbers, $(M_a, E_a) \times (M_b, E_b)$?

     1. $M_{PROD} = M_1 \times M_2$
     2. $E_{PROD} = E_1 + E_2 - \text{bias}$
     3. Post-normalize $M_{PROD}$ by shifting by an appropriate amount and then updating $E_{PROD}$ by the same amount.
     4. Rounding may be required to store the result in the same number of bits (precision) as the inputs.
     5. If necessary, normalize and update $E_{PROD}$.
     6. Result = $(M_{PROD}, E_{PROD})$.
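The multiplication steps can be sketched in the same integer style. This is a hedged illustration, not the slides' implementation: the name `fp_mul` is hypothetical, operands are assumed positive and normalized with a hidden leading 1, and step 4 is assumed to round halves upward (the slides do not name a rounding mode).

```python
def fp_mul(a, b, m_bits=8, bias=8):
    """Multiply two positive normalized numbers given as (M, E) pairs.

    M is the m_bits-wide mantissa field (hidden leading 1 not stored),
    E is the biased exponent. Ties round upward (assumption).
    """
    (m1, e1), (m2, e2) = a, b
    hidden = 1 << m_bits
    prod = (hidden | m1) * (hidden | m2)   # Step 1: 2*m_bits fraction bits
    e_prod = e1 + e2 - bias                # Step 2: remove the doubled bias
    # Step 3: the product of two values in [1, 2) lies in [1, 4).
    shift = m_bits
    if prod >= 2 * hidden * hidden:        # product >= 2: shift right once more
        shift += 1
        e_prod += 1
    # Step 4: round to m_bits fraction bits by adding half an LSB.
    sig = (prod + (1 << (shift - 1))) >> shift
    # Step 5: rounding can carry out to 10.00...0, so normalize again.
    if sig >= 2 * hidden:
        sig >>= 1
        e_prod += 1
    return sig & (hidden - 1), e_prod      # Step 6: drop the hidden bit

m, e = fp_mul((0b10101100, 0b0101), (0b11000000, 0b0110))
print(f"({m:08b}, {e:04b})")  # (01110111, 0100)
```

The second normalization in step 5 is rarely triggered, but it is needed when the rounded mantissa is all ones before the carry.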

  9. Example: Multiplication

     - Multiply (10101100, 0101) and (11000000, 0110). The input numbers are in normalized form with an excess-8 exponent.
     - That is, $(1.10101100 \times 2^{0101-1000}) \times (1.11000000 \times 2^{0110-1000}) = ?$
     - $M_{PROD} = 1.10101100 \times 1.11000000 = 10.1110110100000000$ and $E_{PROD} = E_1 + E_2 - 8 = 0011$.
     - Post-normalize: $M_{PROD} = 1.01110110100000000$ and $E_{PROD} = 0100$.
     - Rounding: $M_{PROD} = 1.01110111$. The discarded part is exactly half of the last kept bit, and this example rounds the tie upward; IEEE 754's default round-to-nearest-even mode would give 1.01110110 instead.
     - Therefore, $(1.10101100 \times 2^{0101-1000}) \times (1.11000000 \times 2^{0110-1000}) = (1.01110111 \times 2^{0100-1000})$.
     - Or, (10101100, 0101) × (11000000, 0110) = (01110111, 0100).
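The product can also be checked in decimal. The conversions below are worked out here rather than taken from the slide; they show that the rounded result differs from the exact product by half a unit in the last mantissa place, scaled by the exponent.

$$
\begin{aligned}
(1.671875 \times 2^{-3}) \times (1.75 \times 2^{-2}) &= 2.92578125 \times 2^{-5} = 1.462890625 \times 2^{-4} \\
1.01110111_2 \times 2^{-4} &= 1.46484375 \times 2^{-4} \\
1.46484375 - 1.462890625 &= 0.001953125 = 2^{-9}
\end{aligned}
$$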

  10. End of Week 9: Additional Module. Thank you.
