
lecture 2 - fixed point - IEEE floating point standard



  1. lecture 2 - fixed point - IEEE floating point standard Wed. January 13, 2016

  2. For those interested in finding out what research is all about, I encourage you to participate in studies such as these.

  3. Fixed point. Fixed point means we have a constant number of bits (or digits) to the left and right of the binary (or decimal) point. Examples: 23953223.49 (base 10; currency uses a fixed number of digits to the right) and 10.1101 (base 2).
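To make the definition concrete, here is a minimal Java sketch that reads an unsigned binary fixed-point string and returns its value. The class and method names (`FixedPointDemo`, `parseBinaryFixed`) are illustrative assumptions, not something from the lecture.

```java
class FixedPointDemo {
    // Interpret an unsigned binary fixed-point string such as "10.1101".
    // The digits are read as one integer, then scaled by 2^(number of fraction bits).
    static double parseBinaryFixed(String s) {
        int point = s.indexOf('.');
        int fractionBits = (point < 0) ? 0 : s.length() - point - 1;
        long raw = Long.parseLong(s.replace(".", ""), 2);
        return raw / (double) (1L << fractionBits);
    }

    public static void main(String[] args) {
        System.out.println(parseBinaryFixed("10.1101"));   // prints 2.8125
        System.out.println(parseBinaryFixed("0110.1000")); // prints 6.5
    }
}
```

The key point is that a fixed-point number is just an integer with an agreed-upon scale factor: here "10.1101" is the integer 101101 (45) divided by 2^4.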

  4. Two's complement for fixed point numbers

     e.g. 0110.1000, which is 6.5 in decimal. How do we represent -6.5 in fixed point?

         0110.1000
       + 1001.0111   <----- invert bits
       + 0000.0001   <----- add .0001
       -----------
         0000.0000

     Thus,

         1001.0111   <----- invert bits
       + 0000.0001   <----- add .0001
       -----------
         1001.1000   <----- answer: -6.5 in (signed) fixed point
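The invert-and-add-one steps above can be sketched in Java for an 8-bit pattern with 4 fraction bits. The 8-bit width, the class name, and the helper names are assumptions chosen to match the slide's example.

```java
class TwosComplementFixedDemo {
    static final int FRACTION_BITS = 4; // bits to the right of the binary point

    // Negate an 8-bit fixed-point pattern: invert the bits, add 1 in the last place.
    static int negate(int raw) {
        return (~raw + 1) & 0xFF;
    }

    // Interpret the 8-bit pattern as a signed (two's complement) fixed-point value.
    static double toDouble(int raw) {
        return ((byte) raw) / (double) (1 << FRACTION_BITS);
    }

    // Format the pattern as "bbbb.bbbb".
    static String toBits(int raw) {
        String s = String.format("%8s", Integer.toBinaryString(raw & 0xFF)).replace(' ', '0');
        return s.substring(0, 4) + "." + s.substring(4);
    }

    public static void main(String[] args) {
        int plus = 0b01101000;     // 0110.1000 = 6.5
        int minus = negate(plus);
        System.out.println(toBits(minus) + " = " + toDouble(minus)); // prints 1001.1000 = -6.5
    }
}
```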

  5. Scientific Notation (floating point). "Normalized": one (nonzero) digit to the left of the decimal point.

  6. Scientific Notation in binary "Normalized" means one "1" bit to the left of the binary point. (Note that 0 cannot be represented this way.)

  7. Three parts: sign, "exponent", "significand" (also called "mantissa"). How to represent this information? How to represent the number 0?

  8. IEEE floating point standard (est. 1985). Case 1: single precision (32 bits = 4 bytes). Fields, from left to right: sign (1 bit), "exponent" (8 bits), "significand" (23 bits).

  9. Let's look at these three parts, and then examples. sign: 0 for positive, 1 for negative. "significand": you don't encode the "1" to the left of the binary point; only encode the first 23 bits to the right of the binary point.

  10. Exponent code (single precision)

      exponent value              exponent code
      reserved (explained soon)   00000000
      -126                        00000001
      -125                        00000010
      -124                        00000011
      :                           :
      0                           01111111
      1                           10000000
      2                           10000001
      :                           :
      127                         11111110
      reserved (explained soon)   11111111

      This is not two's complement!
      unsigned exponent code = exponent value + "bias" (for 8 bits, bias is defined to be 127)
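The bias rule can be checked with a short Java sketch. `encode`, `decode`, and `exponentCodeOf` are hypothetical helper names; `Float.floatToIntBits` is the standard library call used to read the stored 8-bit code out of a real `float`.

```java
class ExponentBiasDemo {
    static final int BIAS = 127; // bias for the 8-bit exponent field

    // unsigned exponent code = exponent value + bias
    static int encode(int exponentValue) {
        return exponentValue + BIAS;
    }

    static int decode(int exponentCode) {
        return exponentCode - BIAS;
    }

    // Read the 8-bit exponent code stored in a float (bits 30..23).
    static int exponentCodeOf(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    public static void main(String[] args) {
        for (int e : new int[] {-126, -125, 0, 1, 127}) {
            String code = String.format("%8s", Integer.toBinaryString(encode(e))).replace(' ', '0');
            System.out.println(e + " -> " + code);
        }
        System.out.println(exponentCodeOf(1.0f)); // prints 127, since 1.0 = 1.0 x 2^0
    }
}
```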

  11. Q: What is the largest positive normalized number ? (single precision) A:

  12. Q: What is the smallest positive normalized number ? (single precision) A:
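The answers to the last two questions are not in this transcript, but they follow from the exponent table: the largest normalized number should pair the largest non-reserved exponent (127) with an all-ones significand, and the smallest should pair the smallest non-reserved exponent (-126) with an all-zeros significand. The sketch below builds both values that way and compares them with Java's constants `Float.MAX_VALUE` and `Float.MIN_NORMAL`. The class and method names are illustrative.

```java
class SingleRangeDemo {
    // 1.111...1 (23 ones) x 2^127: Math.ulp(1.0f) is 2^-23, so 2 - 2^-23 is the all-ones significand.
    static float largestNormalized() {
        return Math.scalb(2.0f - Math.ulp(1.0f), 127);
    }

    // 1.000...0 x 2^-126
    static float smallestNormalized() {
        return Math.scalb(1.0f, -126);
    }

    public static void main(String[] args) {
        System.out.println(largestNormalized() == Float.MAX_VALUE);    // prints true
        System.out.println(smallestNormalized() == Float.MIN_NORMAL);  // prints true
        System.out.println(Integer.toHexString(Float.floatToIntBits(largestNormalized())));  // prints 7f7fffff
        System.out.println(Integer.toHexString(Float.floatToIntBits(smallestNormalized()))); // prints 800000
    }
}
```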

  13. Exponent code 00000000 is reserved for "denormalized" numbers, which include 0.
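A small Java check of the reserved code 00000000. The slide does not give the value formula for denormalized numbers; the sketch assumes the usual IEEE 754 rule (no implicit leading 1, exponent fixed at -126), under which the smallest positive value is 2^-23 x 2^-126 = 2^-149. Names are hypothetical.

```java
class DenormalizedDemo {
    // Read the 8-bit exponent code stored in a float.
    static int exponentCode(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // True when the exponent field is all zeros (zero or a denormalized number).
    static boolean usesReservedZeroCode(float f) {
        return exponentCode(f) == 0;
    }

    public static void main(String[] args) {
        System.out.println(usesReservedZeroCode(0.0f));             // prints true
        System.out.println(usesReservedZeroCode(Float.MIN_VALUE));  // prints true
        System.out.println(usesReservedZeroCode(Float.MIN_NORMAL)); // prints false
        // Smallest positive denormalized float, assuming the standard rule above.
        System.out.println(Float.MIN_VALUE == Math.scalb(1.0f, -149)); // prints true
    }
}
```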

  14. The 23-bit significand divides each power of 2 interval, from 2^e to 2^(e+1), into 2^23 equal parts (same for negative real numbers). Note the power of 2 intervals themselves are equally spaced on a log scale.
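One way to see this spacing in Java: for positive floats, consecutive bit patterns are consecutive representable numbers, so subtracting the bit patterns of two floats counts the steps between them. `floatsBetween` is a hypothetical helper and is only meant for positive, finite inputs.

```java
class SpacingDemo {
    // Number of representable steps from lo up to hi (positive finite floats only).
    static long floatsBetween(float lo, float hi) {
        return (long) Float.floatToIntBits(hi) - Float.floatToIntBits(lo);
    }

    public static void main(String[] args) {
        System.out.println(floatsBetween(1.0f, 2.0f) == (1 << 23)); // prints true
        System.out.println(floatsBetween(2.0f, 4.0f) == (1 << 23)); // prints true
        // The gap between neighbours doubles from one interval to the next.
        System.out.println(Math.ulp(2.0f) == 2 * Math.ulp(1.0f));   // prints true
    }
}
```

So every interval holds the same number of floats, but the gap between neighbouring floats doubles each time the exponent goes up by one.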

  15. Exponent code 11111111 is also reserved.

        if significand is all 0's
          then value is +- infinity (depending on sign bit)
          else value is NaN ("not a number"), e.g. variable is declared but hasn't been assigned a value

      This is the stuff you put on an exam crib sheet. (Yes, you can bring a crib sheet for the quizzes.)
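The rule for exponent code 11111111 can be written as a small classifier over the raw 32 bits. This is a sketch with hypothetical names; the bit patterns in `main` are chosen by hand to hit each case.

```java
class SpecialValuesDemo {
    // Classify a 32-bit single precision pattern using the reserved exponent code 11111111.
    static String classify(int bits) {
        int exponentCode = (bits >>> 23) & 0xFF;
        int significand = bits & 0x7FFFFF;
        if (exponentCode != 0xFF) {
            return "finite";
        }
        if (significand == 0) {
            return (bits >>> 31) == 0 ? "+infinity" : "-infinity";
        }
        return "NaN";
    }

    public static void main(String[] args) {
        System.out.println(classify(0x7f800000)); // prints +infinity
        System.out.println(classify(0xff800000)); // prints -infinity
        System.out.println(classify(0x7fc00000)); // prints NaN
        System.out.println(classify(Float.floatToIntBits(0.0f / 0.0f))); // prints NaN
    }
}
```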

  16. Example: write 8.75 as a single precision float (IEEE). First convert to binary.

  17. (8.75)_10 = (1.00011)_2 x 2^3

      23 bit significand: 00011000000000000000000
      exponent value: e = 3
      exponent code = exponent value (e) + bias
      Thus, exponent code is unsigned 3 + 127. (130)_10 = (10000010)_2

      So, the 32 bit representation is:

        0 10000010 00011000000000000000000

      Regrouped into 4-bit chunks:

        0100 0001 0000 1100 0000 0000 0000 0000 = 0x410c0000
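The worked example can be confirmed in Java with `Float.floatToIntBits`, which returns the 32-bit pattern of a `float`. The `fields` helper is a hypothetical name; it just splits the pattern into sign, exponent code, and significand.

```java
class SingleEncodingDemo {
    // Return "sign exponent significand" for a float, as three groups of bits.
    static String fields(float f) {
        int bits = Float.floatToIntBits(f);
        String s = String.format("%32s", Integer.toBinaryString(bits)).replace(' ', '0');
        return s.substring(0, 1) + " " + s.substring(1, 9) + " " + s.substring(9);
    }

    public static void main(String[] args) {
        int bits = Float.floatToIntBits(8.75f);
        System.out.println(fields(8.75f));
        System.out.println("0x" + Integer.toHexString(bits)); // prints 0x410c0000
        System.out.println((bits >>> 23) & 0xFF);              // prints 130 (exponent code)
    }
}
```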

  18. Recall last lecture: 0.05 cannot be represented exactly.

          float x = 0;
          for (int ct = 0; ct < 20; ct++) {
              x += 1.0 / 20;
              System.out.println(x);
          }

      Output: 0.05 0.1 0.15 0.2 0.25 0.3 0.35000002 0.40000004 0.45000005 0.50000006 etc

  19. Floating Point Addition

        x = 1.00100100010000010100001 * 2^2
        y = 1.10101000000000000101010 * 2^{-3}
        x + y = ?

  20. Floating Point Addition

        x = 1.00100100010000010100001 * 2^2
        y = 1.10101000000000000101010 * 2^{-3}
        x + y = ?

      Rewrite both with the same exponent (shift y right by 5 places):

        x = 1.0010010001000001010000100000 * 2^2
        y =  .0000110101000000000000101010 * 2^2

      but the result x+y has more than 23 bits of significand
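A hedged Java sketch of the same addition: it builds x and y from the slide's bit strings, adds them once in `float` and once in `double`, and shows that the `float` result had to drop bits. `fromParts` is a hypothetical helper. The `double` sum is exact here because the aligned sum needs far fewer than 53 bits.

```java
class FloatAdditionDemo {
    // Build a positive normalized float from an exponent value and a 23-bit significand field.
    static float fromParts(int exponentValue, int significand23) {
        return Float.intBitsToFloat(((exponentValue + 127) << 23) | significand23);
    }

    static final float X = fromParts(2, 0b00100100010000010100001);
    static final float Y = fromParts(-3, 0b10101000000000000101010);

    public static void main(String[] args) {
        double exact = (double) X + (double) Y; // enough bits to hold the full sum
        float rounded = X + Y;                  // must fit back into a 23-bit significand
        System.out.println((double) rounded == exact); // prints false
        System.out.println(exact - rounded);           // the rounding error
    }
}
```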

  21. How many digits (base 10) of precision can we represent with 23 bits (base 2) ?
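The slide's answer is not in this transcript, but a standard back-of-the-envelope estimate converts the bit count to a power of 10:

```latex
2^{23} = 10^{23 \log_{10} 2} \approx 10^{23 \times 0.30103} \approx 10^{6.92}
```

So 23 bits correspond to roughly 7 decimal digits. The same estimate for 52 bits gives 52 x 0.30103, about 15.65, which matches the "about 16 digits" figure quoted later for double precision.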

  22. Case 2: double precision (64 bits = 8 bytes). Fields, from left to right: sign (1 bit), "exponent" (11 bits), "significand" (52 bits).

  23. Exponent code (double precision)

      unsigned exponent code = exponent value + bias
      For 11 bits, bias is defined to be 2^10 - 1 = 1023.

      exponent value   exponent code
      reserved         00000000000
      -1022            00000000001
      -1021            00000000010
      -1020            00000000011
      :                :
      0                01111111111
      1                10000000000
      2                10000000001
      :                :
      1023             11111111110
      reserved         11111111111

  24. Example

        (8.75)_10 = (1.00011)_2 x 2^3

      significand (52 bits) = .0001100000000000000000000000000000....
      exponent = 3, code using 11 bits: 3 + 1023 = 1026 = (10000000010)_2

      double precision float (64 bits):

        0 10000000010 00011000000000000000000000000...

        0x4021800000000000
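The double precision encoding can be checked the same way with `Double.doubleToLongBits`. The class and helper names are illustrative.

```java
class DoubleEncodingDemo {
    // 11-bit exponent code (bits 62..52).
    static int exponentCode(double d) {
        return (int) ((Double.doubleToLongBits(d) >>> 52) & 0x7FF);
    }

    // 52-bit significand field (bits 51..0).
    static long significand(double d) {
        return Double.doubleToLongBits(d) & ((1L << 52) - 1);
    }

    public static void main(String[] args) {
        System.out.println("0x" + Long.toHexString(Double.doubleToLongBits(8.75))); // prints 0x4021800000000000
        System.out.println(exponentCode(8.75));                                      // prints 1026
        System.out.println(Long.toBinaryString(significand(8.75)).length());         // prints 49 (three leading zeros dropped)
    }
}
```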

  25. Q: What is the largest positive normalized number ? (double precision) A:
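The answer is missing from this transcript. By the same reasoning as in single precision, it should be the all-ones 52-bit significand with the largest non-reserved exponent, 1023. The sketch below builds that value and compares it with Java's `Double.MAX_VALUE`. Names are hypothetical.

```java
class DoubleRangeDemo {
    // 1.111...1 (52 ones) x 2^1023: Math.ulp(1.0) is 2^-52.
    static double largestNormalized() {
        return Math.scalb(2.0 - Math.ulp(1.0), 1023);
    }

    public static void main(String[] args) {
        System.out.println(largestNormalized() == Double.MAX_VALUE); // prints true
        System.out.println(largestNormalized());                     // prints 1.7976931348623157E308
    }
}
```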

  26. Approximation Errors (Java/C/...)

          double x = 0;
          for (int ct = 0; ct < 10; ct++) {
              x += 1.0 / 10;
              System.out.println(x);
          }

      Output: 0.1 0.2 0.30000000000000004 0.4 0.5 0.6 0.7 0.7999999999999999 0.8999999999999999 0.9999999999999999

  27. How many digits of precision can we represent with 52 bits ? 52 bits covers about the same "range" as 16 digits. That is why the print out on the previous slide had up to (about) 16 digits to the right of the decimal point.

  28. Announcements
      - public web page (course outline etc.)
      - corequisite courses: COMP 206 (official), COMP 250 (unofficial). It is not recommended to do 250+206+273 together. Rather, 250+206 only, or 206+273 only.
      - assignments: there will be 4 (not 3), Logisim, each should take ~10 hours (still worth a total of 30%)
      - waiting list issues (14 x 12 + 10 = 178 seats in room)
      - quiz 1: may have to sit on stairs and use a book :/ (only 15 min)
