Floating Point Numbers Prof. Usagi

Recap: CLA (cont.)
• All “G” and “P” are immediately available (only need to look at $A_i$ and $B_i$), but the carries “C” are not (except $C_0$).
$G_i = A_i B_i$
$P_i = A_i \oplus B_i$
$C_1 = G_0 + P_0 C_0$
$C_2 = G_1 + P_1 C_1 = G_1 + P_1 (G_0 + P_0 C_0) = G_1 + P_1 G_0 + P_1 P_0 C_0$
$C_3 = G_2 + P_2 C_2 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$
$C_4 = G_3 + P_3 C_3 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$ (this is $C_{out}$)
[Figure: four full adders (FA) take A3 B3, A2 B2, A1 B1, A0 B0 and produce O3, O2, O1, O0. Each passes its Pi and Gi to the carry-lookahead logic, which takes C0, supplies C1 to C3, and produces Cout.]

Recap: CLA vs. Carry-ripple (Area-Delay Trade-off!)
• Size
  • 32-bit CLA built from 4-bit CLAs: requires 8 of the 4-bit CLAs
    • Each requires 116 for the CLA logic plus 4*(4*6+8) = 128 for the A+B adders: 244 in total
    • 8 * 244 = 1952 transistors
  • 32-bit CRA
    • 1600 transistors (Win!)
• Delay
  • 32-bit CLA with 8 4-bit CLAs
    • 2 gates * 8 = 16 gate delays (Win!)
  • 32-bit CRA
    • 64 gate delays

Recap: Gate delay of 8:1 MUX
• What’s the estimated gate delay of an 8:1 MUX?
A. 1
B. 2
C. 4
D. 8
E. 16
[Figure: an 8:1 MUX with data inputs A through H, select inputs S0, S1, S2, and one Output.]

Recap: Shift “Right”
• Inputs are A3 A2 A1 A0; zeros are shifted in from the left.
• Example: if S = 11 then Y3 = 0, Y2 = 0, Y1 = 0, Y0 = A3
• Example: if S = 10 then Y3 = 0, Y2 = 0, Y1 = A3, Y0 = A2
• Example: if S = 01 then Y3 = 0, Y2 = A3, Y1 = A2, Y0 = A1
• The “chain” of multiplexers determines how many bits to shift, based on the value of the selection input (shamt = shift amount).
[Figure: four 4:1 MUXes, each with inputs labeled 11, 10, 01, 00, share the 2-bit shamt and drive Y3, Y2, Y1, Y0.]

Recap: What’s after shift?
• Assume we have a data type that stores an 8-bit unsigned integer (e.g., unsigned char in C). How many of the following C statements and their execution results are correct?

     Statement                  C = ? (claimed)   Actual
I    c = 3;   c = c >> 2;       1                 0
II   c = 255; c = c << 2;       252               252
III  c = 256; c = c >> 2;       64                0
IV   c = 128; c = c << 1;       1                 0

A. 0
B. 1
C. 2
D. 3
E. 4

https://www.reuters.com/article/us-global-oil-cftc-hamm/oil-exec-and-trump-ally-hamm-seeks-us-probe-of-oil-price-crash-idUSKCN2242UO

Outline
• Representing a number with a decimal point
• Floating point numbers
• Floating point hardware

Will the loop end?
• Consider the following two C programs.

X:
    #include <stdio.h>
    int main(int argc, char **argv)
    {
        int i=0;
        while (i >= 0) i++;
        printf("We're done! %d\n", i);
        return 0;
    }

Y:
    #include <stdio.h>
    int main(int argc, char **argv)
    {
        float i=0.0;
        while (i >= 0) i++;
        printf("We're done! %f\n", i);
        return 0;
    }

Please identify the correct statement.
A. X will print “We’re done” and finish, but Y will not.
B. X won’t print “We’re done” and won’t finish, but Y will.
C. Both X and Y will print “We’re done” and finish
D. Neither X nor Y will finish

• To know why, we need to figure out how “float” is handled in hardware!

Let’s revisit the 4-bit binary adding
• 7 + 1 = ?

    carry    1 1 1
             0 1 1 1
           + 0 0 0 1
           ---------
             1 0 0 0   = -8

• The leftmost bit is the sign bit.
• If you add 1 to the largest integer, the result becomes the smallest integer.

Representation of numbers with decimal points

“Floating” vs. “Fixed” point
• We want to express both the “integer” and the “fraction” parts of a rational number
• Fixed point
  • One bit is used for representing positive or negative
  • A fixed number of bits is used for the integer part
  • A fixed number of bits is used for the fraction part
  • Therefore, the decimal point is fixed: it is always between the integer and fraction fields
  • Layout: [ +/- | Integer . Fraction ]
• Floating point
  • One bit is used for representing positive or negative
  • A fixed number of bits is used for the exponent
  • A fixed number of bits is used for the fraction
  • Therefore, the decimal point is floating: it can be anywhere in the fraction, depending on the value of the exponent
  • Layout: [ +/- | Exponent | Fraction ]

The advantage of floating/fixed point
• Regarding the pros of floating point and fixed point expressions, please identify the correct statement
A. Fixed point can express a wider range of numbers than floating point, but the hardware design is more complex
B. Floating point can express a wider range of numbers than fixed point, but the hardware design is more complex
C. Fixed point can express a wider range of numbers than floating point, and the hardware design is simpler
D. Floating point can express a wider range of numbers than fixed point, and the hardware design is simpler

IEEE 32-bit floating point format

IEEE 754 format
• 32-bit float layout: [ +/- | Exponent (8-bit) | Fraction (23-bit) ]
• Realign the number into $1.F \times 2^{e}$
• Exponent stores e + 127
• Fraction only stores F

IEEE 754 format (cont.)
• Convert the following number: 1 1000 0010 0100 0000 0000 0000 0000 000
A. -1.010 * 2^130
B. -10
C. 10
D. 1.010 * 2^130
E. None of the above

IEEE 754 format: working through the conversion
• 1 1000 0010 0100 0000 0000 0000 0000 000
• Sign bit = 1, so the number is negative
• Exponent field = 1000 0010 = 130, so e = 130 - 127 = 3
• 1.f = 1.01 (binary) = $1 + 0 \times 2^{-1} + 1 \times 2^{-2} = 1.25$
• Value = -1.25 * 2^3 = -10, which is option B

Floating point hardware

Floating point adder

Why: Will the loop end?
• Because floating point hardware handles “sign”, “exponent”, and “mantissa” separately.
• In X, i++ eventually steps past the largest int. On typical two's-complement hardware the value wraps to the smallest (negative) int and the loop exits, although signed overflow is formally undefined behavior in C.
• In Y, once i reaches 2^24 = 16777216 the 23-bit fraction can no longer represent i + 1, so the sum rounds back to i. The sign never changes and the loop does not end.
• So the correct statement is A: X will print “We’re done” and finish, but Y will not.

Comparing float and int
• Comparing 32-bit floating point (float) and 32-bit integer (int), which of the following statements is correct?
A. An int can represent more different numbers than float, but the maximum number a float can express is larger than int
B. A float can represent more different numbers than int, but the maximum number an int can express is larger than float
C. A float can represent more different numbers than int and the maximum number in float is larger than int
D. An int can represent more different numbers than float and the maximum number in int is larger than float
E. None of the above is correct

Maximum and minimum in float
• An exponent field of 1111 1111 is reserved (infinity/NaN), so the largest usable exponent field is 1111 1110
• Largest float: 0 1111 1110 1111 1111 1111 1111 1111 111
  • e = 254 - 127 = 127
  • 1.1111 1111 1111 1111 1111 111 (binary) * 2^127 = 340282346638528859811704183484516925440 = 3.40282346639e+38
• Max in int32 is 2^31-1 = 2147483647
• But this also means that float cannot express all possible numbers between its max and min: loss of precision

Demo: what’s in c?

    #include <stdio.h>
    int main(int argc, char **argv)
    {
        float a, b, c;
        a = 1280.245;
        b = 0.0004;
        c = a + b;
        printf("1280.245 + 0.0004 = %f\n", c);
        return 0;
    }
