ECE232: Hardware Organization and Design
Lecture 9: Floating Point

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

Floating Point
Representation for non-integral numbers
  • Including very small and very large numbers
Like scientific notation
  • -2.34 × 10^56
  • +0.002 × 10^-4
  • +987.02 × 10^9
In binary, normalized: ±1.xxxxxxx_2 × 2^yyyy
Types float and double in C

Floating Point Numbers
The largest 32-bit unsigned integer is
  • 1111 1111 1111 1111 1111 1111 1111 1111_2 = 4,294,967,295
What if we want to encode the approximate age of the earth in years?
  • 4,600,000,000, or 4.6 × 10^9
Or the mass in kg of one a.m.u. (atomic mass unit)?
  • 0.00000000000000000000000000166, or 1.66 × 10^-27
Neither value can be encoded in a 32-bit integer: the first is too large, and the second is a fraction far smaller than 1.

Exponential Notation
The following are equivalent representations of 1,234:
  123,400.0 × 10^-2
   12,340.0 × 10^-1
    1,234.0 × 10^0
      123.4 × 10^1
      12.34 × 10^2
      1.234 × 10^3
     0.1234 × 10^4
    0.01234 × 10^5
The representations differ in that the decimal place (the "point") "floats" to the left or right, with the appropriate adjustment in the exponent.

Parts of a Floating Point Number
Example: -0.9876 × 10^-3
  • Sign of mantissa: -
  • Mantissa: 0.9876 (also called the significand)
  • Location of decimal point: the point in 0.9876
  • Base: 10
  • Sign of exponent: -
  • Exponent: 3

Single Precision Format
32 bits in total
  • S: Sign of mantissa (1 bit)
  • E: Exponent (8 bits)
  • M: Mantissa (23 bits)
Note that the exponent has no explicit sign bit
Base?

Normalization
The mantissa M is a normalized fraction
  • Has an implied binary point on the left
  • Has an implied (hidden) "1" to the left of the binary point
E.g.,
  • Fraction 10100000000000000000000
  • Represents 1.101_2 = 1.625_10
The significand 1.f is in the range [1, 2 - ulp]
  • ulp: unit in the last position
$F = (-1)^S \times 1.f \times 2^{E - \text{Bias}}$
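The hidden-bit rule above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `significand_from_fraction` is made up for this example and takes the fraction field as a string of bits.

```python
def significand_from_fraction(fraction_bits: str) -> float:
    """Return the significand 1.f for a string of fraction bits.

    The hidden 1 to the left of the binary point is restored by adding 1.0.
    """
    f = int(fraction_bits, 2) / (1 << len(fraction_bits))
    return 1.0 + f


# The 23-bit fraction from the slide: 101 followed by twenty zeros
print(significand_from_fraction("10100000000000000000000"))  # 1.625

# All-ones fraction gives the top of the range, 2 - ulp (ulp = 2^-23)
print(significand_from_fraction("1" * 23) == 2 - 2**-23)  # True
```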

IEEE Floating-Point Format
  Field      Single    Double
  S          1 bit     1 bit
  Exponent   8 bits    11 bits
  Fraction   23 bits   52 bits
$x = (-1)^S \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$
S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
Normalize significand: 1.0 ≤ |significand| < 2.0
  • Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
  • Significand is Fraction with the "1." restored
Exponent: excess representation: actual exponent + Bias
  • Ensures exponent is unsigned
  • Single: Bias = 127; Double: Bias = 1023
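As a rough sketch of the single-precision layout, the snippet below splits a value into its three fields and then re-evaluates the formula for normalized numbers. It uses Python's standard `struct` module; the function names `fields_single` and `value_single` are invented for this example.

```python
import struct


def fields_single(x: float):
    """Round x to single precision and return (S, Exponent, Fraction) as integers."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def value_single(s: int, exponent: int, fraction: int, bias: int = 127) -> float:
    """Evaluate (-1)^S x (1 + Fraction) x 2^(Exponent - Bias).

    Valid only for normalized numbers (Exponent between 1 and 254).
    """
    return (-1) ** s * (1 + fraction / 2**23) * 2.0 ** (exponent - bias)


s, e, f = fields_single(6.5)
print(s, e, f)                # 0 129 5242880
print(value_single(s, e, f))  # 6.5
```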

Single-Precision Range
Exponents 00000000 and 11111111 reserved
Smallest value
  • Exponent: 00000001 ⇒ actual exponent = 1 - 127 = -126
  • Fraction: 000…00 ⇒ significand = 1.0
  • ±1.0 × 2^-126 ≈ ±1.2 × 10^-38
Largest value
  • Exponent: 11111110 ⇒ actual exponent = 254 - 127 = +127
  • Fraction: 111…11 ⇒ significand ≈ 2.0
  • ±2.0 × 2^+127 ≈ ±3.4 × 10^+38

Floating-Point Example
Represent -0.75
  • -0.75 = (-1)^1 × 1.1_2 × 2^-1
  • S = 1
  • Fraction = 1000…00_2
  • Exponent = -1 + Bias
    • Single: -1 + 127 = 126 = 01111110_2
    • Double: -1 + 1023 = 1022 = 01111111110_2
Single: 1 01111110 1000…00
Double: 1 01111111110 1000…00
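The two bit patterns for -0.75 can be reproduced with the standard `struct` module. The helpers `bits_single` and `bits_double` below are hypothetical names for this sketch; they print the sign, exponent and fraction fields separated by spaces.

```python
import struct


def bits_single(x: float) -> str:
    """Return the single-precision pattern of x as 'S EEEEEEEE F...F'."""
    (b,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(b, "032b")
    return f"{s[0]} {s[1:9]} {s[9:]}"


def bits_double(x: float) -> str:
    """Return the double-precision pattern of x as 'S EEEEEEEEEEE F...F'."""
    (b,) = struct.unpack(">Q", struct.pack(">d", x))
    s = format(b, "064b")
    return f"{s[0]} {s[1:12]} {s[12:]}"


print(bits_single(-0.75))  # 1 01111110 10000000000000000000000
print(bits_double(-0.75))
```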

Floating-Point Example
What number is represented by the single-precision float 1 10000001 01000…00?
  • S = 1
  • Fraction = 01000…00_2
  • Exponent = 10000001_2 = 129
x = (-1)^1 × (1 + 0.01_2) × 2^(129 - 127)
  = (-1) × 1.25 × 2^2
  = -5.0
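Going the other way, a bit pattern can be reinterpreted as a float. The sketch below assumes the pattern is written out in full (32 bits, spaces allowed); `decode_single` is a made-up helper name.

```python
import struct


def decode_single(pattern: str) -> float:
    """Interpret a 32-character string of 0s and 1s as a single-precision float."""
    bits = int(pattern.replace(" ", ""), 2)
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x


# S = 1, Exponent = 10000001, Fraction = 01 followed by 21 zeros
print(decode_single("1 10000001 01000000000000000000000"))  # -5.0
```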

Floating-Point Addition
Consider a 4-digit decimal example
  • 9.999 × 10^1 + 1.610 × 10^-1
1. Align decimal points
  • Shift number with smaller exponent
  • 9.999 × 10^1 + 0.016 × 10^1
2. Add significands
  • 9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize result & check for over/underflow
  • 1.0015 × 10^2
4. Round and renormalize if necessary
  • 1.002 × 10^2
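The decimal example can be reproduced with Python's `decimal` module, assuming a context with 4 significant digits stands in for the slide's 4-digit format. Note one difference: the module computes the exact sum and rounds once, whereas the slide drops a digit during alignment; for these operands the result is the same.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# Assumption: 4 significant digits with round-half-up models the slide's format
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

a = Decimal("9.999E+1")   # 9.999 x 10^1
b = Decimal("1.610E-1")   # 1.610 x 10^-1

total = ctx.add(a, b)     # exact sum 100.1510, rounded to 4 digits
print(total)              # 100.2, i.e. 1.002 x 10^2
```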

Floating-Point Addition
Now consider a 4-digit binary example
  • 1.000_2 × 2^-1 + -1.110_2 × 2^-2 (0.5 + -0.4375)
1. Align binary points
  • Shift number with smaller exponent
  • 1.000_2 × 2^-1 + -0.111_2 × 2^-1
2. Add significands
  • 1.000_2 × 2^-1 + -0.111_2 × 2^-1 = 0.001_2 × 2^-1
3. Normalize result & check for over/underflow
  • 1.000_2 × 2^-4, with no over/underflow
4. Round and renormalize if necessary
  • 1.000_2 × 2^-4 (no change) = 0.0625

Steps in Addition/Subtraction
Step 1: Calculate the difference d of the two exponents: d = |E1 - E2|
Step 2: Shift the significand of the smaller number by d positions to the right
Step 3: Add the aligned significands and set the exponent of the result to the exponent of the larger operand
Step 4: Normalize the resultant significand and adjust the exponent if necessary
Step 5: Round the resultant significand and adjust the exponent if necessary
Source: I. Koren, Computer Arithmetic Algorithms, 2nd Edition, 2002
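The five steps can be sketched on a toy format with a 4-bit significand, like the binary example in these slides. Everything here is an assumption made for illustration: the `(sign, sig, exp)` tuple encoding, the name `fp_add`, dropping shifted-out bits during alignment (no guard bits), and round-half-up on the bit lost when normalizing to the right. Real hardware keeps extra guard bits and usually rounds to nearest even.

```python
P = 4  # toy significand width in bits, hidden 1 included


def fp_add(a, b):
    """Add two toy floats given as (sign, sig, exp).

    sign is +1 or -1, sig is an integer with 2**(P-1) <= sig < 2**P,
    and the value represented is sign * sig * 2**(exp - (P - 1)).
    """
    (s1, m1, e1), (s2, m2, e2) = a, b
    # Step 1: exponent difference d; keep the larger exponent in operand 1
    if e1 < e2:
        (s1, m1, e1), (s2, m2, e2) = (s2, m2, e2), (s1, m1, e1)
    d = e1 - e2
    # Step 2: shift the smaller operand right by d (shifted-out bits are dropped)
    m2 >>= d
    # Step 3: add aligned significands; the result takes the larger exponent
    total = s1 * m1 + s2 * m2
    if total == 0:
        return (1, 0, 0)
    sign, mag, exp = (1 if total > 0 else -1), abs(total), e1
    # Step 4: normalize to the left if leading bits cancelled
    while mag < (1 << (P - 1)):
        mag <<= 1
        exp -= 1
    # Steps 4 and 5: if the sum is too wide, shift right and round half up
    while mag >= (1 << P):
        mag = (mag + 1) >> 1
        exp += 1
    return (sign, mag, exp)


def to_float(x):
    """Convert a toy (sign, sig, exp) tuple to a Python float."""
    sign, sig, exp = x
    return sign * sig * 2.0 ** (exp - (P - 1))


# 1.000_2 x 2^-1 + (-1.110_2 x 2^-2), i.e. 0.5 + (-0.4375)
result = fp_add((1, 0b1000, -1), (-1, 0b1110, -2))
print(result, to_float(result))  # (1, 8, -4) 0.0625
```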

Example: Single precision
0 10000010 11010000000000000000000
  • Sign: 0 = positive mantissa
  • Exponent: 10000010_2 = 130, so the actual exponent is 130 - 127 = 3
  • Significand: 1.1101_2
+1.1101_2 × 2^3 = 1110.1_2 = 14.5_10

Converting to IEEE format
Example: decimal number -3.154 × 10^0
  • What is the sign?
  • What is the exponent?
  • What is the mantissa?
Converting Mixed Numbers: Decimal to Binary
456.78_10 = 4 × 10^2 + 5 × 10^1 + 6 × 10^0 + 7 × 10^-1 + 8 × 10^-2
1011.11_2 = 1 × 2^3 + 0 × 2^2 + 1 × 2^1 + 1 × 2^0 + 1 × 2^-1 + 1 × 2^-2
          = 8 + 0 + 2 + 1 + 1/2 + 1/4
          = 11 + 0.5 + 0.25 = 11.75_10

How to convert whole Decimal to Binary
Successive division by 2: 57143_10 = 1101111100110111_2
  Value   Remainder
  57143   1
  28571   1
  14285   1
   7142   0
   3571   1
   1785   1
    892   0
    446   0
    223   1
    111   1
     55   1
     27   1
     13   1
      6   0
      3   1
      1   1
Reading the remainders from the last row up to the first gives the binary digits, most significant bit first.
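Successive division by 2 is short enough to write out directly. This is a minimal sketch; `int_to_binary` is a name chosen for this example (Python's built-in `bin` gives the same digits).

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by successive division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # the remainder is the next bit, LSB first
        digits.append(str(remainder))
    return "".join(reversed(digits))  # reverse so the MSB comes first


print(int_to_binary(57143))  # 1101111100110111
```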

Converting fractional Decimal to Binary
Successive multiplication by 2
  Step   Value   Bit
     0   0.154
     1   0.308   0
     2   0.616   0
     3   1.232   1
     4   0.464   0
     5   0.928   0
     6   1.856   1
     7   1.712   1
     8   1.424   1
     9   0.848   0
    10   1.696   1
    11   1.392   1
    12   0.784   0
    13   1.568   1
    14   1.136   1
    15   0.272   0
    16   0.544   0
    17   1.088   1
    18   0.176   0
    19   0.352   0
    20   0.704   0
    21   1.408   1
    22   0.816   0
    23   1.632   1
Decimal 0.154 = .0010 0111 0110 1100 1000 101_2 (first 23 bits)
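The same table can be generated by code. The sketch below uses `fractions.Fraction` so that the decimal input 0.154 is held exactly (a binary float could not hold it); `fraction_to_binary` is a hypothetical helper name.

```python
from fractions import Fraction


def fraction_to_binary(frac: str, nbits: int) -> str:
    """Return the first nbits binary digits of a decimal fraction in [0, 1)."""
    x = Fraction(frac)
    bits = []
    for _ in range(nbits):
        x *= 2
        if x >= 1:          # the integer part is the next bit
            bits.append("1")
            x -= 1
        else:
            bits.append("0")
    return "".join(bits)


print(fraction_to_binary("0.154", 23))  # 00100111011011001000101
```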

Floating Point Special Representations
$F = (-1)^S \times 1.f \times 2^{E - 127}$
There are two Zeroes, ±0, and two Infinities, ±∞
NaN (Not-a-Number) may have a sign and has a non-zero fraction; it is used for program diagnostics
NaNs and Infinities have all 1s in the Exp field, E = 255
F + ∞ = ∞, F / ∞ = 0
Source: I. Koren, Computer Arithmetic Algorithms, 2nd Edition, 2002
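These rules can be observed from Python, with the caveat that Python floats are double precision rather than the single-precision format on the slide; the special-value behavior is the same. The value chosen for `finite` is arbitrary.

```python
import math

inf = float("inf")
nan = float("nan")
finite = 1.5  # stands in for any finite value F

print(finite + inf)               # inf
print(finite / inf)               # 0.0
print(0.0 == -0.0)                # True: the two zeroes compare equal
print(math.copysign(1.0, -0.0))   # -1.0: but the sign bit is still there
print(nan == nan)                 # False: a NaN is not equal to anything
```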

Floating Point Special Representations
$F = (-1)^S \times 1.f \times 2^{E - 127}$, for $1 \le E \le 254$
  Single precision        Double precision        Object represented
  Exponent   Fraction     Exponent   Fraction
  0          0            0          0            0
  0          nonzero      0          nonzero      ± denormalized number
  1-254      anything     1-2046     anything     ± floating point number
  255        0            2047       0            ± infinity
  255        nonzero      2047       nonzero      NaN (not a number)
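The single-precision half of the table maps directly onto a small classifier. This is a sketch operating on the raw 32-bit pattern; `classify_single` and its return strings are names invented for this example.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit single-precision pattern using the table's rules."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"  # exponent in 1..254, any fraction


for pattern in (0x00000000, 0x00400000, 0x3F800000, 0x7F800000, 0x7FA00000):
    print(f"{pattern:#010x}", classify_single(pattern))
```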

Smallest & Largest Numbers
The smallest non-zero positive and largest non-zero negative normalized numbers (represented by 1 in the Exp field and 0…0 in the Fraction field) are
  • ±2^-126 ≈ ±1.175494351 × 10^-38
The smallest non-zero positive and largest non-zero negative denormalized numbers (represented by all 0s in the Exp field and 0…01 in the Fraction field) are
  • ±2^-149 ≈ ±1.4012985 × 10^-45
The largest finite positive and smallest finite negative numbers (represented by 254 in the Exp field and 1…1 in the Fraction field) are
  • ±(2 - 2^-23) × 2^127 ≈ ±3.40 × 10^38
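The three limits can be confirmed by building the bit patterns described above and reinterpreting them as floats. The helper `from_bits` is a made-up name; the hexadecimal constants are just the stated Exp and Fraction fields packed into 32 bits.

```python
import struct


def from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as a single-precision float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x


smallest_normalized = from_bits(0x00800000)    # Exp = 1,   Fraction = 0...0
smallest_denormalized = from_bits(0x00000001)  # Exp = 0,   Fraction = 0...01
largest_finite = from_bits(0x7F7FFFFF)         # Exp = 254, Fraction = 1...1

print(smallest_normalized == 2.0**-126)                # True
print(smallest_denormalized == 2.0**-149)              # True
print(largest_finite == (2 - 2.0**-23) * 2.0**127)     # True
```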

FP Adder Hardware
[Figure: FP adder hardware datapath, with stages labeled Step 1 through Step 4]

Single Precision Summary
  Type                         Exponent    Mantissa                       Value
  Zero                         0000 0000   000 0000 0000 0000 0000 0000   0
  One                          0111 1111   000 0000 0000 0000 0000 0000   1
  Denormalized number          0000 0000   100 0000 0000 0000 0000 0000   5.9 × 10^-39
  Largest normalized number    1111 1110   111 1111 1111 1111 1111 1111   3.4 × 10^38
  Smallest normalized number   0000 0001   000 0000 0000 0000 0000 0000   1.18 × 10^-38
  Infinity                     1111 1111   000 0000 0000 0000 0000 0000   Infinity
  NaN                          1111 1111   010 0000 0000 0000 0000 0000   NaN

Summary
Floating point numbers represent large numbers with fractions
Number formats are different from 2's complement
  • Requires some memorization
Addition requires aligning, adding, and then realigning
Do examples!
  • The best way to learn floating point operations
