Video 1: Intro to Floating point
(Unsigned) Fixed-point representation
The numbers are stored with a fixed number of bits for the integer part and a fixed number of bits for the fractional part. Suppose we have 8 bits to store a real number, where 5 bits store the integer part and 3 bits store the fractional part. For example, in the bit pattern 10111.011 the bits carry the place values $2^4, 2^3, 2^2, 2^1, 2^0$ to the left of the binary point and $2^{-1}, 2^{-2}, 2^{-3}$ to the right:
$(10111.011)_2 = 2^4 + 2^2 + 2^1 + 2^0 + 2^{-2} + 2^{-3} = (23.375)_{10}$
• Smallest (nonzero) number: $(00000.001)_2 = (0.125)_{10}$
• Largest number: $(11111.111)_2 = (31.875)_{10}$
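As a rough illustration of the 5.3 format above, the sketch below evaluates an unsigned fixed-point bit string by summing place values. The function name `fixed_point_value` is a hypothetical helper, not something from the slides.

```python
def fixed_point_value(bits: str) -> float:
    """Value of an unsigned fixed-point bit string such as '10111.011'."""
    int_part, frac_part = bits.split(".")
    value = 0.0
    # Integer bits: the rightmost bit has weight 2**0, the next 2**1, ...
    for k, bit in enumerate(reversed(int_part)):
        value += int(bit) * 2.0 ** k
    # Fractional bits: the first bit after the point has weight 2**-1, ...
    for k, bit in enumerate(frac_part, start=1):
        value += int(bit) * 2.0 ** (-k)
    return value

example = fixed_point_value("10111.011")
smallest = fixed_point_value("00000.001")
largest = fixed_point_value("11111.111")
print(example, smallest, largest)  # 23.375 0.125 31.875
```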
(Unsigned) Fixed-point representation
Suppose we have 64 bits to store a real number, where 32 bits store the integer part and 32 bits store the fractional part:
$(a_{31} \dots a_2 a_1 a_0 \,.\, b_1 b_2 b_3 \dots b_{32})_2 = \sum_{k=0}^{31} a_k 2^k + \sum_{k=1}^{32} b_k 2^{-k}$
$= a_{31} \times 2^{31} + a_{30} \times 2^{30} + \dots + a_0 \times 2^0 + b_1 \times 2^{-1} + b_2 \times 2^{-2} + \dots + b_{32} \times 2^{-32}$
• Smallest (nonzero) number: $(00\dots0.00\dots01)_2 = 2^{-32} \approx 2.3 \times 10^{-10}$
• Largest number: $(11\dots1.11\dots1)_2 = 2^{32} - 2^{-32} \approx 4.3 \times 10^{9}$
Fixed-point representation
How can we decide where to locate the binary point?
• More bits on the integer part?
• More bits on the fractional part?
(Unsigned) Fixed-point representation
• Range: difference between the largest and smallest numbers possible. More bits for the integer part increase the range.
• Precision: smallest possible difference between any two numbers. More bits for the fractional part increase the precision.
For example, the binary point can be placed as $a_2 a_1 a_0 \,.\, b_1 b_2 b_3$ OR as $a_1 a_0 \,.\, b_1 b_2 b_3 b_4$.
Wherever we put the binary point, there is a trade-off between the amount of range and precision. It can be hard to decide how much you need of each!
Fix: Let the binary point "float".
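To make the trade-off concrete, here is a minimal sketch that tabulates the smallest nonzero value (the precision) and the largest value for several ways of splitting 8 bits. The helper `fixed_point_limits` is a hypothetical name; exact fractions are used so no rounding enters the picture.

```python
from fractions import Fraction

def fixed_point_limits(int_bits: int, frac_bits: int):
    """Return (smallest nonzero value, largest value) of an unsigned fixed-point format."""
    smallest = Fraction(1, 2 ** frac_bits)   # only the last fractional bit is set
    largest = 2 ** int_bits - smallest       # every bit is set
    return smallest, largest

# Split a budget of 8 bits in different ways.
for int_bits in (6, 5, 4, 3):
    frac_bits = 8 - int_bits
    smallest, largest = fixed_point_limits(int_bits, frac_bits)
    print(f"{int_bits}.{frac_bits} format: precision {float(smallest)}, largest {float(largest)}")
```

Moving one bit from the integer part to the fractional part halves the gap between numbers and roughly halves the largest representable value.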
Scientific Notation
In scientific notation, a number can be expressed in the form
$x = \pm r \times 10^m$
where $r$ is a coefficient in the range $1 \le r < 10$ and $m$ is the exponent. E.g.:
$1165.7 = 1.1657 \times 10^{3}$
$0.0004728 = 4.728 \times 10^{-4}$
Note how the decimal point "floats"!
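Python's `e` format specifier produces this form directly, which gives a quick way to check the two examples. The wrapper `to_scientific` is a hypothetical convenience function.

```python
def to_scientific(x: float, digits: int) -> str:
    """Format x as a coefficient in [1, 10) times a power of ten."""
    # 'digits' is the number of digits kept after the decimal point.
    return f"{x:.{digits}e}"

print(to_scientific(1165.7, 4))      # 1.1657e+03
print(to_scientific(0.0004728, 3))   # 4.728e-04
```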
Floating-point numbers
A floating-point number can represent numbers of different orders of magnitude (very large and very small) with the same fixed number of digits. In general, in the binary system, a floating-point number can be expressed as
$x = \pm q \times 2^m$
• $q$ is the significand, normally a fractional value in the range $[1.0, 2.0)$
• $m$ is the exponent, with $m \in [L, U]$
Floating-point numbers
Numerical form:
$x = \pm q \times 2^m = \pm b_0 . b_1 b_2 b_3 \dots b_n \times 2^m$
• $b_0$ is the leading bit and $b_1 b_2 b_3 \dots b_n$ is the fractional part of the significand ($n$ digits), with $b_i \in \{0, 1\}$
• Exponent range: $m \in [L, U]$
• Precision: $p = n + 1$
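One way to see the form $x = \pm q \times 2^m$ on a real machine is to split a Python float into significand and exponent. The sketch below uses the standard library's `math.frexp`, which returns a fraction in $[0.5, 1)$, and rescales it to $[1, 2)$. The function name `significand_exponent` is an assumption made for illustration.

```python
import math

def significand_exponent(x: float):
    """Return (q, m) with x == q * 2**m and 1 <= |q| < 2, for nonzero finite x."""
    if x == 0 or not math.isfinite(x):
        raise ValueError("x must be nonzero and finite")
    frac, exp = math.frexp(x)   # x == frac * 2**exp with 0.5 <= |frac| < 1
    return 2 * frac, exp - 1    # shift the binary point one place

print(significand_exponent(6.0))       # (1.5, 2)
print(significand_exponent(0.15625))   # (1.25, -3)
print(significand_exponent(-6.0))      # (-1.5, 2)
```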
Video 2: Normalized floating point representation
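Converting floating points
Convert $(39.6875)_{10} = (100111.1011)_2$ into floating-point representation:
$(100111.1011)_2 = 1.001111011 \times 2^5$
Shifting the binary point one more place gives the equivalent form $0.1001111011 \times 2^6$.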
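The conversion can be sketched in code as follows: shift the value until it lies in $[1, 2)$ while counting the shifts, then peel off the fractional bits one at a time. This is only an illustrative helper (`to_normalized_binary` is a hypothetical name) and assumes a positive input; it uses exact fractions so the bit extraction is not affected by rounding.

```python
from fractions import Fraction

def to_normalized_binary(x, max_bits: int = 52):
    """Return (significand string '1.bbb...', exponent m) for a positive number x."""
    value = Fraction(x)
    if value <= 0:
        raise ValueError("x must be positive")
    m = 0
    while value >= 2:        # shift the binary point to the left
        value /= 2
        m += 1
    while value < 1:         # shift the binary point to the right
        value *= 2
        m -= 1
    frac = value - 1         # fractional part of the significand
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        if frac >= 1:
            bits.append("1")
            frac -= 1
        else:
            bits.append("0")
    return "1." + ("".join(bits) or "0"), m

print(to_normalized_binary(39.6875))  # ('1.001111011', 5)
```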
Normalized floating-point numbers
Normalized floating-point numbers are expressed as
$x = \pm 1.b_1 b_2 b_3 \dots b_n \times 2^m = \pm 1.f \times 2^m$, with $m \in [L, U]$,
where $f$ is the fractional part of the significand, $m$ is the exponent and $b_i \in \{0, 1\}$.
Hidden bit: the leading bit is always 1, so it does not need to be stored. For example, a significand $b_0 b_1 b_2 b_3 b_4$ has precision $p = 5$, but only the 4 bits $b_1 b_2 b_3 b_4$ are stored.
Normalized floating-point numbers
$x = \pm q \times 2^m = \pm 1.b_1 b_2 b_3 \dots b_n \times 2^m = \pm 1.f \times 2^m$
• Exponent range: $m \in [L, U]$
• Precision: $p = n + 1$
• Smallest positive normalized FP number: $1.00\dots0 \times 2^L = 2^L$
• Largest positive normalized FP number: $1.11\dots1 \times 2^U = 2^{U+1}(1 - 2^{-p})$
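The two limits can be evaluated for any choice of $n$, $L$ and $U$. The sketch below is a direct transcription of the formulas $2^L$ and $2^{U+1}(1 - 2^{-p})$ with $p = n + 1$; the function name `normalized_limits` and the sample parameters are assumptions chosen for illustration.

```python
def normalized_limits(n: int, L: int, U: int):
    """Smallest and largest positive normalized numbers for 1.f x 2^m, m in [L, U]."""
    p = n + 1                                        # precision counts the leading bit
    smallest = 2.0 ** L                              # 1.00...0 x 2^L
    largest = 2.0 ** (U + 1) * (1 - 2.0 ** (-p))     # 1.11...1 x 2^U
    return smallest, largest

print(normalized_limits(n=2, L=-4, U=4))   # (0.0625, 28.0)
print(normalized_limits(n=4, L=-2, U=2))   # (0.25, 7.75)
```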
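Normalized floating point number scale
[Figure: number line from $-\infty$ to $+\infty$ for numbers $\pm 1.f \times 2^m$ with $m \in [L, U]$. Magnitudes beyond the largest normalized number overflow; nonzero magnitudes below $2^L$ underflow, which leaves a gap around 0; there are also gaps between consecutive representable numbers.]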
Floating-point numbers: Simple example
A "toy" number system can be represented as $x = \pm 1.b_1 b_2 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$, so $n = 2$.
For each exponent $m = -4, -3, \dots, 4$ there are four positive numbers. For $m = 0$ they run from $1.00 \times 2^0 = 1$ to $1.11 \times 2^0 = 1.75$.
Floating-point numbers: Simple example
A "toy" number system can be represented as $x = \pm 1.b_1 b_2 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$.
For each exponent $m$, the four significands $1.00_2$, $1.01_2$, $1.10_2$, $1.11_2$ give:
• $m = -4$: 0.0625, 0.078125, 0.09375, 0.109375
• $m = -3$: 0.125, 0.15625, 0.1875, 0.21875
• $m = -2$: 0.25, 0.3125, 0.375, 0.4375
• $m = -1$: 0.5, 0.625, 0.75, 0.875
• $m = 0$: 1, 1.25, 1.5, 1.75
• $m = 1$: 2, 2.5, 3.0, 3.5
• $m = 2$: 4.0, 5.0, 6.0, 7.0
• $m = 3$: 8.0, 10.0, 12.0, 14.0
• $m = 4$: 16.0, 20.0, 24.0, 28.0
The gap between consecutive numbers is constant for a given exponent and doubles with each increase in $m$: for example 0.125 for $m = -1$, 2.0 for $m = 3$ and 4.0 for $m = 4$.
Same steps are performed to obtain the negative numbers. For simplicity, we will show only the positive numbers in this example.
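The list of positive numbers in the toy system can be regenerated with a short loop over exponents and significands. This is a minimal sketch; `toy_numbers` is a hypothetical helper and its default parameters are the toy system's $n = 2$, $L = -4$, $U = 4$.

```python
def toy_numbers(n: int = 2, L: int = -4, U: int = 4):
    """All positive numbers 1.b1...bn x 2^m with m in [L, U], in increasing order."""
    values = []
    for m in range(L, U + 1):
        for k in range(2 ** n):                  # k encodes the fractional bits
            significand = 1 + k / 2 ** n         # 1.00, 1.01, 1.10, 1.11 for n = 2
            values.append(significand * 2.0 ** m)
    return values

numbers = toy_numbers()
print(len(numbers))              # 36
print(numbers[0], numbers[-1])   # 0.0625 28.0
```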
$x = \pm 1.b_1 b_2 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$
• Smallest normalized positive number: $2^L = 2^{-4} = 0.0625$
• Largest normalized positive number: $2^{U+1}(1 - 2^{-p}) = 2^5 (1 - 2^{-3}) = 28$, with $U = 4$ and $p = n + 1 = 3$
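Machine epsilon
• Machine epsilon ($\epsilon_m$) is defined as the distance (gap) between 1 and the next larger floating-point number.
For the toy system $x = \pm 1.b_1 b_2 \times 2^m$ with $m \in [-4, 4]$ and $b_i \in \{0, 1\}$: $1 = 1.00 \times 2^0$ and the next larger number is $1.01 \times 2^0 = 1.25$, so $\epsilon_m = 0.25 = 2^{-2}$.
In general, for $x = \pm 1.f \times 2^m$ with $n$ fractional bits: $1 = 1.00\dots00 \times 2^0$ and the next larger number is $1.00\dots01 \times 2^0$, so $\epsilon_m = 2^{-n} \times 2^0 = 2^{-n}$.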
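As a sanity check on the definition, the sketch below computes the gap for a system with $n$ fractional bits and also measures it experimentally for Python's own `float`. The measurement assumes Python floats are IEEE double precision with $n = 52$ fractional bits, which holds on common platforms but is not stated in the slides; both function names are hypothetical.

```python
import sys

def epsilon_for(n: int) -> float:
    """Gap between 1 and the next larger number with n fractional bits."""
    next_after_one = 1.0 + 2.0 ** (-n)   # 1.00...01 x 2^0
    return next_after_one - 1.0

def measured_epsilon() -> float:
    """Halve eps until adding half of it to 1 no longer changes the result."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:
        eps /= 2
    return eps

print(epsilon_for(2))        # 0.25, the toy system
print(measured_epsilon())    # compare with sys.float_info.epsilon
print(sys.float_info.epsilon)
```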
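Range of integer numbers
Suppose you have the following normalized floating-point representation: $x = \pm 1.b_1 b_2 \times 2^m$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$.
What is the range of integer numbers that you can represent exactly?
Every integer from 1 to 8 is representable, e.g. $(111)_2 = 1.11 \times 2^2 = 7$ and $(1000)_2 = 1.00 \times 2^3 = 8$. The next integer, $(1001)_2 = 9$, would need $1.001 \times 2^3$, which has three fractional bits, so it cannot be stored; the next representable number after 8 is $1.01 \times 2^3 = 10$.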
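The question can also be answered by brute force: enumerate every positive number in the toy system and test which integers appear. The sketch below does that; `toy_values` and `consecutive_integer_limit` are hypothetical helpers written for this example.

```python
def toy_values(n: int = 2, L: int = -4, U: int = 4):
    """Set of all positive numbers 1.b1...bn x 2^m with m in [L, U]."""
    return {(1 + k / 2 ** n) * 2.0 ** m
            for m in range(L, U + 1)
            for k in range(2 ** n)}

def consecutive_integer_limit(values) -> int:
    """Largest N such that every integer 1, 2, ..., N is in `values`."""
    N = 0
    while float(N + 1) in values:
        N += 1
    return N

values = toy_values()
representable = [i for i in range(1, 29) if float(i) in values]
print(representable)                       # [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28]
print(consecutive_integer_limit(values))   # 8
```

Larger integers such as 10 or 28 are still representable, but only the integers up to 8 are covered without gaps.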