Real numbers, e.g. 1/7, stored as float (32 bit) or double (64 bit)
“By default, the x87 processors all use 80-bit double-extended precision internally” (wikipedia.org/wiki/X87)
Algorithms using “real” numbers (Notes, ch. 4)
double x = 0.0;
assert ( x == 0.0 ); // C++
f( )
if ( x == 0.0 ) f( ) << do something smart >>
else << do something stupid >>
Because of odd compiler optimizations and the way real numbers are represented, “something stupid” can actually happen.
double x = 1.0;
assert ( x == 0.0 ); // C++
f( )
if ( x == 0.0 ) f( ) << do something smart >>
else << do something stupid >>
Because of odd compiler optimizations and the way real numbers are represented, “something stupid” can actually happen.
Algorithms using “real” numbers
• Type double contains only finitely many values
double d = 10;
for (int i = 0; i < 10; i++) {
    System.out.println(d);
    d = d * d;
}
Algorithms using “real” numbers
• Type double contains only finitely many values
double d = .10;
for (int i = 0; i < 10; i++) {
    System.out.println(d);
    d = d * d;
}
Algorithms using “real” numbers
• Type double contains only finitely many values
System.out.println( (1.0 / 49) * 49 );
Algorithms using “real” numbers
• Representable numbers
• Rounding/truncation
• IEEE standard and Java
• Summation order
• Newton iteration
• More iteration
• Formula rewriting
Limited Precision
• A computer can handle only a finite subset of the reals directly in hardware
• These numbers make up a floating point number system F(β, s, m, M) characterized by
  – a base $\beta \in \mathbb{N} \setminus \{1\}$
  – a number of digits $s \in \mathbb{N}$
  – a smallest exponent $m \in \mathbb{Z}$
  – a largest exponent $M \in \mathbb{Z}$
• Each floating point number has the form $\pm\, d_1.d_2 d_3 \ldots d_s \cdot \beta^{e}$ with $m \le e \le M$ and $0 \le d_i \le \beta - 1$. The mantissa $d_1.d_2 d_3 \ldots d_s$ represents
  $d_1 + d_2\beta^{-1} + d_3\beta^{-2} + \cdots + d_s\beta^{-s+1}$
  The exponent part is $\beta^{e}$
• The system includes $0 = 0.0\ldots0$
• All other numbers are normalised: $1 \le d_1 \le \beta - 1$
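Limited Precision
• Example: the number system F(2, 3, −2, 1) contains 33 numbers
• In decimal, the smallest positive number is 0.25 = (1 + 0/2 + 0/4) / 4 and the largest is 3.5 = (1 + 1/2 + 1/4) * 2
(Figure: number line of the representable numbers, with 0.25 and 3.5 marked)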
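The count of 33 can be checked by brute force. The following is a minimal sketch, not part of the original slides: the class name `TinyFloatSystem` and the method `enumerate` are hypothetical, and the values are held in Java doubles, which is exact for a system as small as F(2, 3, −2, 1).

```java
import java.util.TreeSet;

public class TinyFloatSystem {
    // Enumerate every value of F(beta, s, m, M): zero plus all normalised
    // numbers +/- d1.d2...ds * beta^e with 1 <= d1 <= beta-1 and m <= e <= M.
    // (Hypothetical helper for illustration; only safe for tiny systems.)
    static TreeSet<Double> enumerate(int beta, int s, int m, int M) {
        TreeSet<Double> values = new TreeSet<>();
        values.add(0.0);
        long lo = (long) Math.pow(beta, s - 1); // smallest normalised mantissa, read as an integer
        long hi = (long) Math.pow(beta, s);     // one past the largest mantissa
        for (int e = m; e <= M; e++) {
            for (long digits = lo; digits < hi; digits++) {
                // digits / beta^(s-1) is the mantissa d1.d2...ds
                double v = digits / Math.pow(beta, s - 1) * Math.pow(beta, e);
                values.add(v);
                values.add(-v);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        TreeSet<Double> f = enumerate(2, 3, -2, 1);
        System.out.println(f.size());      // number of representable values
        System.out.println(f.higher(0.0)); // smallest positive value
        System.out.println(f.last());      // largest value
    }
}
```

There are 4 normalised mantissas (1.00 to 1.11 in binary), 4 exponents and 2 signs, which gives 32 numbers plus zero.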
QUIZ Limited precision
A number x in the number system F(2, 3, −2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal (base 10) system?
1. 1.5
2. 3
3. 12
4. 11
5. I don’t know
Solution: mantissa 110 is the binary number 1.10, i.e. the decimal number 1 + 1/2 + 0/4 = 1.5. Mantissa 110 with exponent 1 is the binary number 11.0, i.e. the decimal number 1*2 + 1 + 0/2 = 3.
Responses (options 1 to 5): 8%, 62%, 14%, 10%, 7%
Rounding/truncation
• Standard demo system is F(10, 4, −99, 99)
• 1.573 and 0.1824 are representable, but the following are not:
  1.573 + 0.1824 = 1.7554
  1.573 − 0.1824 = 1.3906
  1.573 × 0.1824 = 0.2869152
  1.573 / 0.1824 = 8.6239035...
• Exact arithmetic on a finite subset of the reals is not possible
• A strategy for rounding/truncating to a representable number is needed
Rounding/truncation
• Rounding/truncation is defined by a function fl : R → M, where M is the set of machine numbers
• The machine arithmetic operations ⊕, ⊖, ⊗, ⊘ are defined by x ⊕ y = fl(x + y) etc.
• In the demo system F(10, 4, −99, 99), define fl by truncation, e.g.
  1.573 ⊕ 0.1824 = fl(1.573 + 0.1824) = fl(1.7554) = 1.755
  1.573 ⊖ 0.1824 = fl(1.573 − 0.1824) = fl(1.3906) = 1.390
  1.573 ⊗ 0.1824 = fl(1.573 × 0.1824) = fl(0.2869152) = 0.2869
  1.573 ⊘ 0.1824 = fl(1.573 / 0.1824) = fl(8.6239035...) = 8.623
Rounding/truncation
• Algebraic laws invalid:
  (1.418 ⊕ 2937) ⊖ 2936 = 2938 ⊖ 2936 = 2.000
  1.418 ⊕ (2937 ⊖ 2936) = 1.418 ⊕ 1.000 = 2.418
  1.418 ⊗ (2001 ⊖ 2000) = 1.418 ⊗ 1.000 = 1.418
  1.418 ⊗ 2001 ⊖ 1.418 ⊗ 2000 = 2837 ⊖ 2836 = 1.000
• The usual associative and distributive laws are not valid for machine arithmetic. Sometimes:
  (x ⊕ y) ⊖ z ≠ x ⊕ (y ⊖ z)
  x ⊗ (y ⊖ z) ≠ (x ⊗ y) ⊖ (x ⊗ z)
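The failing laws can be reproduced in Java. This is a sketch under stated assumptions rather than the course's own code: the class `TruncatedArithmetic` and its helpers are hypothetical names, fl is modelled with `java.math.BigDecimal` and a `MathContext` of 4 significant digits with `RoundingMode.DOWN` (truncation), and the exponent limits −99 and 99 are ignored.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedArithmetic {
    // fl: keep 4 significant decimal digits and drop the rest (truncation).
    // Assumption: the exponent range of F(10, 4, -99, 99) is not enforced.
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);

    // Machine operations: compute the exact result, then apply fl.
    static BigDecimal add(BigDecimal x, BigDecimal y) { return x.add(y, FL); }
    static BigDecimal sub(BigDecimal x, BigDecimal y) { return x.subtract(y, FL); }
    static BigDecimal mul(BigDecimal x, BigDecimal y) { return x.multiply(y, FL); }
    static BigDecimal div(BigDecimal x, BigDecimal y) { return x.divide(y, FL); }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.418");
        BigDecimal b = new BigDecimal("2937");
        BigDecimal c = new BigDecimal("2936");
        // Associativity: the two groupings give different results.
        System.out.println(sub(add(a, b), c).toPlainString());
        System.out.println(add(a, sub(b, c)).toPlainString());

        BigDecimal p = new BigDecimal("2001");
        BigDecimal q = new BigDecimal("2000");
        // Distributivity: factored and expanded forms give different results.
        System.out.println(mul(a, sub(p, q)).toPlainString());
        System.out.println(sub(mul(a, p), mul(a, q)).toPlainString());
    }
}
```

Trailing zeros are not printed the way the slides write them, so 2.000 may appear as 2; the numeric values are what matter.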
QUIZ Rounding / truncation
What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4, −99, 99) using truncation?
1. 0.9995
2. 0.9996
3. 0.9997
4. 0.9998
5. I don’t know
Solution: (0.9996 ⊕ 0.9998) ⊘ 2 = fl(0.9996 + 0.9998) ⊘ 2 = fl(1.9994) ⊘ 2 = 1.999 ⊘ 2 = fl(1.999 / 2) = fl(0.9995) = 0.9995
Better expression for computing the average: 0.9996 ⊕ (0.9998 ⊖ 0.9996) ⊘ 2 = 0.9997
Responses (options 1 to 5): 77%, 8%, 8%, 5%, 3%
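Both ways of computing the average can be compared in a simulated F(10, 4, −99, 99). A minimal sketch, assuming truncation is modelled by `BigDecimal` with 4 significant digits and `RoundingMode.DOWN`; the class and method names (`TruncatedAverage`, `naive`, `stable`) are hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedAverage {
    // fl by truncation to 4 significant digits (exponent limits ignored).
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);
    static final BigDecimal TWO = BigDecimal.valueOf(2);

    // (a + b) / 2, with fl applied after each operation.
    static BigDecimal naive(BigDecimal a, BigDecimal b) {
        return a.add(b, FL).divide(TWO, FL);
    }

    // a + (b - a) / 2, with fl applied after each operation.
    static BigDecimal stable(BigDecimal a, BigDecimal b) {
        BigDecimal halfGap = b.subtract(a, FL).divide(TWO, FL);
        return a.add(halfGap, FL);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("0.9996");
        BigDecimal b = new BigDecimal("0.9998");
        System.out.println(naive(a, b));  // falls below both inputs
        System.out.println(stable(a, b)); // lies between the inputs
    }
}
```

The naive form loses a digit because the sum 1.9994 needs five digits; the difference 0.0002 in the second form is exactly representable, so nothing is truncated away.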
QUIZ Rounding / truncation
m, s ∈ F(10, 4, −99, 99)
int k = 10;
m = 1;
for (int i = 0; i < k; i++) m = 2*m;
s = 1 + 1/m;
for (int i = 0; i < k; i++) s = s*s;
print s;
It is a fact that $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^{n} = e$. The algorithm computes an approximation to $e$ corresponding to $n = 2^{10} = 1024$. What is the result of execution in the number system F(10, 4, −99, 99) with truncation, where fl($e$) = 2.718?
1. 0
2. 1.000
3. 2.591
4. 2.718
5. I don’t know
Solution:
  m = 2^1 = 2, m = 2^2 = 4, …, m = 2^9 = 512, m = 2^10 = 1024
  s = 1 + 1/m = 1 ⊕ 1 ⊘ 1024 = 1 ⊕ 0.0009765 = 1.000
  s = s * s = 1.000, …, s = s * s = 1.000
Responses (options 1 to 5): 7%, 86%, 2%, 3%, 2%
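The quiz algorithm can be run both in a simulated 4-digit system and in ordinary double arithmetic. This is an illustrative sketch: `TruncatedE`, `approxE` and `approxEDouble` are hypothetical names, and truncation is modelled with `BigDecimal` at 4 significant digits and `RoundingMode.DOWN`, ignoring the exponent limits.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedE {
    // fl by truncation to 4 significant digits (exponent limits ignored).
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);

    // The quiz algorithm with every operation followed by fl.
    static BigDecimal approxE(int k) {
        BigDecimal two = BigDecimal.valueOf(2);
        BigDecimal m = BigDecimal.ONE;
        for (int i = 0; i < k; i++) m = m.multiply(two, FL);      // m = 2^k
        BigDecimal s = BigDecimal.ONE.add(BigDecimal.ONE.divide(m, FL), FL); // s = 1 + 1/m
        for (int i = 0; i < k; i++) s = s.multiply(s, FL);        // s = s^(2^k)
        return s;
    }

    // The same algorithm in IEEE double precision, for comparison.
    static double approxEDouble(int k) {
        double m = 1;
        for (int i = 0; i < k; i++) m = 2 * m;
        double s = 1 + 1 / m;
        for (int i = 0; i < k; i++) s = s * s;
        return s;
    }

    public static void main(String[] args) {
        System.out.println(approxE(10));       // the 1/1024 term is truncated away
        System.out.println(approxEDouble(10)); // close to e, limited by n = 1024
    }
}
```

With four digits the term 1/1024 vanishes in the addition, so repeated squaring never leaves 1; with 53 binary digits the remaining error comes mainly from using the finite n = 1024.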
IEEE standard and Java
• The IEEE standard describes two number systems, approximately
  – Single precision F(2, 24, −126, 127)
  – Double precision F(2, 53, −1022, 1023)
• The IEEE systems have more numbers:
  – the representational gap around zero is closed with extra numbers of the form $0.d_1 d_2 \ldots d_{52} \cdot 2^{-1022}$ (double precision)
  – there are representations for $\pm\infty$ and NaN (= Not a Number)
• The IEEE system has detailed rules for rounding to representable numbers, e.g.
  – overflow: $1/0$ or $2^{1023} \cdot 2$ has result $\infty$
  – underflow: $1/\infty$ or $2^{-1022-52}/2$ has result 0
  – $0/0$ or $\infty - \infty$ has result NaN
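These rules can be observed directly with Java's `double`, which follows the IEEE double precision format. A small sketch; the class `IeeeSpecialValues` and its method names are hypothetical, while `Double.MIN_VALUE`, `Double.MAX_EXPONENT`, `Double.MIN_EXPONENT`, `Float.MAX_EXPONENT` and `Float.MIN_EXPONENT` are standard library constants.

```java
public class IeeeSpecialValues {
    // Overflow: 2^1023 * 2 is too large for a double and becomes Infinity.
    static double overflow() { return Math.pow(2, 1023) * 2; }

    // Underflow: Double.MIN_VALUE is 2^(-1022-52); halving it gives 0.
    static double underflow() { return Double.MIN_VALUE / 2; }

    // Invalid operation: 0/0 has no meaningful value and gives NaN.
    static double invalid() { return 0.0 / 0.0; }

    public static void main(String[] args) {
        double inf = 1.0 / 0.0;
        System.out.println(inf);             // division by zero
        System.out.println(overflow());
        System.out.println(1.0 / inf);       // 1/infinity
        System.out.println(underflow());
        System.out.println(invalid());
        System.out.println(inf - inf);       // infinity minus infinity
        // Exponent ranges of the two formats.
        System.out.println(Float.MIN_EXPONENT + " " + Float.MAX_EXPONENT);
        System.out.println(Double.MIN_EXPONENT + " " + Double.MAX_EXPONENT);
    }
}
```

Note that NaN compares unequal to everything, including itself, so `Double.isNaN` is the way to test for it.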