Evaluating Hypotheses

[Read Ch. 5]
[Recommended exercises: 5.2, 5.3, 5.4]

• Sample error, true error
• Confidence intervals for observed hypothesis error
• Estimators
• Binomial distribution, Normal distribution, Central Limit Theorem
• Paired t tests
• Comparing learning methods

(Lecture slides for the textbook Machine Learning, T. Mitchell, McGraw Hill, 1997.)
Two Definitions of Error

The true error of hypothesis h with respect to target function f and distribution D is the probability that h will misclassify an instance drawn at random according to D:

    error_D(h) \equiv \Pr_{x \in D}[f(x) \neq h(x)]

The sample error of h with respect to target function f and data sample S is the proportion of examples h misclassifies:

    error_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta(f(x) \neq h(x))

where \delta(f(x) \neq h(x)) is 1 if f(x) \neq h(x), and 0 otherwise.

How well does error_S(h) estimate error_D(h)?
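A minimal Python sketch of the sample error computation, assuming f and h are callables returning labels and S is a list of instances (all names here are illustrative, not from the slides):

```python
def sample_error(h, f, S):
    """error_S(h): fraction of examples in S that h misclassifies relative to f."""
    return sum(f(x) != h(x) for x in S) / len(S)

# Example: if h and f disagree on 12 of 40 instances, this returns 0.30.
```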
Problems Estimating Error

1. Bias: If S is the training set, error_S(h) is optimistically biased:

       bias \equiv E[error_S(h)] - error_D(h)

   For an unbiased estimate, h and S must be chosen independently.

2. Variance: Even with unbiased S, error_S(h) may still vary from error_D(h).
Example

Hypothesis h misclassifies 12 of the 40 examples in S:

    error_S(h) = \frac{12}{40} = 0.30

What is error_D(h)?
Estimators

Experiment:
1. Choose sample S of size n according to distribution D
2. Measure error_S(h)

error_S(h) is a random variable (i.e., the result of an experiment).

error_S(h) is an unbiased estimator for error_D(h).

Given the observed error_S(h), what can we conclude about error_D(h)?
Confidence Intervals

If
• S contains n examples, drawn independently of h and of each other
• n \geq 30

Then
• With approximately 95% probability, error_D(h) lies in the interval

    error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}
Confidence Intervals

If
• S contains n examples, drawn independently of h and of each other
• n \geq 30

Then
• With approximately N% probability, error_D(h) lies in the interval

    error_S(h) \pm z_N \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}

where

    N%:   50%   68%   80%   90%   95%   98%   99%
    z_N:  0.67  1.00  1.28  1.64  1.96  2.33  2.58
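A minimal Python sketch of this interval computation (the helper name error_confidence_interval and the Z_N lookup table are illustrative, not from the slides). Applied to the earlier example of 12 misclassified out of 40, it gives an approximate answer to "What is error_D(h)?": with 95% confidence, roughly 0.30 ± 0.14.

```python
import math

# Two-sided z values from the table above: confidence level N% -> z_N
Z_N = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def error_confidence_interval(error_s, n, confidence=95):
    """Approximate N% confidence interval for error_D(h), given error_S(h) on n examples."""
    z = Z_N[confidence]
    margin = z * math.sqrt(error_s * (1.0 - error_s) / n)
    return error_s - margin, error_s + margin

# Example from the earlier slide: 12 of 40 examples misclassified
low, high = error_confidence_interval(12 / 40, n=40, confidence=95)
print(f"95% interval for error_D(h): ({low:.2f}, {high:.2f})")  # about (0.16, 0.44)
```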
error_S(h) is a Random Variable

Rerun the experiment with a different randomly drawn S (of size n).

Probability of observing r misclassified examples:

    P(r) = \frac{n!}{r!(n - r)!} \, error_D(h)^r (1 - error_D(h))^{n - r}

[Figure: Binomial distribution of P(r) vs. r for n = 40, p = 0.3]
Binomial Probability Distribution

    P(r) = \frac{n!}{r!(n - r)!} \, p^r (1 - p)^{n - r}

Probability P(r) of r heads in n coin flips, if p = \Pr(heads).

• Expected, or mean value of X, E[X], is

    E[X] \equiv \sum_{i=0}^{n} i P(i) = np

• Variance of X is

    Var(X) \equiv E[(X - E[X])^2] = np(1 - p)

• Standard deviation of X, \sigma_X, is

    \sigma_X \equiv \sqrt{E[(X - E[X])^2]} = \sqrt{np(1 - p)}

[Figure: Binomial distribution for n = 40, p = 0.3]
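A short Python sketch of these quantities for the plotted case n = 40, p = 0.3 (the function name is illustrative):

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P(r) = n! / (r!(n-r)!) * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 40, 0.3
print(n * p)                    # E[X] = np = 12
print(sqrt(n * p * (1 - p)))    # sigma_X = sqrt(np(1-p)), about 2.9
print(binomial_pmf(12, n, p))   # probability of the most likely count, about 0.14
```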
Normal Distribution Approximates Binomial

error_S(h) follows a Binomial distribution, with

• mean \mu_{error_S(h)} = error_D(h)

• standard deviation

    \sigma_{error_S(h)} = \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}

Approximate this by a Normal distribution with

• mean \mu_{error_S(h)} = error_D(h)

• standard deviation

    \sigma_{error_S(h)} \approx \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}
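As an illustration of this approximation (not part of the original slides), a sketch comparing Binomial probabilities with the matching Normal density, assuming SciPy is available:

```python
from math import sqrt
from scipy.stats import binom, norm

n, p = 40, 0.3                            # e.g., error_D(h) = 0.3 on samples of size 40
mu, sigma = n * p, sqrt(n * p * (1 - p))  # matching mean and standard deviation
for r in (6, 9, 12, 15, 18):
    # Binomial probability of r errors vs. Normal density at r: the values are close.
    print(r, round(binom.pmf(r, n, p), 4), round(norm.pdf(r, mu, sigma), 4))
```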
Normal Probability Distribution

    p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}

The probability that X will fall into the interval (a, b) is given by

    \int_a^b p(x) \, dx

• Expected, or mean value of X, E[X], is  E[X] = \mu

• Variance of X is  Var(X) = \sigma^2

• Standard deviation of X, \sigma_X, is  \sigma_X = \sigma

[Figure: Normal distribution with mean 0, standard deviation 1]
Normal Probability Distribution

80% of the area (probability) lies in \mu \pm 1.28\sigma

N% of the area (probability) lies in \mu \pm z_N \sigma

    N%:   50%   68%   80%   90%   95%   98%   99%
    z_N:  0.67  1.00  1.28  1.64  1.96  2.33  2.58

[Figure: standard Normal density on the interval (-3, 3)]
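The z_N row of this table is just the two-sided quantile of the standard Normal; a small sketch that reproduces it, assuming SciPy is available:

```python
from scipy.stats import norm

# N% of the standard Normal's mass lies in [-z_N, z_N], so z_N = Phi^{-1}(0.5 + N/200).
for n_pct in (50, 68, 80, 90, 95, 98, 99):
    print(f"{n_pct}%: z_N = {norm.ppf(0.5 + n_pct / 200):.2f}")
```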
Confidence Intervals, More Correctly

If
• S contains n examples, drawn independently of h and of each other
• n \geq 30

Then
• With approximately 95% probability, error_S(h) lies in the interval

    error_D(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}

  Equivalently, error_D(h) lies in the interval

    error_S(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}

  which is approximately

    error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}
Central Limit Theorem

Consider a set of independent, identically distributed random variables Y_1 ... Y_n, all governed by an arbitrary probability distribution with mean \mu and finite variance \sigma^2. Define the sample mean

    \bar{Y} \equiv \frac{1}{n} \sum_{i=1}^{n} Y_i

Central Limit Theorem. As n \to \infty, the distribution governing \bar{Y} approaches a Normal distribution, with mean \mu and variance \frac{\sigma^2}{n}.
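A small simulation sketch of the theorem for Bernoulli Y_i, the case that corresponds to counting misclassifications (the seed and sample sizes are arbitrary illustrative choices):

```python
import random
from statistics import mean, stdev

random.seed(0)
n, trials, p = 40, 10_000, 0.3   # each Y_i ~ Bernoulli(p): mu = 0.3, sigma^2 = p(1-p)

# Each trial computes one sample mean of n draws, i.e., one simulated error_S(h).
sample_means = [sum(random.random() < p for _ in range(n)) / n for _ in range(trials)]

print(mean(sample_means))   # close to mu = 0.3
print(stdev(sample_means))  # close to sqrt(p*(1-p)/n), about 0.072
```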
Calculating Confidence Intervals

1. Pick the parameter p to estimate
   • error_D(h)

2. Choose an estimator
   • error_S(h)

3. Determine the probability distribution that governs the estimator
   • error_S(h) is governed by a Binomial distribution, approximated by a Normal when n \geq 30

4. Find the interval (L, U) such that N% of the probability mass falls in the interval
   • Use the table of z_N values
Difference Between Hypotheses

Test h_1 on sample S_1, test h_2 on S_2.

1. Pick the parameter to estimate

    d \equiv error_D(h_1) - error_D(h_2)

2. Choose an estimator

    \hat{d} \equiv error_{S_1}(h_1) - error_{S_2}(h_2)

3. Determine the probability distribution that governs the estimator

    \sigma_{\hat{d}} \approx \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}

4. Find the interval (L, U) such that N% of the probability mass falls in the interval

    \hat{d} \pm z_N \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}
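A minimal Python sketch of steps 2-4 (the helper name and the example error rates and sample sizes are hypothetical):

```python
import math

Z_N = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def difference_interval(err1, n1, err2, n2, confidence=95):
    """Approximate N% confidence interval for d = error_D(h1) - error_D(h2)."""
    d_hat = err1 - err2
    sigma = math.sqrt(err1 * (1 - err1) / n1 + err2 * (1 - err2) / n2)
    margin = Z_N[confidence] * sigma
    return d_hat - margin, d_hat + margin

# Hypothetical example: h1 errs on 30% of 100 test examples, h2 on 20% of another 100.
print(difference_interval(0.30, 100, 0.20, 100))  # roughly (-0.02, 0.22)
```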