Evaluating Hypotheses [Read Ch. 5] [Recommended exercises: 5.2, 5.3, 5.4]

Lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997.


  1. Evaluating Hypotheses [Read Ch. 5] [Recommended exercises: 5.2, 5.3, 5.4]
     - Sample error, true error
     - Confidence intervals for observed hypothesis error
     - Estimators
     - Binomial distribution, Normal distribution, Central Limit Theorem
     - Paired t tests
     - Comparing learning methods

  2. Two Definitions of Error
     The true error of hypothesis h with respect to target function f and
     distribution D is the probability that h will misclassify an instance
     drawn at random according to D:
       $error_D(h) \equiv \Pr_{x \in D}[f(x) \neq h(x)]$
     The sample error of h with respect to target function f and data sample S
     is the proportion of examples h misclassifies:
       $error_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta(f(x) \neq h(x))$
     where $\delta(f(x) \neq h(x))$ is 1 if $f(x) \neq h(x)$, and 0 otherwise.
     How well does $error_S(h)$ estimate $error_D(h)$?
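A minimal sketch of the sample-error computation in Python; the hypothesis h, target f, and sample S below are invented stand-ins for illustration, not anything from the slides.

```python
def sample_error(h, f, S):
    """Fraction of examples in S that h misclassifies:
    error_S(h) = (1/n) * sum over x in S of delta(f(x) != h(x))."""
    return sum(1 for x in S if h(x) != f(x)) / len(S)

# Toy illustration: f labels x >= 5 as positive, h uses a different
# threshold, so the two disagree on x = 5 and x = 6.
f = lambda x: x >= 5
h = lambda x: x >= 7
S = list(range(10))
print(sample_error(h, f, S))   # 0.2  (h misclassifies 2 of the 10 examples)
```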

  3. Problems Estimating Error
     1. Bias: If S is the training set, $error_S(h)$ is optimistically biased:
          $bias \equiv E[error_S(h)] - error_D(h)$
        For an unbiased estimate, h and S must be chosen independently.
     2. Variance: Even with unbiased S, $error_S(h)$ may still vary from
        $error_D(h)$.

  4. Example
     Hypothesis h misclassifies 12 of the 40 examples in S:
       $error_S(h) = \frac{12}{40} = .30$
     What is $error_D(h)$?

  5. Estimators
     Experiment:
     1. Choose sample S of size n according to distribution D
     2. Measure $error_S(h)$
     $error_S(h)$ is a random variable (i.e., the result of an experiment).
     $error_S(h)$ is an unbiased estimator for $error_D(h)$.
     Given observed $error_S(h)$, what can we conclude about $error_D(h)$?

  6. Confidence Intervals
     If
     - S contains n examples, drawn independently of h and of each other
     - $n \geq 30$
     Then
     - With approximately 95% probability, $error_D(h)$ lies in the interval
         $error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
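Plugging in the numbers from the slide-4 example (error_S(h) = 12/40 = 0.30 on n = 40 examples) as a quick check; this worked computation is mine, not part of the slides.

```python
import math

error_s = 12 / 40   # observed sample error from the earlier example
n = 40
half_width = 1.96 * math.sqrt(error_s * (1 - error_s) / n)
print(f"95% interval: {error_s:.2f} +/- {half_width:.3f}")
# -> 95% interval: 0.30 +/- 0.142, i.e. roughly (0.16, 0.44)
```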

  7. Confidence Intervals
     If
     - S contains n examples, drawn independently of h and of each other
     - $n \geq 30$
     Then
     - With approximately N% probability, $error_D(h)$ lies in the interval
         $error_S(h) \pm z_N \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
       where
         N%:    50%   68%   80%   90%   95%   98%   99%
         z_N:   0.67  1.00  1.28  1.64  1.96  2.33  2.58
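A sketch of the same interval as a reusable function, with the z_N table from this slide hard-coded; the function name and structure are my own, not from the text.

```python
import math

# z_N values from the slide, keyed by confidence level N%
Z_N = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def confidence_interval(error_s, n, confidence=95):
    """Approximate N% confidence interval for error_D(h), given the
    observed error_S(h) on n >= 30 independently drawn examples."""
    z = Z_N[confidence]
    half_width = z * math.sqrt(error_s * (1 - error_s) / n)
    return error_s - half_width, error_s + half_width

print(confidence_interval(0.30, 40, confidence=90))   # ~ (0.18, 0.42)
```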

  8. $error_S(h)$ is a Random Variable
     Rerun the experiment with a different randomly drawn S (of size n).
     Probability of observing r misclassified examples:
       $P(r) = \frac{n!}{r!(n-r)!} error_D(h)^r (1 - error_D(h))^{n-r}$
     [Figure: Binomial distribution for n = 40, p = 0.3]

  9. Binomial Probability Distribution
       $P(r) = \frac{n!}{r!(n-r)!} p^r (1-p)^{n-r}$
     Probability P(r) of r heads in n coin flips, if $p = \Pr(heads)$.
     [Figure: Binomial distribution for n = 40, p = 0.3]
     - Expected, or mean value of X:
         $E[X] \equiv \sum_{i=0}^{n} i P(i) = np$
     - Variance of X:
         $Var(X) \equiv E[(X - E[X])^2] = np(1-p)$
     - Standard deviation of X:
         $\sigma_X \equiv \sqrt{E[(X - E[X])^2]} = \sqrt{np(1-p)}$
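A small sketch of these Binomial quantities in Python, using math.comb for the binomial coefficient; the variable names are mine.

```python
import math

def binomial_pmf(r, n, p):
    """P(r) = n! / (r! (n-r)!) * p^r * (1-p)^(n-r)"""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 40, 0.3
mean = n * p                    # E[X] = np
variance = n * p * (1 - p)      # Var(X) = np(1-p)
std_dev = math.sqrt(variance)   # sigma_X = sqrt(np(1-p))

print(binomial_pmf(12, n, p))   # ~0.137, the peak of the n=40, p=0.3 pmf
print(mean, variance, std_dev)  # 12.0  8.4  ~2.90
```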

  10. Normal Distribution Approximates Binomial
      $error_S(h)$ follows a Binomial distribution, with
      - mean $\mu_{error_S(h)} = error_D(h)$
      - standard deviation
          $\sigma_{error_S(h)} = \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$
      Approximate this by a Normal distribution with
      - mean $\mu_{error_S(h)} = error_D(h)$
      - standard deviation
          $\sigma_{error_S(h)} \approx \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
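To make the substitution in the last line concrete: the exact standard deviation depends on the unknown error_D(h), so in practice it is approximated with the observed error_S(h). The true error of 0.25 below is hypothetical, chosen only for the comparison.

```python
import math

n = 40
error_d = 0.25   # hypothetical true error (unknown in practice)
error_s = 0.30   # observed sample error

sigma_exact  = math.sqrt(error_d * (1 - error_d) / n)   # uses error_D(h)
sigma_approx = math.sqrt(error_s * (1 - error_s) / n)   # substitutes error_S(h)
print(sigma_exact, sigma_approx)   # ~0.068 vs ~0.072
```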

  11. Normal Probability Distribution
        $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
      The probability that X will fall into the interval (a, b) is given by
        $\int_a^b p(x)\, dx$
      [Figure: Normal distribution with mean 0, standard deviation 1]
      - Expected, or mean value of X: $E[X] = \mu$
      - Variance of X: $Var(X) = \sigma^2$
      - Standard deviation of X: $\sigma_X = \sigma$

  12. Normal Probability Distribution
      80% of the area (probability) lies in $\mu \pm 1.28\sigma$
      N% of the area (probability) lies in $\mu \pm z_N \sigma$
        N%:    50%   68%   80%   90%   95%   98%   99%
        z_N:   0.67  1.00  1.28  1.64  1.96  2.33  2.58
      [Figure: Normal distribution with mean 0, standard deviation 1]
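The z_N table can be checked against the standard Normal: the probability that a standard Normal variable falls within ±z of its mean is erf(z/√2). A quick sketch using only the Python standard library:

```python
import math

def area_within(z):
    """P(|Z| <= z) for a standard Normal Z, via the error function."""
    return math.erf(z / math.sqrt(2))

for n_pct, z in [(50, 0.67), (68, 1.00), (80, 1.28), (90, 1.64),
                 (95, 1.96), (98, 2.33), (99, 2.58)]:
    print(f"z = {z:<4}  area = {area_within(z):.3f}  (table: {n_pct}%)")
```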

  13. Confidence Intervals, More Correctly
      If
      - S contains n examples, drawn independently of h and of each other
      - $n \geq 30$
      Then
      - With approximately 95% probability, $error_S(h)$ lies in the interval
          $error_D(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$
        equivalently, $error_D(h)$ lies in the interval
          $error_S(h) \pm 1.96 \sqrt{\frac{error_D(h)(1 - error_D(h))}{n}}$
        which is approximately
          $error_S(h) \pm 1.96 \sqrt{\frac{error_S(h)(1 - error_S(h))}{n}}$
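A simulation sketch of the coverage claim: draw many samples of size n from a hypothetical true error rate, form the approximate 95% interval from each observed error_S(h), and count how often the true error is captured. The true error value and the code are mine, for illustration only.

```python
import math, random

random.seed(0)
n, error_d, trials = 40, 0.25, 10_000   # error_d: hypothetical true error
covered = 0
for _ in range(trials):
    # each of the n examples is misclassified with probability error_d
    r = sum(random.random() < error_d for _ in range(n))
    error_s = r / n
    half = 1.96 * math.sqrt(error_s * (1 - error_s) / n)
    covered += (error_s - half <= error_d <= error_s + half)
print(covered / trials)   # roughly 0.94: close to the nominal 95%
```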

  14. Central Limit Theorem
      Consider a set of independent, identically distributed random variables
      $Y_1 \ldots Y_n$, all governed by an arbitrary probability distribution
      with mean $\mu$ and finite variance $\sigma^2$. Define the sample mean
        $\bar{Y} \equiv \frac{1}{n} \sum_{i=1}^{n} Y_i$
      Central Limit Theorem. As $n \to \infty$, the distribution governing
      $\bar{Y}$ approaches a Normal distribution, with mean $\mu$ and variance
      $\frac{\sigma^2}{n}$.
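A quick simulation of the theorem's claim, with Uniform(0, 1) standing in for the arbitrary distribution (mean 1/2, variance 1/12); any distribution with finite variance would do.

```python
import random, statistics

random.seed(1)
n, trials = 50, 5_000
# sample mean of n Uniform(0, 1) draws, repeated many times
sample_means = [statistics.fmean(random.random() for _ in range(n))
                for _ in range(trials)]

# a histogram of sample_means would look approximately Normal
print(statistics.fmean(sample_means))     # ~0.5      (the mean mu)
print(statistics.variance(sample_means))  # ~0.00167  (sigma^2 / n = 1/12/50)
```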

  15. Calculating Confidence Intervals
      1. Pick the parameter p to estimate
         - $error_D(h)$
      2. Choose an estimator
         - $error_S(h)$
      3. Determine the probability distribution that governs the estimator
         - $error_S(h)$ is governed by the Binomial distribution, approximated
           by the Normal when $n \geq 30$
      4. Find the interval (L, U) such that N% of the probability mass falls
         in the interval
         - Use the table of $z_N$ values
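The four steps, mapped onto a short script for the running example (observed error 0.30 on 40 test examples, 90% confidence); the numbers and names are illustrative.

```python
import math

# Step 1: the parameter to estimate is error_D(h)  (unknown)
# Step 2: the estimator is error_S(h), measured on the test sample
error_s, n = 0.30, 40
# Step 3: error_S(h) ~ Binomial, approximated as Normal since n >= 30
sigma = math.sqrt(error_s * (1 - error_s) / n)
# Step 4: interval holding N% of the probability mass, via the z_N table
z_90 = 1.64
print(error_s - z_90 * sigma, error_s + z_90 * sigma)   # ~ (0.18, 0.42)
```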

  16. Difference Between Hypotheses
      Test $h_1$ on sample $S_1$, test $h_2$ on $S_2$.
      1. Pick the parameter to estimate
           $d \equiv error_D(h_1) - error_D(h_2)$
      2. Choose an estimator
           $\hat{d} \equiv error_{S_1}(h_1) - error_{S_2}(h_2)$
      3. Determine the probability distribution that governs the estimator
           $\sigma_{\hat{d}} \approx \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$
      4. Find the interval (L, U) such that N% of the probability mass falls
         in the interval:
           $\hat{d} \pm z_N \sqrt{\frac{error_{S_1}(h_1)(1 - error_{S_1}(h_1))}{n_1} + \frac{error_{S_2}(h_2)(1 - error_{S_2}(h_2))}{n_2}}$
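A sketch of this two-hypothesis comparison in Python; all error rates and sample sizes below are invented for illustration.

```python
import math

def difference_interval(e1, n1, e2, n2, z=1.96):
    """Approximate confidence interval for d = error_D(h1) - error_D(h2),
    given errors e1, e2 measured on independent samples of size n1, n2."""
    d_hat = e1 - e2
    sigma = math.sqrt(e1 * (1 - e1) / n1 + e2 * (1 - e2) / n2)
    return d_hat - z * sigma, d_hat + z * sigma

# e.g. h1 errs on 30% of 100 test examples, h2 on 20% of 150
print(difference_interval(0.30, 100, 0.20, 150))   # ~ (-0.01, 0.21)
```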
