Computational Learning Theory: Positive and Negative Learnability Results


  1. Computational Learning Theory: Positive and negative learnability results
     Machine Learning
     Slides based on material from Dan Roth, Avrim Blum, Tom Mitchell and others

  2. Computational Learning Theory
     • The Theory of Generalization
     • Probably Approximately Correct (PAC) learning
     • Positive and negative learnability results
     • Agnostic Learning
     • Shattering and the VC dimension

  3. This lecture: Computational Learning Theory
     • The Theory of Generalization
     • Probably Approximately Correct (PAC) learning
     • Positive and negative learnability results
     • Agnostic Learning
     • Shattering and the VC dimension

  4. What can be learned
     General conjunctions are PAC learnable.
     Recall the sample complexity bound for a consistent learner: m > (1/ε)(ln |H| + ln (1/δ))
     • |H| = number of conjunctions over n variables = 3^n, since each variable can appear positively, appear negated, or be absent; so ln |H| = n ln 3
     • Number of examples needed: m > (1/ε)(n ln 3 + ln (1/δ))
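As a side illustration (not part of the original slides), the bound is straightforward to compute; pac_sample_bound and conjunction_sample_bound are hypothetical helper names:

```python
import math

def pac_sample_bound(ln_H: float, epsilon: float, delta: float) -> int:
    """Smallest integer m satisfying m > (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.floor((ln_H + math.log(1.0 / delta)) / epsilon) + 1

def conjunction_sample_bound(n: int, epsilon: float, delta: float) -> int:
    """Specialize to conjunctions over n Boolean variables, where |H| = 3^n."""
    return pac_sample_bound(n * math.log(3.0), epsilon, delta)
```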

  5. What can be learned
     Plugging numbers into m > (1/ε)(n ln 3 + ln (1/δ)):
     • To guarantee a 95% chance (δ = 0.05) of learning a hypothesis with at least 90% accuracy (ε = 0.1) over n = 10 Boolean variables, we need m > (10 ln 3 + ln (1/0.05)) / 0.1, i.e. 140 examples
     • If n = 100, this goes up to 1129 examples (the bound grows linearly with n)
     • Increasing the confidence to 99% (δ = 0.01) costs 1145 examples (the bound grows only logarithmically with 1/δ)
     • These results hold for any consistent learner
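The three figures above check out numerically; here is a standalone snippet (again my own sketch, not from the slides) that reproduces them:

```python
import math

def m_conjunctions(n, epsilon, delta):
    # Smallest m with m > (1/epsilon) * (n*ln(3) + ln(1/delta))
    return math.floor((n * math.log(3) + math.log(1 / delta)) / epsilon) + 1

print(m_conjunctions(10, 0.1, 0.05))    # 140  (n = 10, 95% confidence, 90% accuracy)
print(m_conjunctions(100, 0.1, 0.05))   # 1129 (n = 100: grows linearly with n)
print(m_conjunctions(100, 0.1, 0.01))   # 1145 (99% confidence: grows with ln(1/delta))
```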


  6. What can be learned
     3-CNF: (l₁₁ ∨ l₁₂ ∨ l₁₃) ∧ (l₂₁ ∨ l₂₂ ∨ l₂₃)
     A subset of CNFs: each conjunct has at most three literals (i.e., a variable or its negation).
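To make the definition concrete, here is a minimal sketch (my own addition, not from the slides) using a common encoding similar to the DIMACS convention: a literal is a signed integer (+i for xᵢ, -i for ¬xᵢ), a conjunct is a tuple of at most three literals, and a 3-CNF is a list of conjuncts:

```python
from typing import List, Tuple

Clause = Tuple[int, ...]  # at most three literals; +i means x_i, -i means not x_i

def evaluate_3cnf(formula: List[Clause], assignment: List[bool]) -> bool:
    """A 3-CNF is true iff every clause contains at least one satisfied literal."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 or x2 or not x3) and (not x1 or x2 or x3)
f = [(1, 2, -3), (-1, 2, 3)]
print(evaluate_3cnf(f, [True, False, True]))  # True: x1 satisfies clause 1, x3 clause 2
```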


  7. What can be learned
     What is the sample complexity of 3-CNF? That is, if we had a consistent learner, how many examples would it need to guarantee PAC learnability? We need the size of the hypothesis space: how many 3-CNFs are there?
     • Number of possible conjuncts = ?
     • A 3-CNF is a conjunction over these possible conjuncts, so |H| = number of 3-CNFs = ?
     • log |H| = O(n^3)
     • log |H| is polynomial in n ⇒ the sample complexity is also polynomial in n

  8. What can be learned
     The counts from the previous slide:
     • Number of possible conjuncts = O((2n)^3), since each conjunct chooses at most three of the 2n literals
     • A 3-CNF is a conjunction over these possible conjuncts, each included or not, so |H| = number of 3-CNFs = O(2^((2n)^3))
     • log |H| = O(n^3)
     • log |H| is polynomial in n ⇒ the sample complexity is also polynomial in n
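Turning this count into a concrete (deliberately crude) sample-size estimate, under the assumption that there are at most (2n)^3 distinct conjuncts so that ln |H| ≤ (2n)^3 · ln 2; this sketch and the function m_3cnf are my own illustration, not from the slides:

```python
import math

def m_3cnf(n: int, epsilon: float, delta: float) -> int:
    num_conjuncts = (2 * n) ** 3         # upper bound on distinct 3-literal clauses
    ln_H = num_conjuncts * math.log(2)   # each possible clause is either in or out
    return math.floor((ln_H + math.log(1 / delta)) / epsilon) + 1

print(m_3cnf(10, 0.1, 0.05))  # 55482 examples for n = 10
```

The crudeness of the count does not change the slide's conclusion: log |H| = O(n^3), so the required number of examples grows polynomially in n, though with a much larger constant than for plain conjunctions.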
