

1. ECE 4524 Artificial Intelligence and Engineering Applications
   Lecture 22: Introduction to Learning
   Reading: AIAMA 18.1-18.3
   Today's Schedule:
   ◮ Motivation for Learning
   ◮ Types of Learning
   ◮ Supervised Learning and Hypothesis Spaces
   ◮ Example: Decision Trees

2. Why learning?
   ◮ not all information is known at design time
   ◮ it might be impractical to program all possibilities directly
   ◮ some agents need to be able to adapt over time
   ◮ we might not know how to solve a problem directly by design
   This area in general is referred to as Machine Learning.

3. Learning is a very general concept. It can be applied to all elements of an agent's design, e.g. we might
   ◮ learn functions mapping percepts to internal states
   ◮ learn functions mapping states to actions
   ◮ learn the agent model itself
   ◮ learn probabilities
   ◮ learn utilities of internal states or actions
   Any agent component with a representation, prior knowledge of the representation, and a way to update the representation using feedback can use learning methods.

4. Categorization of Learning
   The most basic distinction in learning is the difference between
   ◮ Deductive Learning
   ◮ Inductive Learning
   Within inductive learning there is
   ◮ unsupervised learning
   ◮ reinforcement learning
   ◮ supervised learning

5. Supervised Learning
   Supervised learning is conceptually very simple, but has many practical and subtle issues.
   ◮ Given a training set consisting of examples
         D = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
     where each example obeys y_i = f(x_i) for some unknown function f(·).
   ◮ Find a function, the hypothesis h(·),
         y = h(x)
     that approximates the true f.

6. The quality of the approximation is measured using the Test Set
         T = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}
   where m < n and T ∩ D = ∅.
   ◮ Collecting training and testing sets is often hard and expensive.
   ◮ An h that performs well on the test set is said to generalize well.
   ◮ An h that performs well on the training set (said to be consistent) but poorly on the test set is said to be over-trained.
   Note the test set is independent of the training set!
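
   A minimal Python sketch of this train/test workflow. The synthetic data, the
   threshold hypothesis space, and the fitting procedure are illustrative
   assumptions, not part of the lecture.

       import random

       random.seed(0)

       def f(x):
           # the unknown target function (here a simple threshold, for illustration)
           return 1 if x > 0.5 else 0

       examples = [(x, f(x)) for x in (random.random() for _ in range(100))]
       D, T = examples[:80], examples[80:]   # disjoint training and test sets

       def fit_threshold(train):
           # pick the threshold t from a tiny hypothesis space H minimizing training error
           def train_error(t):
               return sum((1 if x > t else 0) != y for x, y in train)
           return min((x for x, _ in train), key=train_error)

       t = fit_threshold(D)
       h = lambda x: 1 if x > t else 0   # the learned hypothesis

       test_error = sum(h(x) != y for x, y in T) / len(T)
       print(f"learned t = {t:.3f}, test error rate = {test_error:.2f}")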

7. Some Nomenclature
   ◮ When y is finite with a categorical interpretation, this is a classification problem.
   ◮ If y is binary, it is a binary classification problem.
   ◮ If y is continuous, then it is a regression problem.

8. Hypothesis Space
   In y = h(x), h is a hypothesis in some space of functions H.
   ◮ The goal is to find a consistent h with the smallest testing error and the simplest representation (Ockham's Razor).
   ◮ If we restrict the space H, then it may be that no h can be found which approximates f sufficiently (unrealizable).
   ◮ The complexity/expressiveness of H and the generalization of h ∈ H are related through the bias-variance dilemma, illustrated below.
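
   A sketch of the bias-variance trade-off: as the hypothesis space H grows
   (higher polynomial degree), training error falls while test error typically
   rises once the degree outstrips the data. The synthetic data and the chosen
   degrees are illustrative assumptions.

       import numpy as np

       rng = np.random.default_rng(0)
       x = rng.uniform(-1, 1, 40)
       y = np.sin(3 * x) + rng.normal(0, 0.2, x.size)   # noisy samples of an unknown f
       x_tr, y_tr, x_te, y_te = x[:30], y[:30], x[30:], y[30:]

       for degree in (1, 3, 9, 15):
           coeffs = np.polyfit(x_tr, y_tr, degree)   # least-squares fit within H_degree
           def mse(xs, ys):
               return np.mean((np.polyval(coeffs, xs) - ys) ** 2)
           print(f"degree {degree:2d}: train MSE {mse(x_tr, y_tr):.3f}, test MSE {mse(x_te, y_te):.3f}")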

9. Bayesian analysis gives us a useful framework for supervised learning.
   ◮ Let h ∈ H be parameterized by θ and the training data be given by D; then the posterior of the parameters is
         p(θ | D, h) = p(D | θ, h) p(θ | h) / p(D | h)
   ◮ The posterior of the model is the evidence for h:
         p(h | D) = p(D | h) p(h) / p(D)
     where the denominator integrates over all models in H.

10. Bayesian analysis gives us a useful framework for supervised learning.
    ◮ The maximum likelihood model ignores the prior over models,
          h_ML = argmax_h p(D | h),
      and is the model with the most evidence.
    ◮ The maximum a-posteriori (MAP) model includes the prior over models,
          h_MAP = argmax_h p(h | D) = argmax_h p(D | h) p(h),
      where the denominator p(D) is common to all models and so irrelevant to the model selection.
    We can also average models by choosing the top models rather than a single model. This is particularly useful in binary classification, where the models can simply vote on the final classifier output.
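
    A minimal sketch of maximum-likelihood vs. MAP selection over a discrete
    hypothesis space. The coin-bias models, their priors, and the observed data
    are illustrative assumptions.

        import math

        H = {"fair": 0.5, "biased": 0.7, "very_biased": 0.9}         # h -> P(heads)
        prior = {"fair": 0.80, "biased": 0.15, "very_biased": 0.05}  # p(h)
        D = [1, 1, 0, 1, 1, 1, 0, 1]                                 # observed flips (1 = heads)

        def log_likelihood(h):
            # log p(D | h) for IID Bernoulli flips
            q = H[h]
            return sum(math.log(q if d else 1 - q) for d in D)

        h_ml = max(H, key=log_likelihood)                                     # argmax_h p(D|h)
        h_map = max(H, key=lambda h: log_likelihood(h) + math.log(prior[h]))  # argmax_h p(D|h)p(h)
        print("ML model: ", h_ml)    # "biased": the best fit to the data alone
        print("MAP model:", h_map)   # "fair": the strong prior wins here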

11. Utility of models
    ◮ We assume the true f(x) is stationary and samples are IID.
    ◮ The error rate is the proportion of incorrect classifications.
    ◮ Note the error rate may be misleading, since it makes no distinction about utility differences.
    Example: a binary classifier has 4 cases: TP, FP, TN, FN.
    ◮ The cost of a FP and a FN may not be the same.
    ◮ This is accounted for via a utility/loss function, as sketched below.
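
    A sketch of scoring a binary classifier with an asymmetric loss instead of
    the raw error rate. The cost table and the predictions are illustrative
    assumptions (here a false negative costs ten times a false positive).

        COST = {"TP": 0, "TN": 0, "FP": 1, "FN": 10}   # assumed loss per outcome

        def outcome(pred, truth):
            if pred and truth:
                return "TP"
            if pred and not truth:
                return "FP"
            if truth:
                return "FN"
            return "TN"

        preds  = [1, 0, 1, 1, 0, 0, 1, 0]
        truths = [1, 0, 0, 1, 1, 0, 1, 0]
        outcomes = [outcome(p, t) for p, t in zip(preds, truths)]

        error_rate = sum(o in ("FP", "FN") for o in outcomes) / len(outcomes)
        avg_loss = sum(COST[o] for o in outcomes) / len(outcomes)
        print(f"error rate = {error_rate:.2f}, average loss = {avg_loss:.2f}")
        # the single FN dominates the loss even though the error rate counts it
        # the same as the single FP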

12. Sources of Model Error
    ◮ The estimated h may differ from the true f because
      1. the space H is overly restrictive (unrealizable)
      2. the variance is large (high degrees of freedom)
      3. f itself may be non-deterministic (noisy)
      4. f is "too complex"
    ◮ Most of Machine Learning has been focused on 1 and 2.
    ◮ A large open area in machine learning now is 4, "learning in the large" (e.g. neuroscience, bioinformatics, sociology, networks).

13. An example learning method: Decision Trees
    Consider a simple reflex agent that reasons by testing a series of attribute = value pairs.
    ◮ Let x be a vector of attributes.
    ◮ Let y be a +/− or 0/1 assignment for a Goal (a binary classifier).
    ◮ Given D = {(x_i, y_i)} for i = 1, ..., N, build the tree of decisions formed by testing the attributes of x individually.

14. Implementing the importance function
    The idea is that we want to select the attribute that maximizes our "surprise".
    ◮ The entropy of a R.V. V with values v_k measures its uncertainty, in bits:
          H(V) = −Σ_k p(v_k) log2 p(v_k)
    ◮ For a Boolean R.V. with probability of true = q, the entropy is
          B(q) = −(q log2 q + (1 − q) log2(1 − q))
      where q ≈ p/(p + n) for p positive and n negative samples.
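
    The two entropy formulas above transcribe directly into Python; the example
    counts at the end are an illustrative assumption.

        from math import log2

        def entropy(probs):
            # H(V) = -sum_k p(v_k) log2 p(v_k), in bits; 0 log 0 is taken as 0
            return -sum(p * log2(p) for p in probs if p > 0)

        def B(q):
            # entropy of a Boolean R.V. with P(true) = q
            return entropy([q, 1 - q])

        p, n = 6, 6                # 6 positive and 6 negative examples
        print(B(p / (p + n)))      # 1.0 bit: a 50/50 split is maximally uncertain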

15. Implementing the importance function
    Now suppose we choose attribute A from x.
    ◮ For each of the d possible values of A, we divide the training set into subsets with p_k positive and n_k negative examples.
    ◮ After testing A, the remaining entropy is
          remainder(A) = Σ_{k=1}^{d} (p_k + n_k)/(p + n) · B(p_k/(p_k + n_k))
    ◮ The information gain associated with selecting A is then
          gain(A) = B(p/(p + n)) − remainder(A)
    We choose the attribute with the highest gain in information, as in the sketch below.
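
    A sketch of the gain computation and the greedy attribute choice. Each
    attribute is summarized by its list of per-value (p_k, n_k) counts; the
    attribute names and counts are illustrative assumptions. B is the Boolean
    entropy from the previous sketch, repeated here for self-containment.

        from math import log2

        def B(q):
            # Boolean entropy, as on the previous slide; pure splits give 0 bits
            return 0.0 if q in (0.0, 1.0) else -(q * log2(q) + (1 - q) * log2(1 - q))

        def gain(splits):
            # splits: list of (p_k, n_k) counts, one pair per value of attribute A
            p = sum(pk for pk, _ in splits)
            n = sum(nk for _, nk in splits)
            remainder = sum((pk + nk) / (p + n) * B(pk / (pk + nk)) for pk, nk in splits)
            return B(p / (p + n)) - remainder

        attributes = {
            "Patrons": [(4, 2), (2, 4)],   # mixed subsets: a little information
            "Type":    [(3, 3), (3, 3)],   # uninformative: gain is 0 bits
            "Hungry":  [(6, 0), (0, 6)],   # pure subsets: maximal gain, 1 bit
        }
        best = max(attributes, key=lambda a: gain(attributes[a]))
        print("split on:", best)   # -> Hungry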

16. Next Actions
    ◮ Reading on Learning Theory (AIAMA 18.4-18.5)
    ◮ No warmup.
    Reminders:
    ◮ Quiz 3 will be Thursday 4/12.
    ◮ PS 3 is due tonight.
