Decision theory Dr. Jarad Niemi STAT 544 - Iowa State University

Decision theory Dr. Jarad Niemi STAT 544 - Iowa State University March 7, 2017 Jarad Niemi (STAT544@ISU) Decision theory March 7, 2017 1 / 13

Bayesian statistician

Definition: A Bayesian statistician is an individual who makes decisions based on the probability distribution of those things we don't know conditional on what we know, i.e. $p(\theta \mid y, K)$.

Bayesian decision theory

Suppose we have an unknown quantity $\theta$, which we believe follows a probability distribution $p(\theta)$, and a decision (or action) $\delta$. For each decision, we have a loss function $L(\theta, \delta)$ that describes how much we lose if $\theta$ is the truth. The expected loss is taken with respect to $\theta \sim p(\theta)$, i.e.

$$E_\theta[L(\theta, \delta)] = \int L(\theta, \delta)\, p(\theta)\, d\theta = f(\delta).$$

The optimal Bayesian decision is to choose the $\delta$ that minimizes the expected loss, i.e.

$$\delta_{opt} = \operatorname{argmin}_\delta E[L(\theta, \delta)] = \operatorname{argmin}_\delta f(\delta).$$

Economists typically maximize expected utility, where utility is the negative of loss, i.e. $U(\theta, \delta) = -L(\theta, \delta)$.

If we have data, just replace the prior $p(\theta)$ with the posterior $p(\theta \mid y)$.
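A minimal sketch of this recipe in R. The standard normal belief for $\theta$, the quadratic loss, and the grids are illustrative assumptions, not taken from the slides. The code approximates $f(\delta)$ by a weighted sum and picks the minimizing decision.

```r
# Illustrative assumptions: theta ~ N(0, 1) approximated on a grid,
# and quadratic loss L(theta, delta) = (theta - delta)^2.
theta <- seq(-6, 6, by = 0.01)
p     <- dnorm(theta)
p     <- p / sum(p)                      # normalized grid weights

loss <- function(theta, delta) (theta - delta)^2

# f(delta) = E_theta[L(theta, delta)], approximated by a weighted sum
expected_loss <- function(delta) sum(loss(theta, delta) * p)

# Evaluate f on a grid of candidate decisions and take the minimizer
deltas    <- seq(-2, 2, by = 0.1)
f         <- sapply(deltas, expected_loss)
delta_opt <- deltas[which.min(f)]
delta_opt
```

Under these assumptions the minimizer should sit at the mean of the belief distribution, which matches the known result for squared-error loss.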

Depicting loss/utility functions

[Figure: loss (vertical axis, 0 to 4) plotted against theta (horizontal axis, -2 to 2) for three decisions d_1, d_2, and d_3.]

Parameter estimation

Definition: For a given loss function $L(\theta, \hat\theta)$, where $\hat\theta$ is an estimator for $\theta$, the Bayes estimator is the function $\hat\theta$ that minimizes the expected loss, i.e.

$$\hat\theta = \operatorname{argmin}_{\hat\theta} E_{\theta \mid y}\left[ L(\theta, \hat\theta) \mid y \right].$$

Recall that

- $\hat\theta = E[\theta \mid y]$ minimizes the expected loss for $L(\theta, \hat\theta) = (\theta - \hat\theta)^2$,
- $\hat\theta$ satisfying $0.5 = \int_{-\infty}^{\hat\theta} p(\theta \mid y)\, d\theta$ minimizes the expected loss for $L(\theta, \hat\theta) = |\theta - \hat\theta|$, and
- $\hat\theta = \operatorname{argmax}_\theta p(\theta \mid y)$ is found as the minimizer of the sequence of loss functions $L(\theta, \hat\theta) = -I(|\theta - \hat\theta| < \epsilon)$ as $\epsilon \to 0$.
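The first two results can be checked numerically. The sketch below assumes, purely for illustration, that posterior draws come from a Gamma(3, 2) distribution. It minimizes Monte Carlo estimates of the expected squared and absolute losses and compares the minimizers with the sample mean and median.

```r
# Illustrative assumption: posterior draws come from a Gamma(3, 2) distribution.
set.seed(1)
draws <- rgamma(1e4, shape = 3, rate = 2)

# Monte Carlo estimates of the expected loss for a candidate estimate
sq_loss  <- function(est) mean((draws - est)^2)
abs_loss <- function(est) mean(abs(draws - est))

# Minimize each expected loss numerically over the range of the draws
est_sq  <- optimize(sq_loss,  range(draws))$minimum
est_abs <- optimize(abs_loss, range(draws))$minimum

# Compare with the closed-form Bayes estimators
c(est_sq  = est_sq,  posterior_mean   = mean(draws))
c(est_abs = est_abs, posterior_median = median(draws))
```

The numerical minimizers should agree with the mean and median up to the tolerance of `optimize`.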

Choosing a hand: Which hand?

The setup: Randomly put a quarter in one of two hands, placing it in the right hand with probability $p$. Let $\theta \in \{0, 1\}$ indicate that the quarter is in the right hand. You get to choose whether the quarter is in the right hand or not. If you guess the quarter is in the right hand and it is, you get to keep the quarter. Otherwise, you don't get anything.

We have $\theta \sim Ber(p)$ and two actions:

- $a_0$: say the quarter is not in the right hand, and
- $a_1$: say the quarter is in the right hand.

Thus, the utility is

$$U(\theta, a_i) = \begin{cases} \$0.25\,\theta & \text{if } a_1 \\ 0 & \text{if } a_0 \end{cases}$$

and the expected utility is

$$E[U(\theta, a_i)] = \begin{cases} \$0.25\,p & \text{if } a_1 \\ 0 & \text{if } a_0 \end{cases}$$

So, we maximize expected utility by taking $a_1$ if $p > 0$.
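As a small sketch, the comparison of the two actions can be written in R. The function names `expected_utility` and `best_action` are hypothetical helpers introduced only for this illustration.

```r
# Expected utility of each action when theta ~ Ber(p)
expected_utility <- function(p) {
  c(a0 = 0,          # say "not in the right hand": utility is always 0
    a1 = 0.25 * p)   # say "in the right hand": win $0.25 with probability p
}

# Pick the action with the largest expected utility
best_action <- function(p) names(which.max(expected_utility(p)))

best_action(0.1)  # a1 has the higher expected utility for any p > 0
best_action(0)    # tie at p = 0; which.max returns the first maximum, a0
```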

Choosing a hand: How many quarters in the jar?

Suppose a jar is filled up to a pre-specified line. Let $\theta$ be the number of quarters in the jar. Provide a probability distribution for your uncertainty in $\theta$.

Suppose you choose $\theta \sim N(\mu, \sigma^2)$. Since $\theta \in \mathbb{N}^+$, we can provide a formal prior by letting

$$P(\theta = q) \propto N(q; \mu, \sigma^2)\, I(0 < q \le U)$$

for some upper bound $U$.
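A minimal sketch of building this discretized prior in R, using the values $\mu = 160$, $\sigma = 60$, $U = 400$ that appear on the later visualization slide. The variable names `q` and `prior` are chosen here for illustration.

```r
mu <- 160; sigma <- 60; U <- 400   # values used on the later slides

q     <- 1:U                        # support: integers with 0 < q <= U
prior <- dnorm(q, mu, sigma)        # unnormalized N(q; mu, sigma^2)
prior <- prior / sum(prior)         # normalize so the pmf sums to 1

sum(prior)                          # check that the pmf is normalized
q[which.max(prior)]                 # prior mode
```

Normalizing by the sum over $1, \ldots, U$ is what turns the proportionality statement into a proper probability mass function.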

Choosing a hand: Guessing how many quarters are in the jar

Now you are asked to guess how many quarters are in the jar. What should you guess?

Let $q$ be your guess for the number of quarters. Then our utility is

$$U(\theta, q) = q\, I(\theta = q)$$

and our expected utility is

$$E_\theta[U(\theta, q)] = q\, P(\theta = q) \propto q\, N(q; \mu, \sigma^2)\, I(0 \le q \le U).$$

Choosing a hand: Deriving the optimal decision

Here are three approaches for deriving the optimal decision,

$$\operatorname{argmax}_q f(q), \qquad f(q) = q\, N(q; \mu, \sigma^2)\, I(0 \le q \le U):$$

1. Evaluate $f(q)$ for $q \in \{1, 2, \ldots, U\}$ and find which one is the maximum.
2. Treat $q$ as continuous and use a numerical optimization routine.
3. Take the derivative of $f(q)$, set it equal to zero, and solve for $q$.

In all cases, you are better off working with $\log f(q)$: the logarithm is monotonically increasing, so $\log f(q)$ has the same maximizer as $f(q)$.

Choosing a hand: Visualizing the expected log utility

    # p(theta) \propto N(theta;mu,sigma^2)I(1<= theta <= 400)
    mu=160; sigma=60; U=400

[Figure: value (vertical axis, 0 to 0.006) plotted against theta (horizontal axis, 0 to 400) for two functions, expected_utility and probability_mass_function.]

Choosing a hand: Computational approaches

    # Log expected utility, up to an additive constant; mu, sigma, U as defined above
    log_f = Vectorize(function(q, mu, sigma, U) {
      if (q<0 || q>U) return(-Inf)
      return(log(q) + dnorm(q, mu, sigma, log=TRUE))
    })

    # Evaluate all options
    log_expected_utility = log_f(1:U, mu=mu, sigma=sigma, U=U)
    which.max(log_expected_utility) # since we are using integers 1:U
    [1] 180

    # Numerical optimization
    optimize(function(x) log_f(x, mu=mu, sigma=sigma, U=U), c(1,U), maximum=TRUE)
    $maximum
    [1] 180

    $objective
    [1] 0.1241182

Choosing a hand: Derivation

The function to maximize is, up to an additive constant,

$$\log f(q) = \log(q) - \frac{(q - \mu)^2}{2\sigma^2}.$$

The derivative is

$$\frac{d}{dq} \log f(q) = \frac{1}{q} - \frac{q - \mu}{\sigma^2}.$$

Setting this equal to zero and multiplying by $-q\sigma^2$ results in

$$q^2 - \mu q - \sigma^2 = 0.$$

This is a quadratic with roots at

$$\frac{\mu \pm \sqrt{\mu^2 + 4\sigma^2}}{2}.$$

Since $q$ must be positive, the answer is

    (mu+sqrt(mu^2+4*sigma^2))/2
    [1] 180
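A short second-order check, not on the original slide, confirms that the positive root is a maximum rather than a minimum when $q$ is treated as continuous, and works the arithmetic for $\mu = 160$, $\sigma = 60$:

```latex
\frac{d^2}{dq^2} \log f(q) = -\frac{1}{q^2} - \frac{1}{\sigma^2} < 0
  \quad \text{for all } q > 0,
\qquad
q = \frac{160 + \sqrt{160^2 + 4 \cdot 60^2}}{2}
  = \frac{160 + \sqrt{40000}}{2}
  = \frac{160 + 200}{2}
  = 180 .
```

Because the second derivative is negative everywhere on $q > 0$, $\log f$ is strictly concave there, so the stationary point is the unique maximizer on the continuous domain.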

Sequential decisions

Consider a sequence of posterior distributions $p(\theta_t \mid y_{1:t})$ that describe your uncertainty about the current state of the world $\theta_t$ given the data up to the current time, $y_{1:t} = (y_1, \ldots, y_t)$. You also have a loss function for the current time, $L(\theta_t, \delta_t)$.

Now suppose you are allowed to make a decision $\delta_{t+1}$ at each time $t$ and this decision can affect the future states of the world $\theta_s$ for $s > t$. At each time point, we have an optimal Bayes decision, i.e.

$$\operatorname{argmin}_{\delta_{t+1}} \sum_{s=t+1}^{\infty} E_{\theta_s, \delta_s \mid y_{1:t}}\left[ L(\theta_s, \delta_s) \mid y_{1:t} \right].$$

But because your decision can affect future states which, in turn, can affect future decisions, your current decision needs to integrate over future decisions.
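To illustrate why the current decision must account for future decisions, here is a hedged sketch of backward induction on a toy problem. It simplifies the setting above considerably: the state is assumed to be observed directly (no posterior updating), the horizon is finite, and all states, actions, losses, and transition probabilities are made up for illustration.

```r
# Hypothetical two-state, two-action problem (all numbers are made up).
# States: 1 = "good", 2 = "bad".  Actions: 1 = "wait", 2 = "repair".
loss <- matrix(c(0, 1,    # state good: wait costs 0, repair costs 1
                 2, 3),   # state bad:  wait costs 2, repair costs 3
               nrow = 2, byrow = TRUE)

# trans[[a]][s, s2] = P(next state = s2 | current state = s, action = a)
trans <- list(
  matrix(c(0.8, 0.2,
           0.0, 1.0), nrow = 2, byrow = TRUE),  # wait
  matrix(c(1.0, 0.0,
           0.9, 0.1), nrow = 2, byrow = TRUE)   # repair
)

horizon <- 3
value   <- c(0, 0)                               # no loss after the horizon
policy  <- matrix(NA_integer_, nrow = horizon, ncol = 2)

# Work backward from the final time, assuming optimal decisions afterwards
for (t in horizon:1) {
  # total[s, a] = immediate loss + expected optimal future loss
  total <- sapply(1:2, function(a) loss[, a] + as.vector(trans[[a]] %*% value))
  policy[t, ] <- apply(total, 1, which.min)
  value       <- apply(total, 1, min)
}

policy  # rows: time, columns: state; entries: optimal action
value   # minimum expected total loss from time 1, by starting state
```

In this toy example, waiting in the bad state has the smaller immediate loss (2 versus 3), so it is chosen at the final time. At earlier times repair is preferred, because remaining in the bad state accumulates future losses.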
