MAL - Hypothesis Testing
Rob Franken
Dept. of Information and Computing Sciences, Utrecht University
P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
Web pages: http://www.cs.uu.nl/
7 April 2011
Paper
The plan for this half of the lecture is to prove the main result of the paper by Foster and Young.
Literature
D.P. Foster, H.P. Young (2003): "Learning, hypothesis testing, and Nash equilibrium", Games and Economic Behavior, Elsevier.
Main Theorem
Theorem 1. Suppose that the players adopt hypotheses with finite memory, have $\sigma_i$-smoothed best response functions, employ powerful hypothesis tests with comparable amounts of data, and are flexible in the adoption of new hypotheses. Given any $\epsilon > 0$, if the $\sigma_i$ are small (given $\epsilon$), if the test tolerances $\tau_i$ are sufficiently fine (given $\epsilon$ and $\sigma_i$), and if the amounts of data collected, $s_i$, are sufficiently large (given $\epsilon$, $\sigma_i$ and $\tau_i$), then:
1. The repeated-game strategies are $\epsilon$-equilibria of the repeated game $G^\infty(\vec{u}, X)$ at least $1 - \epsilon$ of the time.
2. All players for whom prediction matters by at least $\epsilon$ are $\epsilon$-good predictors.
Introductory Definitions
$A_i$: the set of responses for player $i$ with memory $m$.
$B_i$: the set of models player $i$ can hold over players $j \neq i$.
$\vec{A}^\sigma$: a function from $B$ to $A$, mapping all players' beliefs to their responses.
$\vec{B}$: a function from $A$ to $B$, mapping all players' current responses to the correct models.
Here $A = \prod_i A_i$ and $B = \prod_i B_i$.
Fixed Points
◮ We can easily think of what a fixed-point model or response is; they even correspond to each other.
◮ $\vec{B}(\vec{A}^\sigma(\vec{b})) \stackrel{?}{=} \vec{b}$
◮ $\vec{A}^\sigma(\vec{B}(\vec{a})) \stackrel{?}{=} \vec{a}$
◮ $|b_i - B_i(\vec{A}^\sigma(\vec{b}))| \stackrel{?}{\le} \tau$
◮ Fixed points are equilibria.
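The fixed-point conditions can be made concrete with a small sketch. Everything specific here is an assumption for illustration and not taken from the slides: a two-player matching-pennies game with payoffs in [0, 1], memory 0 (so a model is just the believed probability that the opponent plays action 0), and a logit rule standing in for the smoothed best response. All function names are hypothetical.

```python
import math

# Hypothetical 2x2 game (matching pennies, payoffs scaled to [0, 1]).
# PAYOFFS[i][own][other] is player i's payoff.
PAYOFFS = (
    ((1.0, 0.0), (0.0, 1.0)),  # player 0 wants to match
    ((0.0, 1.0), (1.0, 0.0)),  # player 1 wants to mismatch
)


def smoothed_response(i, belief, sigma):
    """Probability that player i plays action 0, given the believed
    probability `belief` that the opponent plays action 0 (logit rule,
    an assumed form of the smoothed best response)."""
    u0 = belief * PAYOFFS[i][0][0] + (1 - belief) * PAYOFFS[i][0][1]
    u1 = belief * PAYOFFS[i][1][0] + (1 - belief) * PAYOFFS[i][1][1]
    # Equals 1 / (1 + exp(-(u0 - u1) / sigma)); tanh avoids overflow.
    return 0.5 * (1.0 + math.tanh((u0 - u1) / (2.0 * sigma)))


def responses(b, sigma):
    """A^sigma: map the model vector b to the response vector a."""
    return (smoothed_response(0, b[0], sigma), smoothed_response(1, b[1], sigma))


def correct_models(a):
    """B: with memory 0 and two players, the correct model held by
    player i is simply the other player's response."""
    return (a[1], a[0])


def response_gap(a, sigma):
    """max_i |a_i - A_i^sigma(B_i(a))|; zero exactly at a fixed point."""
    target = responses(correct_models(a), sigma)
    return max(abs(x - y) for x, y in zip(a, target))


def is_good(b, tau, sigma):
    """Check |b_i - B_i(A^sigma(b))| <= tau for every player i."""
    target = correct_models(responses(b, sigma))
    return all(abs(x - y) <= tau for x, y in zip(b, target))
```

In this assumed game the uniform profile (0.5, 0.5) is a fixed point in both senses: `response_gap((0.5, 0.5), 0.1)` is zero and `is_good((0.5, 0.5), 0.01, 0.1)` holds, while a model vector such as (0.9, 0.5) fails the tolerance check.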
Overview of the Proof
◮ Suppose the current $\vec{b}$ is bad for some responsive player $i$.
◮ With high probability he will reject his hypothesis.
◮ Changing to a remote strategy will lead all other players to reject their own model hypotheses.
◮ There is a positive chance that this leads to a single fixed-point model $\vec{b}^*$ for everyone except player $i$, but this will be corrected after he tests another time.
◮ Once the players are in a fixed-point model, the chance of leaving it is small.
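The test-and-replace step in this overview can be sketched as follows. This is only an illustration under stated assumptions: the test compares the observed frequency of action 0 with the hypothesised probability (a simple stand-in for the powerful tests the theorem requires), and a rejected hypothesis is replaced by a uniform draw from the model space [0, 1] (one way to be "flexible"). The names `rejects` and `next_model` are hypothetical.

```python
import random


def rejects(model_prob, observations, tau):
    """True if the observed frequency of action 0 differs from the
    hypothesised probability `model_prob` by more than the tolerance tau."""
    freq = sum(1 for x in observations if x == 0) / len(observations)
    return abs(freq - model_prob) > tau


def next_model(model_prob, observations, tau, rng):
    """Keep the current hypothesis if the test accepts it; otherwise
    draw a fresh hypothesis uniformly from the model space [0, 1]
    (an assumed replacement rule)."""
    if rejects(model_prob, observations, tau):
        return rng.random()
    return model_prob
```

For example, a player who believes the opponent mixes 50/50 but observes action 0 in nine of ten periods rejects at tolerance 0.1, whereas an even split of observations leaves the hypothesis in place.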
Lemma
Lemma. Fix a finite action space $X = \prod_{i=1}^{n} X_i$. Given any $\epsilon > 0$ and any finite memory $m$, there exist functions $\sigma(\epsilon)$, $\tau(\epsilon, \sigma)$, $s(\epsilon, \sigma, \tau)$ such that if these functions bound the parameters with the same names, then at least $1 - \epsilon$ of the time $t$:
1. $|\vec{a}^t - \vec{A}^\sigma(\vec{B}(\vec{a}^t))| \le \epsilon/2$.
2. $|U_i^t(a_i^t, B_i(\vec{a}^t)) - \max_{a'_i} U_i^t(a'_i, B_i(\vec{a}^t))| \le \epsilon$ for all $i$.
3. $|b_i^t - B_i(\vec{A}^\sigma(\vec{b}^t))| \le \epsilon$ for every player $i$ for whom prediction matters by at least $\epsilon$.
Proof of Lemma (1)
◮ Pick one fixed-point pair $\vec{a}^*$ and $\vec{b}^*$.
◮ Choose $\sigma_i \le \epsilon/2$ for all $i$.
◮ It holds that $\exists\, 0 < \delta < \frac{\epsilon}{2n} : (\forall \vec{u}, t, i)(\forall b_i, b'_i)\ |b_i - b'_i| \le \delta \Rightarrow |A_i^{\sigma_i}(b_i) - A_i^{\sigma_i}(b'_i)| \le \frac{\epsilon}{2n}$
◮ Let $d_i > 0$ be the maximum difference between two responses on the entire model space.
◮ If $d_i > \delta$, player $i$ is responsive; otherwise he is not.
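The responsive/unresponsive split can be illustrated with a short sketch. It assumes, purely for illustration, that a model is a single number in [0, 1] and that a response is a scalar, so the largest difference between two responses is the maximum minus the minimum over a grid; `response_spread` and `is_responsive` are hypothetical names.

```python
def response_spread(response, grid_size=101):
    """Approximate d_i: the largest difference between two responses over
    an evenly spaced grid on the model space [0, 1] (scalar responses only)."""
    values = [response(k / (grid_size - 1)) for k in range(grid_size)]
    return max(values) - min(values)


def is_responsive(response, delta, grid_size=101):
    """A player is responsive if d_i > delta, and unresponsive otherwise."""
    return response_spread(response, grid_size) > delta
```

A player whose response ignores the model (for instance `lambda b: 0.3`) has spread 0 and is unresponsive for any positive `delta`, while a player whose response tracks the model (`lambda b: b`) has spread 1 and is responsive.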
Proof of Lemma (2)
Consider these two cases: all players are unresponsive, or at least one player is responsive.
Case 1
◮ All possible responses of a player lie within $\delta < \frac{\epsilon}{2n}$ of each other.
◮ Each player's utility varies by at most $\frac{\epsilon}{2n} < \epsilon$, so they are $\epsilon$-close to optimal.
◮ There are no responsive players.
Proof of Lemma (3)
Take a $\tau < \frac{\delta}{2(n+1)}$. A model vector $\vec{b}$ is good if all $b_i$ are good w.r.t. $\tau$, fairly good if it is good for all responsive $i$, and bad otherwise.
Case 2
The proof consists of two claims:
1. If the model vector $\vec{b}^t$ is (at least) fairly good at least $1 - \epsilon$ of the time, then the three statements of the lemma follow.
2. The model vector $\vec{b}^t$ is fairly good at least $1 - \epsilon$ of the time.
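To get a feeling for the sizes involved, here is a worked instance of the two bounds on $\delta$ and $\tau$. The numbers are chosen only for illustration, and it is assumed that the chosen $\delta$ also satisfies the continuity condition from part (1) of the proof.

```latex
\[
n = 2,\quad \epsilon = 0.1:
\qquad \delta < \frac{\epsilon}{2n} = \frac{0.1}{4} = 0.025 .
\]
\[
\text{Taking } \delta = 0.02:
\qquad \tau < \frac{\delta}{2(n+1)} = \frac{0.02}{6} \approx 0.0033 .
\]
```

So even for two players and a modest $\epsilon$, the test tolerance has to be an order of magnitude finer than $\epsilon$.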
Proof of Lemma (4)
Remember: $\exists\, 0 < \delta < \frac{\epsilon}{2n} : (\forall \vec{u}, t, i)(\forall b_i, b'_i)\ |b_i - b'_i| \le \delta \Rightarrow |A_i^{\sigma_i}(b_i) - A_i^{\sigma_i}(b'_i)| \le \frac{\epsilon}{2n}$
Claim 1
1. From $\vec{b}$ being fairly good we can deduce that $|a_i - A_i^{\sigma_i}(B_i(\vec{a}))| \le \frac{\epsilon}{2n}$ for responsive players; for unresponsive players this difference is at most $\delta \le \frac{\epsilon}{2n}$. Putting these together gives an upper bound of $n \cdot \frac{\epsilon}{2n} = \frac{\epsilon}{2}$, which is statement 1 of the lemma.
2. Since we transformed the game to a game with payoffs between 0 and 1, and statement 1 of the lemma holds, the maximal difference in utility with the fixed point is $\sigma_i \le \frac{\epsilon}{2}$. Thus the maximal difference between two models in that range is $2 \cdot \frac{\epsilon}{2} = \epsilon$.
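The step from the per-player bounds to statement 1 can be written out as a single chain of inequalities. This is a sketch that assumes the distance between response vectors is bounded by the sum of the per-player distances (true, for instance, for a sum or a max norm); the slides do not state which norm is used.

```latex
\[
\bigl|\vec{a} - \vec{A}^\sigma(\vec{B}(\vec{a}))\bigr|
\;\le\; \sum_{i=1}^{n} \bigl|a_i - A_i^{\sigma_i}(B_i(\vec{a}))\bigr|
\;\le\; \sum_{i=1}^{n} \frac{\epsilon}{2n}
\;=\; n \cdot \frac{\epsilon}{2n}
\;=\; \frac{\epsilon}{2}.
\]
```

Each summand is bounded by $\frac{\epsilon}{2n}$ whether player $i$ is responsive (by fair goodness and the continuity condition) or unresponsive (because all of his responses lie within $\delta \le \frac{\epsilon}{2n}$ of each other).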