Announcement
Ø Grades for HW2 and project proposal are released
CS6501: Topics in Learning and Game Theory (Fall 2019)
Learning from Strategically Transformed Samples
Instructor: Haifeng Xu
Part of the slides are provided by Hanrui Zhang
Outline
Ø Introduction
Ø The Model and Results
Signaling
Q: Why attend good universities?
Q: Why publish and present at top conferences?
Q: Why do internships?
Signaling
Q: Why attend good universities?
Q: Why publish and present at top conferences?
Q: Why do internships?
Ø All in all, these are just signals (directly observable) that indicate “excellence” (not directly observable)
Signaling
Q: Why attend good universities?
Q: Why publish and present at top conferences?
Q: Why do internships?
Ø All in all, these are just signals (directly observable) that indicate “excellence” (not directly observable)
Ø Asymmetric information between employees and employers
  The 2001 Nobel Prize in Economics was awarded for research on asymmetric information
Signaling
Ø A simple example
  • We want to hire an Applied ML researcher
  • Only two types of ML researchers in this world: theoretical (TML) and applied (AML)
  • Easy to tell them apart
[Figure: type TML generates theoretical ideas and type AML generates applied ideas; each idea can be reported as a paper at a venue (COLT, NeurIPS, KDD). Σ: signals (observable); 𝑇: samples (unobservable); 𝑀: hidden types/labels]
Signaling
Ø A simple example
  • We want to hire an Applied ML researcher
  • Only two types of ML researchers in this world: theoretical (TML) and applied (AML)
  • Easy to tell them apart
[Same figure as the previous slide]
Our world is known to be noisy….
Signaling
Ø A simple example
  • We want to hire an Applied ML researcher
  • Only two types of ML researchers in this world
[Figure: with noise, TML generates a theoretical idea w.p. 0.8 and an applied idea w.p. 0.2, while AML generates a theoretical idea w.p. 0.2 and an applied idea w.p. 0.8; the arrows from ideas to venues (COLT, NeurIPS, KDD) constitute the reporting strategy. Σ: signals (observable); 𝑇: samples (unobservable); 𝑀: hidden types/labels]
Each type 𝑚 ∈ 𝑀 is identified with the distribution over ideas (samples) that it generates.
Signaling
Ø Agent’s problem:
  • How do I distinguish myself from other types?
  • How many ideas do I need for that?
Ø Principal’s problem:
  • How do I tell AML agents from others (a classification problem)?
  • How many papers should I expect to read?
Answers for this particular instance?
Signaling
Ø Agent’s problem:
  • How do I distinguish myself from other types?
  • How many ideas do I need for that?
Ø Principal’s problem:
  • How do I tell AML agents from others (a classification problem)?
  • How many papers should I expect to read?
Generally, this is classification with strategically transformed samples
What Instances May Be Difficult?
[Figure: a harder instance in which both types can also generate a “middle idea”; the edge probabilities (0.4, 0.4, 0.2 as shown) make the two types’ idea distributions overlap, and ideas again map to the venues COLT, NeurIPS, KDD. Σ: signals (observable); 𝑇: samples (unobservable); 𝑀: hidden types/labels]
Intuitions
Ø Agent: try to report as far from the other types as possible
Ø Principal: examine a set of signals that maximally separates AML from TML
Outline
Ø Introduction
Ø The Model and Results
Model
Ø Two distribution types/labels: 𝑚 ∈ {𝑎, 𝑐}
  • 𝑎 should be interpreted as “desired”, not necessarily good or bad
Ø 𝑎, 𝑐 ∈ Δ(𝑇), where 𝑇 is the set of samples
Ø Bipartite graph 𝐻 = (𝑇 ∪ Σ, 𝐹) captures the feasible signals for each sample: (𝑡, 𝜏) ∈ 𝐹 iff 𝜏 is a valid signal for 𝑡
Ø 𝑎, 𝑐, 𝐻 are publicly known; 𝑇, Σ are both discrete
Ø Distribution 𝑚 ∈ {𝑎, 𝑐} generates 𝑈 samples
Model
Ø Two distribution types/labels: 𝑚 ∈ {𝑎, 𝑐}
  • 𝑎 should be interpreted as “desired”, not necessarily good or bad
Ø 𝑎, 𝑐 ∈ Δ(𝑇), where 𝑇 is the set of samples
Ø Bipartite graph 𝐻 = (𝑇 ∪ Σ, 𝐹) captures the feasible signals for each sample: (𝑡, 𝜏) ∈ 𝐹 iff 𝜏 is a valid signal for 𝑡
Ø 𝑎, 𝑐, 𝐻 are publicly known; 𝑇, Σ are both discrete
Ø Distribution 𝑚 ∈ {𝑎, 𝑐} generates 𝑈 samples
Ø A few special cases
  • Agent can hide samples, as in the last lecture (captured by adding an “empty signal”)
  • The signal space may be the same as the sample space (i.e., 𝑇 = Σ); 𝐻 then captures feasible “lies”
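As a concrete illustration of the model, the sketch below encodes one problem instance (𝑇, Σ, 𝐻, 𝑎, 𝑐) in Python, using the TML/AML example from the earlier slides. The specific probabilities and feasible edges are one reading of that figure and should be treated as illustrative assumptions.

```python
# A minimal sketch of a problem instance (T, Sigma, H, a, c).
# Assumption: each researcher type is a distribution over ideas (samples),
# and H lists which venues (signals) each idea can feasibly be sent to.

T = ["theoretical idea", "applied idea"]   # samples (unobservable)
Sigma = ["COLT", "NeurIPS", "KDD"]         # signals (observable)

# Feasible edges F of the bipartite graph H: (t, tau) in F
# iff signal tau is valid for sample t (assumed from the figure).
F = {
    ("theoretical idea", "COLT"),
    ("theoretical idea", "NeurIPS"),
    ("applied idea", "NeurIPS"),
    ("applied idea", "KDD"),
}

# The two distribution types over T; here a is the desired (AML) type.
a = {"theoretical idea": 0.2, "applied idea": 0.8}
c = {"theoretical idea": 0.8, "applied idea": 0.2}
```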
The Game
Agent’s reporting strategy 𝜌 transforms 𝑈 samples into a set 𝑆 of 𝑈 signals
Ø A reporting strategy is a signaling scheme
  • Fully described by 𝜌(𝜏|𝑡) = probability of sending signal 𝜏 for sample 𝑡
  • ∑_𝜏 𝜌(𝜏|𝑡) = 1 for all 𝑡, and 𝜌(𝜏|𝑡) > 0 only if (𝑡, 𝜏) ∈ 𝐹
The Game
Agent’s reporting strategy 𝜌 transforms 𝑈 samples into a set 𝑆 of 𝑈 signals
Ø A reporting strategy is a signaling scheme
  • Fully described by 𝜌(𝜏|𝑡) = probability of sending signal 𝜏 for sample 𝑡
  • ∑_𝜏 𝜌(𝜏|𝑡) = 1 for all 𝑡, and 𝜌(𝜏|𝑡) > 0 only if (𝑡, 𝜏) ∈ 𝐹
Ø Given 𝑈 samples, 𝜌 generates 𝑈 signals (possibly at random) as an agent report 𝑆 ∈ Σ^𝑈
Ø A special case is a deterministic reporting strategy, where each 𝜌(𝜏|𝑡) ∈ {0, 1}
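To make the sample-to-signal transformation concrete, here is a small sketch of drawing an agent report: sample 𝑈 samples from type 𝑚, then pass each sample through the reporting strategy 𝜌. The function name and the example numbers are illustrative assumptions, not part of the model's definition.

```python
import random

def generate_report(m, rho, U, rng=random):
    """Draw U samples t ~ m, then a signal tau ~ rho(.|t) for each sample.

    m   : dict sample -> probability (the agent's type, a distribution over T)
    rho : dict sample -> dict signal -> probability (the reporting strategy);
          rho[t] should be supported only on signals tau with (t, tau) in F.
    Returns the agent's report S, a list of U signals.
    """
    samples = rng.choices(list(m), weights=list(m.values()), k=U)
    report = []
    for t in samples:
        taus, probs = zip(*rho[t].items())
        report.append(rng.choices(taus, weights=probs, k=1)[0])
    return report

# Example: a deterministic reporting strategy that sends applied ideas to KDD
# and theoretical ideas to NeurIPS (illustrative numbers).
m = {"theoretical idea": 0.2, "applied idea": 0.8}
rho = {"theoretical idea": {"NeurIPS": 1.0}, "applied idea": {"KDD": 1.0}}
print(generate_report(m, rho, U=5))
```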
The Game
Agent’s reporting strategy 𝜌 transforms 𝑈 samples into a set 𝑆 of 𝑈 signals
Ø Objective: maximize the probability of being accepted
Principal’s action 𝑔: Σ^𝑈 → [0, 1] maps the agent’s report to an acceptance probability
Ø Objective: minimize the probability of mistakes (i.e., rejecting 𝑎 or accepting 𝑐)
Remarks:
Ø Timeline: the principal announces 𝑔 first; the agent then best responds
Ø Type 𝑎’s [𝑐’s] incentive is aligned with [opposite to] the principal’s
A Simpler Case
Ø Say 𝑚 ∈ {𝑎, 𝑐} generates infinitely many samples (𝑈 = ∞)
Ø Any reporting strategy 𝜌 then generates a distribution over Σ
  • Pr(𝜏) = ∑_{𝑡∈𝑇} 𝜌(𝜏|𝑡) ⋅ 𝑚(𝑡) = 𝜌(𝜏|𝑚) (slight abuse of notation)
  • 𝜌(𝜏|𝑚) is linear in the variables 𝜌(𝜏|𝑡)
Ø Intuitively, type 𝑎 should make his signal distribution “far from” the other type’s distribution
  • Total variation (TV) distance turns out to be the right measure
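The induced signal distribution 𝜌(𝜏|𝑚) can be computed directly from this formula; below is a minimal sketch (the function name is an illustrative choice).

```python
def induced_signal_distribution(m, rho):
    """Compute rho(tau|m) = sum_t rho(tau|t) * m(t): the distribution over
    signals induced when samples drawn from type m are transformed by rho."""
    out = {}
    for t, p_t in m.items():
        for tau, p_tau in rho[t].items():
            out[tau] = out.get(tau, 0.0) + p_tau * p_t
    return out

# With the (assumed) AML distribution and a deterministic reporting strategy:
m = {"theoretical idea": 0.2, "applied idea": 0.8}
rho = {"theoretical idea": {"NeurIPS": 1.0}, "applied idea": {"KDD": 1.0}}
print(induced_signal_distribution(m, rho))   # {'NeurIPS': 0.2, 'KDD': 0.8}
```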
Total Variation Distance
Ø Discrete distributions 𝑦, 𝑧 supported on Σ
  • Let 𝑦(𝐵) = ∑_{𝜏∈𝐵} 𝑦(𝜏) = Pr_{𝜏∼𝑦}(𝜏 ∈ 𝐵)
𝑑_TV(𝑦, 𝑧) = max_{𝐵⊆Σ} [𝑦(𝐵) − 𝑧(𝐵)] = ∑_{𝜏: 𝑦(𝜏)>𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)]
           = ½ ∑_{𝜏: 𝑦(𝜏)>𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)] + ½ ∑_{𝜏: 𝑧(𝜏)≥𝑦(𝜏)} [𝑧(𝜏) − 𝑦(𝜏)]   (these two terms are equal)
Total Variation Distance
Ø Discrete distributions 𝑦, 𝑧 supported on Σ
  • Let 𝑦(𝐵) = ∑_{𝜏∈𝐵} 𝑦(𝜏) = Pr_{𝜏∼𝑦}(𝜏 ∈ 𝐵)
𝑑_TV(𝑦, 𝑧) = max_{𝐵⊆Σ} [𝑦(𝐵) − 𝑧(𝐵)] = ∑_{𝜏: 𝑦(𝜏)>𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)]
           = ½ ∑_{𝜏: 𝑦(𝜏)>𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)] + ½ ∑_{𝜏: 𝑧(𝜏)≥𝑦(𝜏)} [𝑧(𝜏) − 𝑦(𝜏)]   (these two terms are equal)
           = ½ ∑_𝜏 |𝑦(𝜏) − 𝑧(𝜏)| = ½ ‖𝑦 − 𝑧‖₁
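A short sketch computing 𝑑_TV for discrete distributions stored as dictionaries, together with the maximizing event 𝐵 (the set of signals where 𝑦 puts more mass than 𝑧); this is exactly the "maximally separating" set of signals the principal looks for. Function names are illustrative.

```python
def tv_distance(y, z):
    """d_TV(y, z) = max_B [y(B) - z(B)] = 0.5 * sum_tau |y(tau) - z(tau)|."""
    support = set(y) | set(z)
    return 0.5 * sum(abs(y.get(tau, 0.0) - z.get(tau, 0.0)) for tau in support)

def separating_set(y, z):
    """The event B attaining the max above: signals where y exceeds z."""
    support = set(y) | set(z)
    return {tau for tau in support if y.get(tau, 0.0) > z.get(tau, 0.0)}

# Example: tv_distance({'KDD': 0.8, 'NeurIPS': 0.2}, {'COLT': 0.8, 'NeurIPS': 0.2}) == 0.8
```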
How Can 𝑎 Distinguish Himself from 𝑐?
Ø Type 𝑎 uses reporting strategy 𝜌 (and 𝑐 uses 𝜚)
Ø Type 𝑎 wants 𝜌(⋅|𝑎) to be far from 𝜚(⋅|𝑐) → what about type 𝑐?
Ø This naturally motivates a zero-sum game between 𝑎 and 𝑐:
  max_𝜌 min_𝜚 𝑑_TV(𝜌(⋅|𝑎), 𝜚(⋅|𝑐)) = 𝑑*_TV(𝑎, 𝑐), the value of this zero-sum game
How Can 𝑎 Distinguish Himself from 𝑐?
Ø Type 𝑎 uses reporting strategy 𝜌 (and 𝑐 uses 𝜚)
Ø Type 𝑎 wants 𝜌(⋅|𝑎) to be far from 𝜚(⋅|𝑐) → what about type 𝑐?
Ø This naturally motivates a zero-sum game between 𝑎 and 𝑐:
  max_𝜌 min_𝜚 𝑑_TV(𝜌(⋅|𝑎), 𝜚(⋅|𝑐)) = 𝑑*_TV(𝑎, 𝑐), the value of this zero-sum game
Note 𝑑*_TV(𝑎, 𝑐) ≥ 0 …. now, what happens if 𝑑*_TV(𝑎, 𝑐) > 0?
How Can 𝑎 Distinguish Himself from 𝑐?
Ø Type 𝑎 uses reporting strategy 𝜌 (and 𝑐 uses 𝜚)
Ø Type 𝑎 wants 𝜌(⋅|𝑎) to be far from 𝜚(⋅|𝑐) → what about type 𝑐?
Ø This naturally motivates a zero-sum game between 𝑎 and 𝑐:
  max_𝜌 min_𝜚 𝑑_TV(𝜌(⋅|𝑎), 𝜚(⋅|𝑐)) = 𝑑*_TV(𝑎, 𝑐), the value of this zero-sum game
Note 𝑑*_TV(𝑎, 𝑐) ≥ 0 …. now, what happens if 𝑑*_TV(𝑎, 𝑐) > 0?
Ø 𝑎 has a strategy 𝜌* such that 𝑑_TV(𝜌*(⋅|𝑎), 𝜚(⋅|𝑐)) ≥ 𝑑*_TV(𝑎, 𝑐) > 0 for any 𝜚
Ø Using 𝜌*, 𝑎 can distinguish himself from 𝑐 with constant probability via Θ(1/𝑑*_TV(𝑎, 𝑐)²) samples
  • Recall: Θ(1/𝜗²) samples suffice to distinguish 𝑦, 𝑧 with 𝑑_TV(𝑦, 𝑧) = 𝜗
  • The principal only needs to check whether the report 𝑆 is drawn from 𝜌*(⋅|𝑎) or not
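For a fixed strategy 𝜌 of type 𝑎 (with induced signal distribution 𝑦 = 𝜌(⋅|𝑎)), the inner minimization of this zero-sum game, i.e. type 𝑐's best response min_𝜚 𝑑_TV(𝑦, 𝜚(⋅|𝑐)), can be written as a linear program, since minimizing an ℓ₁ distance over a polytope is LP-representable. The sketch below uses scipy.optimize.linprog; it only solves the inner minimization (not the full maximin) and is an illustration, not the algorithm from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def best_response_tv(y, c, edges, T, Sigma):
    """Minimize d_TV(y, z) over type c's reporting strategies varrho, where
    z(tau) = sum_t c[t] * varrho(tau|t) and varrho is supported on `edges`.

    Variables: varrho(tau|t) for each feasible edge (t, tau), plus one slack
    s(tau) >= |y(tau) - z(tau)| per signal; objective is 0.5 * sum_tau s(tau).
    """
    edge_list = sorted(edges)
    nE, nS = len(edge_list), len(Sigma)
    sig_idx = {tau: j for j, tau in enumerate(Sigma)}

    cost = np.concatenate([np.zeros(nE), 0.5 * np.ones(nS)])

    A_ub, b_ub = [], []
    for tau in Sigma:
        row_z = np.zeros(nE + nS)          # coefficients of z(tau) in varrho
        for k, (t, tt) in enumerate(edge_list):
            if tt == tau:
                row_z[k] = c[t]
        e_s = np.zeros(nE + nS)
        e_s[nE + sig_idx[tau]] = 1.0
        A_ub.append(row_z - e_s);  b_ub.append(y.get(tau, 0.0))    #  z - s <= y
        A_ub.append(-row_z - e_s); b_ub.append(-y.get(tau, 0.0))   # -z - s <= -y

    A_eq, b_eq = [], []                    # sum_tau varrho(tau|t) = 1 for all t
    for t in T:
        row = np.zeros(nE + nS)
        for k, (tt, _) in enumerate(edge_list):
            if tt == t:
                row[k] = 1.0
        A_eq.append(row); b_eq.append(1.0)

    res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * (nE + nS))
    varrho = {edge: res.x[k] for k, edge in enumerate(edge_list)}
    return res.fun, varrho
```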
How Can 𝑎 Distinguish Himself from 𝑐?
Ø So 𝑑*_TV(𝑎, 𝑐) > 0 is sufficient for distinguishing 𝑎 from 𝑐
Ø It turns out that it is also necessary
Theorem:
1. If 𝑑*_TV(𝑎, 𝑐) = 𝜗 > 0, then there is a policy 𝑔 that makes mistakes with probability at most 𝜀 whenever #samples 𝑈 ≥ 2 ln(1/𝜀)/𝜗².
2. If 𝑑*_TV(𝑎, 𝑐) = 0, then no policy 𝑔 can separate 𝑎 from 𝑐, regardless of how large #samples 𝑈 is.
How Can 𝑎 Distinguish Himself from 𝑐?
Ø So 𝑑*_TV(𝑎, 𝑐) > 0 is sufficient for distinguishing 𝑎 from 𝑐
Ø It turns out that it is also necessary
Theorem:
1. If 𝑑*_TV(𝑎, 𝑐) = 𝜗 > 0, then there is a policy 𝑔 that makes mistakes with probability at most 𝜀 whenever #samples 𝑈 ≥ 2 ln(1/𝜀)/𝜗².
2. If 𝑑*_TV(𝑎, 𝑐) = 0, then no policy 𝑔 can separate 𝑎 from 𝑐, regardless of how large #samples 𝑈 is.
Remarks:
Ø The probability of a mistake 𝜀 can be made arbitrarily small with more samples
Ø We have shown the first part
Ø The second part is more difficult to prove; it uses an elegant result from matching theory
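For intuition about part 1, here is a sketch of one natural policy 𝑔: accept iff the empirical distribution of the reported signals is within 𝜗/2 of 𝜌*(⋅|𝑎) in TV distance, using the theorem's sample bound. The 𝜗/2 threshold and the empirical-TV test are illustrative assumptions and need not match the exact construction in the proof.

```python
import math
from collections import Counter

def required_samples(theta, eps):
    """Sample bound from the theorem: U >= 2 * ln(1/eps) / theta**2."""
    return math.ceil(2.0 * math.log(1.0 / eps) / theta ** 2)

def policy_g(report, rho_star_a, theta):
    """Accept (return 1.0) iff the empirical signal distribution of the report
    is within TV distance theta/2 of rho*(.|a); otherwise reject (0.0)."""
    U = len(report)
    empirical = {tau: cnt / U for tau, cnt in Counter(report).items()}
    support = set(empirical) | set(rho_star_a)
    d = 0.5 * sum(abs(empirical.get(tau, 0.0) - rho_star_a.get(tau, 0.0))
                  for tau in support)
    return 1.0 if d <= theta / 2 else 0.0
```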