Surveys Classes on Tuesday (Nov 26)? One (long) lecture for project - PowerPoint PPT Presentation

Surveys
Ø Classes on Tuesday (Nov 26)?
Ø One (long) lecture for project presentation or two separate lectures?

CS6501: Topics in Learning and Game Theory (Fall 2019)
Learning From Strategically Revealed Samples
Instructor: Haifeng Xu
Part of slides by Hanrui Zhang

Outline
Ø Introduction and An Example
Ø Formal Model and Results
Ø Learning from Strategic Samples: Other Works

Academia in the Era of Tons of Publications
The Trouble of Bob, a Professor of Rocket Science
Alice: a new postdoc applicant
Bob: "She has 50 papers and I only want to read 3."

Academia in the Era of Tons of Publications
Bob: "Give me 3 papers by Alice that I need to read."
Current postdoc Charlie is excited about hiring Alice. Charlie is happy...

Academia in the Era of Tons of Publications
Charlie: "I have got to pick the best 3 papers to persuade Bob, so that he will hire Alice."
Bob: "Charlie will pick the best 3 papers by Alice; I need to calibrate for that."
They know what each other is thinking…

Abstracting the Problem
Ø Setup: (binary-)classify distributions with label l ∈ {g, b}
  • As opposed to the classic problem of classifying samples drawn from distributions
Ø Goal: accept good ones (l = g) and reject bad ones (l = b)
Ø Previous example: a postdoc candidate = a distribution (over papers)
Alice is waiting to hear from Bob.

Principal Reacts by Committing to a Policy
Ø Principal (Bob) commits to and announces a policy to agent Charlie
  • He decides whether to accept l (hire Alice) based on the agent’s report
Bob: "I will hire Alice if you give me 3 good papers, or 2 excellent papers, and I want Alice to be first author on at least 2 of them."

Agent’s Problem
Ø Has access to n (= 50) samples (papers) from distribution l (Alice)
  • Assume samples are i.i.d.
Ø Can choose m (= 3) samples as his report
Ø Agent (Charlie) sends his report to principal (Bob), aiming to persuade Bob to accept distribution l (Alice)
Charlie is reading through Alice’s 50 papers…
Charlie found 3 papers by Alice meeting Bob’s criteria. He is sure Bob will hire Alice upon seeing these 3 papers.

Principal Executes Based on His Policy
Ø Bob observes Charlie’s report, and makes a decision according to the policy he announced
Bob: "I read the 3 papers you sent. It looks like one is not so good, but the other two are incredible. Alice is doing good work, so let’s hire her."

Strategic Classifications are Everywhere
Ø University admissions
  • Students’ academic records are selectively revealed
Ø Loan lending decisions
  • Borrowers will selectively report their features
Ø Deciding which restaurants to go to based on Yelp ratings
  • Platform may selectively show you ratings
Ø Hiring job candidates in various scenarios
Ø Note: this problem deserves study even if you do classification manually instead of using an automated classifier
  • E.g., deciding where to hold the next Olympics based on photographs of different city locations

The Model: Basic Setup
Ø A distribution l ∈ {g, b} arrives, which can be good (l = g) or bad (l = b)
Ø An agent has access to n i.i.d. samples from l, from which he chooses a subset of exactly m samples as his report
  • Agent’s goal: persuade a principal to accept l
Ø Principal observes agent’s report, and decides whether to accept
  • Principal’s goal: accept when l = g and reject when l = b
  • Wants to minimize her probability of mistakes

The Model: the Timeline
Ø A distribution l ∈ {g, b} arrives
Ø Principal commits to a policy Π(R) ∈ [0, 1] that maps report R to a probability of accepting
  • Objective: accept g and reject b
Ø l generates n i.i.d. samples D = {d_1, ⋯, d_n}
Ø Agent receives D and reports R = {r_1, …, r_m} ⊆ D
  • Objective: maximize the probability of accepting l
Ø Principal accepts l with probability p = Π(R) given report R
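The timeline can be sketched as a small simulation. This is only an illustrative sketch: the names `play_round`, `sample`, and `policy` are hypothetical, the paper-quality numbers are borrowed from the "Tough" world example in these slides, and the agent is assumed to search over all m-subsets by brute force.

```python
import random
from itertools import combinations

def play_round(label, sample, policy, n, m, rng):
    """Simulate one round of the timeline; returns (accepted, D, R)."""
    # l generates n i.i.d. samples D = {d_1, ..., d_n}
    D = [sample(label, rng) for _ in range(n)]
    # Agent reports the m-subset R of D that maximizes the acceptance probability
    R = max(combinations(D, m), key=policy)
    # Principal accepts with probability p = policy(R)
    p = policy(R)
    return rng.random() < p, D, R

# Hypothetical instance: a sample is 1 (good paper) or 0 (any other paper)
def sample(label, rng):
    q = 0.05 if label == "g" else 0.005  # assumed per-paper probabilities
    return 1 if rng.random() < q else 0

def policy(R):
    # Hypothetical policy: accept iff every reported paper is good
    return 1.0 if all(R) else 0.0

accepted, D, R = play_round("g", sample, policy, n=50, m=3, rng=random.Random(0))
```

Because the principal commits to `policy` before the report is chosen, the agent's best response is simply the subset that scores highest under it.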

Simpler Case: Agent is NOT Strategic
Ø This is the same as distinguishing two distributions from samples
  • You have m samples from a distribution that is either g or b
  • Want to tell which one it is, with high probability (you can almost never be 100% certain)
Fact: Let ε = max_S [g(S) − b(S)] be the total variation (TV) distance between g, b. Then Ω(1/ε²) samples suffice to distinguish g, b with constant success probability.
Note: g(S) = Pr_{x∼g}(x ∈ S) is the accumulated probability for x ∈ S
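For discrete distributions the maximizing set can be read off directly: it contains exactly the outcomes where g puts more mass than b. A minimal sketch, assuming distributions are given as dictionaries from outcome to probability (the function name `tv_distance` is hypothetical; the numbers are the per-paper probabilities from the "Tough" world example in these slides):

```python
def tv_distance(g, b):
    """Return (eps, S_star) for discrete distributions given as dicts outcome -> prob."""
    support = set(g) | set(b)
    # S* collects every outcome on which g has strictly more mass than b
    S_star = {x for x in support if g.get(x, 0.0) > b.get(x, 0.0)}
    # eps = g(S*) - b(S*) = max_S [g(S) - b(S)]
    eps = sum(g.get(x, 0.0) - b.get(x, 0.0) for x in S_star)
    return eps, S_star

g = {"good paper": 0.05, "other paper": 0.95}
b = {"good paper": 0.005, "other paper": 0.995}
eps, S_star = tv_distance(g, b)  # S_star is {"good paper"}
```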

Simpler Case: Agent is NOT Strategic
Formally, ‖g − b‖_TV = ∫_{x: g(x) > b(x)} [g(x) − b(x)] dx
[Figure: illustration of TV distance, showing the curves b and g with the region ‖g − b‖_TV marked]
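A short worked derivation of why the max-over-sets form and the integral form agree, assuming g and b have densities written g(x) and b(x). The last equality uses that g − b integrates to zero, so its positive and negative parts have equal mass.

```latex
\begin{align*}
g(S) - b(S) &= \int_S [g(x) - b(x)]\,dx
  \le \int_{\{x:\, g(x) > b(x)\}} [g(x) - b(x)]\,dx \quad \text{for every } S, \\
\epsilon = \max_S\,[g(S) - b(S)] &= \int_{\{x:\, g(x) > b(x)\}} [g(x) - b(x)]\,dx
  = \tfrac{1}{2}\int |g(x) - b(x)|\,dx .
\end{align*}
```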

Simpler Case: Agent is NOT Strategic
Proof
Ø First, compute S* = arg max_S [g(S) − b(S)]
Ø Idea: try to estimate the value of l(S*) where l ∈ {g, b}
  • Why? This statistic has the largest gap between g and b
Ø How to estimate l(S*) from samples?
  • Calculate the fraction of samples in S*
Ø Ω(1/ε²) samples suffice to distinguish g(S*) from b(S*)

Simpler Case: Agent is NOT Strategic
Remarks
Ø When agent is not strategic, performance depends on TV distance in the form of Ω(1/ε²)

Strategic Agent: An Example
“Tough” World
Ø A good candidate writes a good paper w.p. 0.05
Ø A bad candidate writes a good paper w.p. 0.005
Ø All candidates have n = 50 papers, and the professor wants to read only m = 1 paper
Q: What is a reasonable principal policy?
Ø Accept iff the reported paper is good
  • A good candidate is accepted with prob p_g = 1 − (1 − 0.05)^50 ≈ 0.92
  • A bad candidate is accepted with prob p_b = 1 − (1 − 0.005)^50 ≈ 0.22
Ø What happens if agent is not strategic? → almost cannot distinguish
Ø Strategic selection actually helps principal!
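The acceptance probabilities for the "Tough" world can be checked in closed form. A minimal sketch, with hypothetical function names: a strategic agent gets the candidate accepted whenever at least one of the n papers is good, while a non-strategic agent is assumed to report a uniformly random paper.

```python
def accept_prob_strategic(q, n):
    """Pr[at least one of n i.i.d. papers is good], each being good w.p. q."""
    return 1 - (1 - q) ** n

def accept_prob_random(q):
    """Pr[a single randomly reported paper is good]."""
    return q

n = 50
p_g = accept_prob_strategic(0.05, n)    # good candidate, about 0.92
p_b = accept_prob_strategic(0.005, n)   # bad candidate, about 0.22
gap_strategic = p_g - p_b
gap_random = accept_prob_random(0.05) - accept_prob_random(0.005)
```

The gap between good and bad candidates is much wider under strategic selection than under random reporting, which is the sense in which selection helps the principal here.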

Strategic Agent: An Example
“Easy” World
Ø A good candidate writes a good paper w.p. 0.95
Ø A bad candidate writes a good paper w.p. 0.05
Ø All candidates have n = 50 papers, and the professor wants to read only m = 1 paper
Policy: Accept iff the reported paper is good
Ø A good candidate is accepted with prob p_g = 1 − (1 − 0.95)^50 ≈ 1
Ø A bad candidate is accepted with prob p_b = 1 − (1 − 0.05)^50 ≈ 0.92
Ø What happens if agent is not strategic? → can distinguish easily
Ø Here, strategic selection hurts principal!
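The "Easy" world numbers can also be checked empirically. The following is a Monte Carlo sketch under stated assumptions: the function name `simulate_acceptance` is hypothetical, the principal accepts iff the reported paper is good, and the non-strategic agent reports the first paper (equivalent to a random one, since papers are i.i.d.).

```python
import random

def simulate_acceptance(q, n, strategic, trials, rng):
    """Estimate the acceptance probability under the policy 'accept iff the reported paper is good'."""
    accepted = 0
    for _ in range(trials):
        papers = [rng.random() < q for _ in range(n)]
        # Strategic agent reports a good paper whenever one exists;
        # non-strategic agent reports the first paper regardless of quality.
        reported_good = any(papers) if strategic else papers[0]
        accepted += reported_good
    return accepted / trials

rng = random.Random(0)
n, trials = 50, 5000
strategic_good = simulate_acceptance(0.95, n, True, trials, rng)
strategic_bad = simulate_acceptance(0.05, n, True, trials, rng)
random_good = simulate_acceptance(0.95, n, False, trials, rng)
random_bad = simulate_acceptance(0.05, n, False, trials, rng)
```

With a strategic agent both kinds of candidate are accepted most of the time, whereas random reporting separates them clearly; this matches the slide's conclusion that selection hurts the principal in this world.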
