surrogate scoring rules

Surrogate Scoring Rules Juntao Wang, Harvard University Yiling - PowerPoint PPT Presentation



  1. Surrogate Scoring Rules
     Yang Liu, UCSC; Juntao Wang, Harvard University; Yiling Chen, Harvard University
     - Research question: Can we incentivize high-quality prediction when the ground truth is unavailable?
     - Two desirable properties:
       - (P1) Incentivize truthful reporting, which implies
       - (P2) Accurate forecasts get higher expected rewards.
     - Our answer: Yes! Surrogate Scoring Rules (SSR).
     - A motivating example:
       - A principal cares about "How likely is it that a study can be replicated?"
       - Forecasters are asked to provide a probabilistic prediction.
       - The SCORE program crowdsourced this question for 3000 studies to hundreds of researchers, while only 300 will be put to a real replication test.

  2. Roadmap: Strictly Proper Scoring Rules (building block; access to ground truth) → Surrogate Scoring Rules (building block; access to a noisy ground truth) → SSR Mechanism (no access to ground truth).

     Strictly proper scoring rules (SPSR) (existing work): $S(q_i, y)$
     - $q_i$: report of agent $i$; $y \in \{0,1\}$: ground truth.
     - Truthfulness property (P1). Def. $S(q_i, y)$ is SPSR if and only if $\forall p_i \neq q_i$: $\mathbb{E}_{y \sim p_i}[S(p_i, y)] > \mathbb{E}_{y \sim p_i}[S(q_i, y)]$.
     - Accuracy property (P2). Let $p^*$ = true distribution of $y$. For each $S$, $\exists$ a divergence function $D$ such that $\mathbb{E}_{y \sim p^*}[S(q_i, y)] = \text{const} - D(p^* \,\|\, q_i)$.
     - Example: $S(q_i, y) = 1 - (q_i - y)^2$.

     Surrogate Scoring Rules (SSR): $R(q_i, z;\, e_z^+, e_z^-)$
     - $z$: a noisy ground truth with error rates $e_z^+ := \Pr[z = 0 \mid y = 1]$ (known) and $e_z^- := \Pr[z = 1 \mid y = 0]$ (known).
     - Unbiasedness property. Def. $R(q_i, z)$ is SSR if and only if $\forall q_i$: $\mathbb{E}_z[R(q_i, z)] = \mathbb{E}_y[S(q_i, y)]$. This enables us to inherit P1 and P2 from SPSR.
     - Implementation:
       $$R(q_i, z = 1) = \frac{(1 - e_z^-)\, S(q_i, 1) - e_z^+\, S(q_i, 0)}{1 - e_z^+ - e_z^-}$$
       $$R(q_i, z = 0) = \frac{(1 - e_z^+)\, S(q_i, 0) - e_z^-\, S(q_i, 1)}{1 - e_z^+ - e_z^-}$$
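The surrogate scoring rule on this slide can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names (`brier_score`, `surrogate_score`, `expected_surrogate_score`) and argument names (`q` for the report, `z` for the noisy label, `e_plus` for Pr[z = 0 | y = 1], `e_minus` for Pr[z = 1 | y = 0]) are illustrative assumptions. It uses the slide's example rule 1 - (q - y)^2 as the underlying SPSR and assumes the error rates are known.

```python
def brier_score(q, y):
    """Example SPSR from the slide: S(q, y) = 1 - (q - y)^2."""
    return 1.0 - (q - y) ** 2


def surrogate_score(q, z, e_plus, e_minus, score=brier_score):
    """Surrogate score R(q, z) against a noisy label z in {0, 1}.

    e_plus  = Pr[z = 0 | y = 1], e_minus = Pr[z = 1 | y = 0] (assumed known).
    """
    denom = 1.0 - e_plus - e_minus
    if abs(denom) < 1e-12:
        raise ValueError("e_plus + e_minus must differ from 1")
    if z == 1:
        return ((1.0 - e_minus) * score(q, 1) - e_plus * score(q, 0)) / denom
    return ((1.0 - e_plus) * score(q, 0) - e_minus * score(q, 1)) / denom


def expected_surrogate_score(q, y, e_plus, e_minus, score=brier_score):
    """Exact expectation of R(q, z) over the noisy label z, given the true y."""
    prob_z1 = (1.0 - e_plus) if y == 1 else e_minus
    return (prob_z1 * surrogate_score(q, 1, e_plus, e_minus, score)
            + (1.0 - prob_z1) * surrogate_score(q, 0, e_plus, e_minus, score))


if __name__ == "__main__":
    # Compare the expected surrogate score with the score against the true label.
    for y in (0, 1):
        print(y, expected_surrogate_score(0.7, y, 0.2, 0.3), brier_score(0.7, y))
```

Conditioned on either value of the true label, the expected surrogate score matches the score computed against the true label, which is the unbiasedness property that lets SSR inherit P1 and P2.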

  3. SSR Mechanisms

     Setting:
     - A set of agents (index $i$)
     - A set of tasks (index $j$)
     - Ground truth $y_j \in \{0,1\}$
     - Belief $p_{i,j}$ of agents on tasks
     - Report $q_{i,j}$ from agents on tasks
     - A task is assigned to at least 3 agents

     SSR Mechanism. When scoring report $q_{i,j}$ for agent $i$, apply SSR: $R(q_{i,j}, z;\, \hat{e}_z^+, \hat{e}_z^-)$
     1. Construct a noisy ground truth $z$: "For a task $j$, uniformly randomly pick an agent $i' \neq i$, draw $z = 1$ with probability $q_{i',j}$."
     2. Estimate the error rates $\hat{e}_z^+, \hat{e}_z^-$: "Method of Moments".
     3. Apply SSR against $z$ and $\hat{e}_z^+, \hat{e}_z^-$.

     Unbiasedness of SSR mechanism to SPSR: $\mathbb{E}[R(q_{i,j}, z)] = S(q_{i,j}, y_j)$, up to an error term that shrinks with the number of agents and tasks.

     Assumptions:
     - A1. Tasks are homogeneous and independent.
     - A2. Beliefs are independent conditioned on $y_j$.
     - A3. Uniform strategy across tasks.
     - A4. The principal knows whether $\Pr[y_j = 1] > \Pr[y_j = 0]$.

     Main Theorem. Under A1~A4, in SSR mechanisms, truthful reporting is a uniform dominant strategy for $M, N \to \infty$, or for an arbitrarily discretized report space.
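The first and third steps of the mechanism (drawing a noisy ground truth from a random peer's report, then applying SSR) can be sketched as follows. This is a hedged illustration, not the authors' implementation: all names (`noisy_ground_truth`, `ssr_mechanism_score`, the `reports` dictionary) are hypothetical. The slide does not spell out the method-of-moments estimator, so the error rates `e_plus` = Pr[z = 0 | y = 1] and `e_minus` = Pr[z = 1 | y = 0] are simply passed in as if already estimated, and the underlying SPSR is assumed to be 1 - (q - y)^2.

```python
import random


def brier_score(q, y):
    """Assumed underlying SPSR: S(q, y) = 1 - (q - y)^2."""
    return 1.0 - (q - y) ** 2


def surrogate_score(q, z, e_plus, e_minus):
    """Surrogate score of report q against the noisy label z in {0, 1}."""
    denom = 1.0 - e_plus - e_minus
    if abs(denom) < 1e-12:
        raise ValueError("e_plus + e_minus must differ from 1")
    if z == 1:
        return ((1.0 - e_minus) * brier_score(q, 1) - e_plus * brier_score(q, 0)) / denom
    return ((1.0 - e_plus) * brier_score(q, 0) - e_minus * brier_score(q, 1)) / denom


def noisy_ground_truth(reports, agent, rng):
    """Step 1: pick a peer other than `agent` uniformly at random and
    draw z = 1 with probability equal to that peer's report on the task.

    `reports` maps agent id -> reported probability for one task.
    """
    peers = [a for a in reports if a != agent]
    if not peers:
        raise ValueError("need at least one other agent on the task")
    peer = rng.choice(peers)
    return 1 if rng.random() < reports[peer] else 0


def ssr_mechanism_score(reports, agent, e_plus, e_minus, rng):
    """Step 3: score `agent` on one task against the peer-generated label.

    Step 2 (estimating e_plus and e_minus) is NOT implemented here;
    the rates are taken as given.
    """
    z = noisy_ground_truth(reports, agent, rng)
    return surrogate_score(reports[agent], z, e_plus, e_minus)


if __name__ == "__main__":
    rng = random.Random(0)
    task_reports = {"a": 0.7, "b": 0.9, "c": 0.6}  # hypothetical reports
    print(ssr_mechanism_score(task_reports, "a", 0.2, 0.3, rng))
```

Passing an explicit `random.Random` instance keeps the peer draw reproducible, which helps when auditing individual scores.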

  4. Paper: https://arxiv.org/abs/1802.09158
     Contact: juntaowang@g.harvard.edu
     Experimental evaluation on 14 real-world human forecast datasets:
     - Deployed in RM! (RM blog)
     - SSR is close to the true score (SPSR).
     - SSR has the strongest correlation with the true scores (SPSR) compared with other methods.
