How to Win a Forecasting Tournament?
Philip E. Tetlock, Wharton School
CFA Asset Management Forum, Montreal, October 8, 2015
WHAT ARE FORECASTING TOURNAMENTS?
• Level-playing-field competitions to determine who knows what
• A disruptive technology that destabilizes stale status hierarchies
How Did GJP Win the Tournament?
• By assigning the most accurate probability estimates to over 500 outcomes of “national security relevance”
• But how did GJP do that?
Winning requires picking battles wisely:
[Figure: a spectrum running from more predictable to less predictable: where the ball stops, where the pendulum swings, where the hurricane meanders]
Winning Requires Skill at:
• Discounting pseudo-diagnostic news to which the crowd over-reacts
• Spotting subtly-diagnostic news to which the crowd under-reacts
[Figure: two panels plotting subjective probability (0 to 1) against time, showing crowd beliefs shifting after events E1–E3]
And winning requires moving beyond blame-game ping-pong:
• Finding WMD: over-connecting the dots (false positives)
• 9/11 / Osama Bin Laden: under-connecting the dots (false negatives)
But How Exactly Did GJP Pull It Off?
• Get the right people on the bus: spotting and cultivating superforecasters (40% boost)
• Teaming: anti-groupthink groups (10% boost)
• Training: debiasing exercises (10% boost)
• Elitist algorithms: aggregation algorithms that up-weight shrewd forecasters AND extremize to compensate for the conservatism of aggregates (25%-plus boost); see the sketch below
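A minimal sketch of the two algorithmic steps in the last bullet, assuming a simple weighted mean for the up-weighting and a power transform on the odds for the extremizing; the weights, the exponent a, and the function name are illustrative stand-ins, not GJP's production settings:

```python
import numpy as np

def aggregate(probs, weights, a=2.5):
    """Up-weight shrewd forecasters via a weighted mean, then extremize the
    result to offset the conservatism (drift toward 0.5) that plain
    averaging of probabilities produces."""
    p_bar = np.average(np.asarray(probs, dtype=float), weights=weights)
    odds = p_bar / (1.0 - p_bar)   # probability -> odds
    odds **= a                     # extremize in odds space; a > 1 pushes outward
    return odds / (1.0 + odds)     # odds -> probability

# Three forecasters; the first has the best track record, hence the most weight.
print(aggregate([0.70, 0.60, 0.65], weights=[3, 1, 1]))
# ~0.85: the aggregate lands more extreme than any individual forecast
```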
Obama’s Osama Decision: Through a GJP Lens
• Hollywood vs. History (the myth and reality of Zero Dark Thirty)
• Two thought-experiment variations on reality:
  • Clones vs. Silos
  • National Security vs. March Madness
OPTOMETRY TRUMPS PROPHECY
• GJP’s methods improve foresight using tested tools: personnel selection, training, teaming, incentives, and algorithms
• Still a blurry world, just less so: GJP’s best methods assign probabilities of 24–28% to things that don’t happen and 72–76% to things that do
Ungar’s log-odds model beat all comers (including several prediction markets)
• Log-odds with shrinkage + noise: m_j = a · log(p_j / (1 − p_j)) + e (see the sketch below)
• The amount of transformation, a, depends on the sophistication and diversity of the forecaster pool
[Figure: transformed probability plotted against raw probability, both on a 0-to-1 scale]
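A minimal sketch of the transform in the first bullet: map each forecast to log-odds, scale by a, add noise e, and map back to a probability. The values a = 1.5 and e = 0 are illustrative only, since the slide notes that a must be fit to the forecaster pool:

```python
import math

def extremize(p, a=1.5, e=0.0):
    """Ungar-style log-odds transform: m = a*log(p/(1-p)) + e.
    a > 1 pushes forecasts away from 0.5 (extremizing); a < 1 shrinks
    them toward 0.5. The inverse logit maps m back to a probability."""
    m = a * math.log(p / (1.0 - p)) + e
    return 1.0 / (1.0 + math.exp(-m))

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"{p:.1f} -> {extremize(p):.2f}")
# 0.1 -> 0.04, 0.3 -> 0.22, 0.5 -> 0.50, 0.7 -> 0.78, 0.9 -> 0.96
# i.e. the S-shaped curve in the figure above
```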
Measuring the accuracy of probability judgments

Day  | Probability of Rain | Outcome of Rain | Brier Score
1    | 90%                 | Yes = 100%      | (1 − .9)² + (0 − .1)² = 0.02
2    | 50%                 | Yes = 100%      | (1 − .5)² + (0 − .5)² = 0.50
3    | 50%                 | No = 0%         | (0 − .5)² + (1 − .5)² = 0.50
4    | 80%                 | Yes = 100%      | (1 − .8)² + (0 − .2)² = 0.08
Mean | 68%                 | 50%             | 0.28
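A short check of the table's arithmetic, using the two-alternative Brier score the slide applies (squared error summed over both outcomes); the function name is just for illustration:

```python
def brier(forecast, outcome):
    """Two-alternative Brier score for a binary event: squared error on
    'rain' plus squared error on 'no rain'. 0 is perfect, 2 is worst."""
    return (outcome - forecast) ** 2 + ((1 - outcome) - (1 - forecast)) ** 2

days = [(0.9, 1), (0.5, 1), (0.5, 0), (0.8, 1)]   # (P(rain), rained?)
scores = [brier(p, o) for p, o in days]
print([round(s, 2) for s in scores])               # [0.02, 0.5, 0.5, 0.08]
print(f"{sum(scores) / len(scores):.3f}")          # 0.275, the table's 0.28
```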
Measuring Accuracy: Brier Scoring
• Best possible: 0 (a perfect theory of a deterministic system)
• Random: 0.5 (just guessing)
• Worst possible: 2.0 (reverse clairvoyance)
Breaking Brier Scores Down Into Two Key Metrics:
• Calibration: do events assigned X% probability happen about X% of the time?
• Resolution: how decisively do forecasts depart from the base rate toward 0 or 1?
Examples of Calibration & Resolution
[Figure: three calibration plots of objective frequency against subjective probability (0 to 1): best possible calibration with poor resolution, with good resolution, and with best possible resolution]
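One standard way to compute the two metrics above is the Murphy decomposition, which splits the one-alternative Brier score (half the two-alternative score used earlier) into reliability (miscalibration, lower is better), resolution (higher is better), and the base-rate uncertainty of the question set. A minimal sketch, with the binning rule chosen for illustration:

```python
from collections import defaultdict

def murphy_decomposition(forecasts, outcomes):
    """Brier = reliability - resolution + uncertainty (one-alternative form).
    Reliability measures calibration error; resolution measures how far
    bin-level outcome frequencies depart from the overall base rate."""
    n = len(forecasts)
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        bins[round(p, 1)].append(o)           # group by stated probability
    base_rate = sum(outcomes) / n
    reliability = resolution = 0.0
    for p, hits in bins.items():
        freq = sum(hits) / len(hits)          # observed frequency in this bin
        reliability += len(hits) * (p - freq) ** 2 / n
        resolution += len(hits) * (freq - base_rate) ** 2 / n
    return reliability, resolution, base_rate * (1 - base_rate)

rel, res, unc = murphy_decomposition([0.9, 0.9, 0.1, 0.1, 0.5, 0.5],
                                     [1,   1,   0,   0,   1,   0])
print(rel, res, unc)   # ~0.007, ~0.167, 0.25  ->  Brier = 0.09
```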
Benchmarking (what should count as a good Brier score?)
• Minimalist:
  • Dart-throwing chimp
  • Simple extrapolation/time-series models
• Moderately aggressive:
  • Unweighted mean/median of the wisdom of the crowd
  • Expert consensus panels (central banks, EIU, Bloomberg, …)
• Maximalist:
  • Most advanced statistical/Big-Data models
  • Beating deep, liquid markets
(A sketch comparing the first two tiers follows.)
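Here is a comparison of the dart-throwing chimp against an unweighted crowd mean on simulated data; the question set, crowd size, and noise level are all hypothetical:

```python
import numpy as np

def mean_brier(p, o):
    """Mean two-alternative Brier score across questions (chance = 0.5)."""
    return float(np.mean(2 * (p - o) ** 2))

rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=500).astype(float)   # resolved questions
# 30 forecasters, each seeing the truth through noise:
crowd = np.clip(outcomes + rng.normal(0, 0.35, (30, 500)), 0.02, 0.98)

chimp = np.full(500, 0.5)           # minimalist benchmark: dart-throwing chimp
crowd_mean = crowd.mean(axis=0)     # moderately aggressive: unweighted mean
print(mean_brier(chimp, outcomes))       # 0.50, the "just guessing" score
print(mean_brier(crowd_mean, outcomes))  # lands well below 0.50
```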
Other Take-Aways from the Tournaments
• We discovered:
  • Just how vague “vague verbiage” can be, and how it makes keeping score impossible
  • The personality/behavioral profiles of superforecasters
  • The group-dynamics profiles of superteams
  • How to design debiasing training that boosts real-world accuracy
Vague verbiage can be very vague
Watch what happens when we translate words into quant-equivalence ranges (QERs; a sketch using these ranges follows the list):
• it might happen (0.09 to 0.64)
• maybe (0.31 to 0.69)
• it could happen (0.02 to 0.56)
• distinct possibility (0.21 to 0.84)
• it's a possibility (0.001 to 0.45)
• risky (0.11 to 0.83)
• it’s a real possibility (0.22 to 0.89)
• some chance (0.05 to 0.42)
• it's probable (0.55 to 0.90)
• slam dunk or sure thing (0.95 to 1.0)
[Figure: the same phrases arrayed along a 0-to-1 axis from less certain to more certain]
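To make the scoring problem concrete, this sketch transcribes the quant-equivalence ranges above and shows the spread of Brier scores each phrase could earn if the event happens, which is why vague verbiage defeats scorekeeping:

```python
# Quant-equivalence ranges (QERs) transcribed from the list above.
QER = {
    "it might happen":         (0.09, 0.64),
    "maybe":                   (0.31, 0.69),
    "it could happen":         (0.02, 0.56),
    "distinct possibility":    (0.21, 0.84),
    "it's a possibility":      (0.001, 0.45),
    "risky":                   (0.11, 0.83),
    "it's a real possibility": (0.22, 0.89),
    "some chance":             (0.05, 0.42),
    "it's probable":           (0.55, 0.90),
    "slam dunk / sure thing":  (0.95, 1.00),
}

for phrase, (lo, hi) in QER.items():
    # One-alternative Brier scores the same phrase could earn if the event occurs:
    best, worst = (1 - hi) ** 2, (1 - lo) ** 2
    print(f"{phrase:25s} width={hi - lo:.2f}  Brier if Yes: {best:.2f}-{worst:.2f}")
```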
How Accurate Are Today’s Thought Leaders?
[Figure: photos of Wolf, Krugman, Bremmer, Ferguson, Friedman, Kristol]
Profiling Superforecasters
• Fluid intelligence helps, but without …
• Active open-mindedness helps, but without …
• Both combined count for little unless you believe probability estimation is a skill that can be cultivated, and one worth cultivating
Profiling Superteams
• Somehow manage to check groupthink via precision questioning and constructive confrontation, without degrading into factionalism
Yet Goliath Decided to Lend David Slingshot Money
• In 2010, IARPA challenged five $5M-per-year research programs to out-predict a $5B-per-year bureaucracy in a 4-year tournament
• One of these programs, GJP, won the tournament, by big margins