
Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error

Michael Correll, Michael Gleicher (University of Wisconsin-Madison)


  1. Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error. Michael Correll and Michael Gleicher, University of Wisconsin-Madison.

  2. Don’t Use Error Bars: they don’t work as advertised; try something else instead!

  3. Don’t Use Error Bars: they don’t work as advertised.

  4. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  5. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  6. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  7. [Figure: Placebo vs. Treatment; y-axis 0 to 100; annotated “p<0.05”]

  8. [Figure: Placebo vs. Treatment; y-axis 0 to 100; annotated “*”]

  9. Error Bars: are ambiguous; are asymmetric; are “all or nothing”.

  10. Error Bars: are ambiguous; are asymmetric; are “all or nothing”.

  11. InfoVis 2010-2013 [Figure: Labeled vs. Unlabeled; y-axis 0% to 100%]

  12. InfoVis 2010-2013: standard error; 95% t-confidence interval; range; 1.5 x interquartile range; standard deviation; 80% t-confidence interval.

  13. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  14. Error Bars: are ambiguous; are asymmetric; are “all or nothing”.

  15. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  16. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  17. [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  18. Within-the-bar bias [Figure: two panels; y-axis 0 to 100]. Newman, George E., and Brian J. Scholl. “Bar graphs depicting averages are perceptually misinterpreted: the within-the-bar bias.” Psychonomic Bulletin & Review 19.4 (2012): 601–7.

  19. Within-the-bar bias [Figure: two panels; y-axis 0 to 100]

  20. Within-the-bar bias [Figure: two panels; y-axis 0 to 100]

  21. Error Bars: are ambiguous; are asymmetric; are “all or nothing”.

  22. [Figure: y-axis 0 to 80]

  23. [Figure: y-axis 0 to 80]

  24. [Figure: y-axis 0 to 80]

  25. Don’t Use Error Bars: they don’t work as advertised; try something else instead!

  26. A solution? [Figure: two panels, each Placebo vs. Treatment; y-axis 0 to 100]

  27. Design Requirements: Consistent; Symmetric; Continuous; ?

  28. Design Requirements: Consistent; Symmetric; Continuous; ?

  29. Design Requirements: Consistent; Symmetric; Continuous; ?

  30. Bar Chart

  31. Violin Plot [Figure label: 95% t-confidence interval]. J. Hintze and R. Nelson. “Violin plots: a box plot-density trace synergism.” The American Statistician, 1998.

  32. Gradient Plot? [Figure label: 95% t-confidence interval]

  33. Gradient Plot [Figure labels: “100%” t-confidence interval; 95% t-confidence interval]

  34. Methods: 3 experiments on Amazon Mechanical Turk, 240 participants; 3 problem frames (election polling, weather forecasting, financial modeling); no prerequisite of statistical knowledge; participants gave predictions either as a binary forced choice or on a Likert scale.

  35. One Sample Judgments: “How likely (or how surprising) do you think the red potential outcome is, given the poll?”

  36. Results

  37. “Within the bar” bias: error bars suffer from this bias… but other encodings don’t.

  38. Two Sample Judgments: “If forced to guess, which city do you predict will get more snow?”

  39. Overconfidence: error bars make people unjustifiably confident… but other encodings don’t.

  40. Costs are low [Figure: p-value and effect size]

  41. Don’t Use Error Bars: they don’t work as advertised; try something else instead!

  42. What’s next? More encodings; more testing; real stakes.

  43. Make your own! http://graphics.cs.wisc.edu/Vis/ErrorBars/

  44. Acknowledgments This work was supported in part by NSF award IIS-1162037, NIH award R01 AI077376, and ERC Advanced Grant “Expressive.” Thanks to Wei-Chen Chen for web generation code. Visit: http://graphics.cs.wisc.edu/Vis/ErrorBars/ to make your own plots! (and for data tables, stimuli, and sample experiments). Contact: mcorrell@cs.wisc.edu

  45. Box Plot [Figure labels: 95% t-confidence interval; 50% t-confidence interval]

  46. Müller-Lyer Illusion. W. Stock and J. Behrens. “Box, Line, and Midgap plots: Effects of Display Characteristics on the Accuracy and Bias of Estimates of Whisker Length.” Journal of Educational and Behavioral Statistics, 1991.

  47. Müller-Lyer Illusion. W. Stock and J. Behrens. “Box, Line, and Midgap plots: Effects of Display Characteristics on the Accuracy and Bias of Estimates of Whisker Length.” Journal of Educational and Behavioral Statistics, 1991.

  48. p=.05? [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  49. p<0.01 [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  50. p=.05 [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  51. p=.05 [Figure: Placebo vs. Treatment; y-axis 0 to 100]

  52. “p-pdf”
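
Slide 12 lists six different quantities that an error bar was found to encode. As a rough illustration of why an unlabeled bar is ambiguous, the sketch below computes the bar endpoints for each convention on one sample. Everything in it is an assumption made for illustration and not material from the talk: the sample values, the function name `error_bar_endpoints`, the hard-coded t critical values (taken from a standard t table for 9 degrees of freedom), and the reading of “1.5 x interquartile range” as Tukey-style fences around the quartiles.

```python
import statistics

# Hypothetical sample of 10 measurements (not data from the talk).
sample = [52, 61, 58, 47, 66, 55, 63, 49, 59, 60]

# Two-sided t critical values for 9 degrees of freedom, from a standard t table.
# Assumption: the sample size is exactly 10; other sizes need other values.
T_CRIT_DF9 = {0.95: 2.262, 0.80: 1.383}


def error_bar_endpoints(values):
    """Return (low, high) bar endpoints for each convention named on slide 12."""
    n = len(values)
    if n != 10:
        raise ValueError("T_CRIT_DF9 only covers samples of size 10")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    se = sd / n ** 0.5              # standard error of the mean
    # Quartiles using the module's default 'exclusive' method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1

    def around_mean(half_width):
        return (mean - half_width, mean + half_width)

    return {
        "standard error": around_mean(se),
        "80% t-confidence interval": around_mean(T_CRIT_DF9[0.80] * se),
        "95% t-confidence interval": around_mean(T_CRIT_DF9[0.95] * se),
        "standard deviation": around_mean(sd),
        "range": (min(values), max(values)),
        # Assumed interpretation: fences 1.5 IQR beyond the quartiles.
        "1.5 x interquartile range": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
    }


for name, (low, high) in error_bar_endpoints(sample).items():
    print(f"{name:28s} {low:7.2f} to {high:7.2f}  (length {high - low:.2f})")
```

On this sample the same data produce bars of quite different lengths depending on the convention, so a bar drawn without a label does not tell the reader which of these intervals is being shown.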
