visualization for communication

Visualization for Communication (cs109a)



  1. Visualization for Communication cs109a

  2. (CNN)

  3. the previous day… (PCSSCA)

  4. (VST, Tufte)

  5. Engineer deck, the previous day… (PCSSCA)

  6. (PCSSCA)

  7. RISK ASSESSMENT? (VST, Tufte)

  8. Chartjunk at hearings (PCSSCA)

  9. • Ask an interesting question. What is the scientific goal? What would you do if you had all the data? What do you want to predict or estimate? • Get the data. How were the data sampled? Which data are relevant? Are there privacy issues? • Explore the data. Plot the data. Are there anomalies? Are there patterns? • Model the data. Build a model. Fit the model. Validate the model. • Communicate and visualize the results. What did we learn? Do the results make sense? Can we tell a story?

  10. Visualization Goals. Communicate (Explanatory): • Present data and ideas • Explain and inform • Provide evidence and support • Influence and persuade. Analyze (Exploratory): • Explore the data • Assess a situation • Determine how to proceed • Decide what to do

  11. Communicate New York Times

  12. Napoleon’s March to Russia https://robots.thoughtbot.com/analyzing-minards-visualization-of-napoleons-1812-march

  13. Minard’s Graphic on Napoleon’s Russia Campaign (from wikipedia)

  14. Minard’s Graphic on Napoleon’s Russia Campaign (from wikipedia)

  15. Key Considerations • Who is your audience? • What questions are you answering? • Why should the audience care? • What are your major insights and surprises? • What change do you want to effect?

  16. Effective Visualizations 1. Have graphical integrity 2. Keep it simple 3. Use the right display 4. Use color strategically 5. Know your audience

  17. Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  18. http://lsr.nellco.org/cgi/viewcontent.cgi?article=1476&context=nyu_plltwp Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  19. Keep it Simple

  20. Don’t Make Them Think! • Your audience does not want to spend cognitive effort on things you know and can just show them • Lead them through the major steps of your story • Point out interesting key facts and insights using captions and annotations

  21. Don’t Bury the Lead Cole Nussbaumer

  22. Don’t Bury the Lead Cole Nussbaumer

  23. Use the right display

  24. [Figure: ranking of visual encodings from most efficient to least efficient for quantitative, ordered, and categorical data.] C. Mulbrandon, VisualizingEconomics.com

  25. Most Effective VisualizingEconomics.com

  26. Less Effective VisualizingEconomics.com

  27. [Figure: chart panels; axis tick labels omitted.] Possible solution to cases when you have data that diverge a lot. Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  28. Use color strategically

  29. Colors for Categories Do not use more than 5-8 colors at once Ware, “Information Visualization”

  30. Colors for Ordinal Data: vary luminance and saturation. Zeileis et al., 2009, “Escaping RGBland: Selecting Colors for Statistical Graphics”

  31. Colors for Quantitative Data: Hue (Rainbow), Luminance, Luminance & Hue. Rogowitz and Treinish, “Why should engineers and scientists be worried about color?”

  32. X Rainbow Colormap

  33. Rainbow Colormap Perceptually nonlinear R. Simmon
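One way to see why a rainbow colormap is called perceptually nonlinear is to look at how its luminance changes along the scale. The sketch below is only an illustration under stated assumptions: the "rainbow" is a plain hue sweep from blue to red built with the standard-library `colorsys` module (not the exact colormap shown on the slides), luminance is the sRGB relative luminance with Rec. 709 weights, and the helper names are hypothetical.

```python
import colorsys

def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Relative luminance (Rec. 709 weights) of an sRGB triple in [0, 1]."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

n = 9  # number of samples along each colormap (arbitrary choice)

# Rainbow stand-in (assumption): hue sweep from blue (2/3) to red (0),
# at full saturation and value.
rainbow = [colorsys.hsv_to_rgb((2 / 3) * (1 - i / (n - 1)), 1.0, 1.0)
           for i in range(n)]
# Gray ramp from black to white for comparison.
gray = [(i / (n - 1),) * 3 for i in range(n)]

rainbow_lum = [relative_luminance(c) for c in rainbow]
gray_lum = [relative_luminance(c) for c in gray]

for i, (r, g) in enumerate(zip(rainbow_lum, gray_lum)):
    print(f"step {i}: rainbow {r:.2f}  gray {g:.2f}")
print("rainbow luminance monotonic:", is_monotonic(rainbow_lum))
print("gray luminance monotonic:", is_monotonic(gray_lum))
```

The gray ramp gets steadily brighter, while the hue sweep brightens and darkens several times, so equal steps in the data do not map to equal or even ordered steps in perceived lightness.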

  34. Gray

  35. Color Blindness: Protanope and Deuteranope (red / green deficiencies), Tritanope (blue / yellow deficiency). Based on slide from Stone

  36. Color Blindness: Normal, Protanope, Deuteranope, Lightness. Based on slide from Stone

  37. Viridis

  38. Color Brewer Nominal Ordinal Cynthia Brewer, Color Use Guidelines for Data Representation

  39. Know your audience

  40. • What do they know? • What motivates them? What do they desire? • What experiences do you share? What are common goals? • What insights can you give them? What tools and “magical gifts”?

  41. What is the message? Exploratory vs. Explanatory; Neutral vs. Opinionated

  42. Andy Cotgreave, Tableau

  43. Framing - Why should I care? • Tell the audience: “Here is the right way to think about the problem I was trying to solve.” • Catch the audience’s attention and frame the story using captions and annotations • If done well, your insights will seem obvious given this framing. And that’s a good thing!

  44. Gun Deaths in 2010

  45. Tools for interactive graphics • R/shiny • plotly/dash • Tableau • d3.js • vega-lite/vega

  46. Is there a story? Surface it, even if it is incomplete

  47. 2014 Gun Deaths

  48. (XKCD)

  49. Deaths by county, 2014 (crimeresearch.org)

  50. Careful with amalgamation paradoxes and with outliers http://journal.frontiersin.org/article/10.3389/fpsyg.2013.00513/full Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo
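An amalgamation (Simpson's) paradox can be reproduced with a few lines of arithmetic. The counts below are hypothetical, invented only for illustration, and the labels "A", "B", "easy", and "hard" are placeholders: option A has the higher success rate inside each subgroup, yet the lower rate once the subgroups are pooled, because the two options are applied to very different mixes of cases.

```python
# Hypothetical (successes, trials) counts, invented for illustration only.
counts = {
    "A": {"easy": (9, 10), "hard": (60, 100)},
    "B": {"easy": (80, 100), "hard": (5, 10)},
}

def rate(successes, trials):
    """Success rate for one cell."""
    return successes / trials

def pooled_rate(strata):
    """Success rate after adding up all strata for one option."""
    successes = sum(s for s, _ in strata.values())
    trials = sum(t for _, t in strata.values())
    return successes / trials

for stratum in ("easy", "hard"):
    a = rate(*counts["A"][stratum])
    b = rate(*counts["B"][stratum])
    print(f"{stratum}: A = {a:.2f}, B = {b:.2f}")

pooled_a = pooled_rate(counts["A"])
pooled_b = pooled_rate(counts["B"])
print(f"pooled: A = {pooled_a:.2f}, B = {pooled_b:.2f}")
```

Plotting only the pooled rates would hide the reversal, which is why it helps to check the breakdown by subgroup before settling on the headline chart.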

  51. Ask Ask Ask • Is the exact distribution of guns really the important concern? • Did we check the uncertainties? • Should we be looking at this from a “risk” perspective? • We tend to believe what we believe and look for confirmation. • We need to be disciplined about interrogating ourselves. • It is OK (and not against simplicity) to surface our process.

  52. A woman’s age vs. the age of the men who look best to her (from Dataclysm), as woman’s age → man’s age: 20 → 23, 21 → 23, 22 → 24, 23 → 25, 24 → 25, 25 → 26, 26 → 27, 27 → 28, 28 → 29, 29 → 29, 30 → 30, 31 → 31, 32 → 31, 33 → 32, 34 → 32, 35 → 34, 36 → 35, 37 → 36, 38 → 37, 39 → 38, 40 → 38, 41 → 38, 42 → 39, 43 → 39, 44 → 39, 45 → 40, 46 → 38, 47 → 39, 48 → 40, 49 → 45, 50 → 46. Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  53. A man’s age vs. the age of the women who look best to him, as man’s age → woman’s age: 20 → 20, 21 → 20, 22 → 21, 23 → 21, 24 → 21, 25 → 21, 26 → 22, 27 → 21, 28 → 20, 29 → 20, 30 → 20, 31 → 20, 32 → 20, 33 → 20, 34 → 20, 35 → 20, 36 → 20, 37 → 22, 38 → 20, 39 → 20, 40 → 21, 41 → 21, 42 → 20, 43 → 23, 44 → 21, 45 → 24, 46 → 20, 47 → 20, 48 → 23, 49 → 20, 50 → 22. Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo
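The contrast between the two Dataclysm series on slides 52 and 53 can be summarized numerically. This is a rough sketch: the lists hold the values read off those two slides, assigned to consecutive ages 20 to 50 (an assumption wherever the extracted order is ambiguous), and the variable names are hypothetical.

```python
from statistics import mode

ages = list(range(20, 51))  # rater's own age, 20 through 50

# Preferred partner age by rater's age, as read off slides 52 and 53
# (the pairing with consecutive ages is an assumption).
women_pref = [23, 23, 24, 25, 25, 26, 27, 28, 29, 29, 30, 31, 31, 32, 32, 34,
              35, 36, 37, 38, 38, 38, 39, 39, 39, 40, 38, 39, 40, 45, 46]
men_pref = [20, 20, 21, 21, 21, 21, 22, 21, 20, 20, 20, 20, 20, 20, 20, 20,
            20, 22, 20, 20, 21, 21, 20, 23, 21, 24, 20, 20, 23, 20, 22]

# Gap between preferred age and own age (negative = prefers younger).
women_gap = [p - a for a, p in zip(ages, women_pref)]
men_gap = [p - a for a, p in zip(ages, men_pref)]

print("women: preferred ages span", min(women_pref), "to", max(women_pref))
print("men:   preferred ages span", min(men_pref), "to", max(men_pref))
print("men:   most common preferred age", mode(men_pref))
print("largest gap below own age: women", min(women_gap), "men", min(men_gap))
```

A single summary value per age hides the spread among individual raters, which is the point the next slides make by showing a sample of 100 men aged 40.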

  54. Sample of 100 men of 40 vs. the age of the women who look best to them. [Figure: number of men against women’s ages, 20 to 50; most common value: 21.] Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  55. Sample of 100 men of 40 vs. the age of the women who look best to them. [Figure: two panels showing number of men against women’s ages, 20 to 50; most common value in both: 21.] Alberto Cairo • University of Miami • www.thefunctionalart.com • Twitter: @albertocairo

  56. Structure of communication graphics

  57. E. Segel

  58. E. Segel

  59. E. Segel

  60. E. Segel

  61. Headline Annotations Call Out Boxes Captions E. Segel

  62. M. Krzywinski & A. Cairo

  63. Application to modeling

  64. IMAC • I: inferential goal (scientific question of interest) • M: model (all models are wrong, some are useful) • A: algorithms • C: conclusions and checking. The C is crucial: what did we learn? Was the model useful, and how well does it fit? How do we know whether the method is working? Do we understand how it is working? Do we need to iterate and improve the model? What are the limitations and future directions?

  65. (from Provost and Fawcett) Breast Cancer on a Mammogram • False positives are OK • False negatives are a disaster • Most people don't have it. Which model is better?
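The three bullets on the mammogram slide explain why plain accuracy is a poor way to pick a model here. A minimal sketch, using two hypothetical confusion matrices for 1000 patients of whom 10 have cancer (all counts and model names are invented for illustration):

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all patients classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fp, fn, tn):
    """Fraction of actual cancer cases that the model catches."""
    return tp / (tp + fn)

# Hypothetical confusion matrices: 1000 patients, 10 with cancer.
models = {
    # Predicts "no cancer" for everyone.
    "always_negative": dict(tp=0, fp=0, fn=10, tn=990),
    # Flags many healthy patients but misses only one cancer case.
    "cautious": dict(tp=9, fp=90, fn=1, tn=900),
}

for name, cm in models.items():
    print(f"{name}: accuracy = {accuracy(**cm):.3f}, "
          f"recall = {recall(**cm):.3f}, missed cases = {cm['fn']}")
```

Because most people do not have the disease, the model that never flags anyone scores higher on accuracy while missing every case, so the comparison has to weigh false negatives far more heavily than false positives.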

  66. Communicating a model

  67. Telecom Churn Problem. Survey 1000 customers, making an offer with an administrative cost of $3 and an offer cost of $100 as an incentive for the customer to stay with us. We want to predict churn for our base of 100000 customers. If a customer leaves us, we lose the customer lifetime value (CLV), a measure of the lost profit from that customer. Let's assume this is the average number of months a customer stays with the telecom times the net revenue from the customer per month. We'll assume 3 years and a $30/month margin per lost user, for a loss of roughly $1000 (36 x $30 = $1080).
      admin_cost = 3
      offer_cost = 100
      clv = 1000  # customer lifetime value
      • TN = people we predict won't churn, and who won't. We associate no cost with them, as they continue being our customers.
      • FP = people we predict will churn, but who won't. We associate a cost of admin_cost + offer_cost per customer with them: we spend money on keeping customers who were going to stay anyway, so that money is lost.
      • FN = people we predict won't churn, but who will. We send them nothing and they leave. This is the big loss, the clv.
      • TP = people we predict will churn, and who will. These are the people we can do something about, so we make them an offer. Say a fraction f accept it. Our cost is then admin_cost + f*offer_cost + (1-f)*clv.
      f = 0.5
      tnc = 0.
      fpc = admin_cost + offer_cost
      fnc = clv
      tpc = admin_cost + f * offer_cost + (1. - f) * clv

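  68. Average Cost = TN x TNC + FP x FPC + FN x FNC + TP x TPC, with TNC = 0, FPC = 103, FNC = 1000, and TPC = 553. For this to be an average per customer, TN, FP, FN, and TP are taken as the fractions of customers in each cell of the confusion matrix (counts divided by the total).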
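The per-cell costs and the average-cost formula from the churn slides can be put together in a few lines. This is a sketch under stated assumptions: the cost values come from the slides, while the `average_cost` helper and the confusion-matrix counts for the 1000 surveyed customers are hypothetical, chosen only to show the arithmetic.

```python
admin_cost = 3
offer_cost = 100
clv = 1000  # customer lifetime value
f = 0.5     # fraction of true churners who accept the offer

# Cost per customer in each confusion-matrix cell (from the slides).
tnc = 0.0
fpc = admin_cost + offer_cost
fnc = clv
tpc = admin_cost + f * offer_cost + (1.0 - f) * clv

def average_cost(tn, fp, fn, tp):
    """Average cost per customer, given counts in each cell."""
    n = tn + fp + fn + tp
    return (tn * tnc + fp * fpc + fn * fnc + tp * tpc) / n

# Hypothetical outcomes on 1000 surveyed customers, 150 of whom churn.
classifier = average_cost(tn=800, fp=50, fn=50, tp=100)
no_offers = average_cost(tn=850, fp=0, fn=150, tp=0)       # predict nobody churns
offer_everyone = average_cost(tn=0, fp=850, fn=0, tp=150)  # predict everybody churns

print(f"fpc = {fpc}, tpc = {tpc}")
print(f"classifier:     {classifier:.2f} per customer")
print(f"no offers:      {no_offers:.2f} per customer")
print(f"offer everyone: {offer_everyone:.2f} per customer")
```

Comparing a model against the two trivial baselines (send no offers, send offers to everyone) is one way to communicate its value in dollars rather than in accuracy.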
