3. Visualizing quantitative information


Elective in Software and Services (Complementi di software e servizi per la società dell'informazione). Section: Information Visualization. Number of credits: 3. Giuseppe Santucci.


  1. Data & relationships summary
     • Quantitative information consists of two types of data
       – Quantitative
       – Categorical
     • Relationships among data can be
       – Simple associations between quantitative values and categorical subdivisions
       – More complex associations among multiple sets of values
     • Four types of relationships within categories
       – Nominal
       – Ordinal
       – Interval
       – Hierarchical
     • Three types of relationships between quantitative values
       – Ranking
       – Ratio
       – Correlation

  2. Numbers that summarize
     • Measures of average
       – Mean
       – Median
       – Mode
       – Midrange
     • Measures of distribution
       – Range
       – Variance
       – Standard deviation

  3. Mean
     • Nothing to add, except that sometimes it is not informative

  4. Median
     • It splits the sorted distribution into two halves

  5. Mode and midrange (hmm...)
     • The mode is just the most common element
     • The midrange is (max + min)/2
     • Mode = 165,000
     • Midrange = (475,000 + 25,000)/2 = 250,000
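The two definitions on this slide can be sketched in a few lines of Python. The slide does not list the underlying values, so the `values` list below is invented for illustration; only its minimum, maximum and mode match the figures quoted above. The helper names `modes` and `midrange` are also assumptions.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (a data set can have several modes)."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]


def midrange(values):
    """Midpoint between the smallest and the largest value."""
    return (max(values) + min(values)) / 2


# Hypothetical values, invented for illustration: only the minimum (25,000),
# the maximum (475,000) and the mode (165,000) come from the slide.
values = [25_000, 120_000, 165_000, 165_000, 165_000, 210_000, 475_000]

print(modes(values))     # [165000]
print(midrange(values))  # 250000.0
```

Note that the midrange depends only on the two extreme values, which is one reason it is rarely a good summary.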

  6. Distribution

     | Warehouse | Sum of shipping days | Delivery mean | Delivery median |
     |---|---|---|---|
     | A | 51 | 4.25 | 4.5 |
     | B | 51 | 4.25 | 4.5 |

     • Delivery-time performance over 12 orders for two warehouses
     • Do they perform the same?
     • What is missing?

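  7. Distribution

     | Order # | Warehouse A | Warehouse B |
     |---|---|---|
     | 1 | 3 | 1 |
     | 2 | 3 | 1 |
     | 3 | 3 | 1 |
     | 4 | 4 | 3 |
     | 5 | 4 | 3 |
     | 6 | 4 | 4 |
     | 7 | 5 | 5 |
     | 8 | 5 | 5 |
     | 9 | 5 | 5 |
     | 10 | 5 | 6 |
     | 11 | 5 | 7 |
     | 12 | 5 | 10 |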
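As a quick check, the summary figures from the previous slide can be recomputed from these 12 delivery times. This is a minimal sketch using Python's standard `statistics` module; the variable names are assumptions.

```python
from statistics import mean, median

# Delivery times in days for the 12 orders of each warehouse, copied from the slide.
warehouse_a = [3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5]
warehouse_b = [1, 1, 1, 3, 3, 4, 5, 5, 5, 6, 7, 10]

for name, days in (("A", warehouse_a), ("B", warehouse_b)):
    # Prints: warehouse, sum of shipping days, mean, median
    print(name, sum(days), mean(days), median(days))
# A 51 4.25 4.5
# B 51 4.25 4.5
```

Both warehouses produce identical sums, means and medians although the individual values differ considerably, which is exactly what these summaries fail to show.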

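  8. Range
     • Range is just max − min
     • Range A = 2
     • Range B = 9

     | Order # | Warehouse A | Warehouse B |
     |---|---|---|
     | 1 | 3 | 1 |
     | 2 | 3 | 1 |
     | 3 | 3 | 1 |
     | 4 | 4 | 3 |
     | 5 | 4 | 3 |
     | 6 | 4 | 4 |
     | 7 | 5 | 5 |
     | 8 | 5 | 5 |
     | 9 | 5 | 5 |
     | 10 | 5 | 6 |
     | 11 | 5 | 7 |
     | 12 | 5 | 10 |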
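The two ranges can be reproduced directly from the delivery times. A minimal sketch; the function name `value_range` is an assumption.

```python
def value_range(values):
    """Range of a data set: largest value minus smallest value."""
    return max(values) - min(values)


# Delivery times in days, copied from the slide.
warehouse_a = [3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5]
warehouse_b = [1, 1, 1, 3, 3, 4, 5, 5, 5, 6, 7, 10]

print(value_range(warehouse_a))  # 2
print(value_range(warehouse_b))  # 9
```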

  9. Standard deviation
     • This variability is well described by variance and standard deviation
       – mean: $\mu = (x_1 + x_2 + \dots + x_N)/N$
       – variance: $\mathrm{var} = [(x_1 - \mu)^2 + (x_2 - \mu)^2 + \dots + (x_N - \mu)^2]/N$
       – standard deviation: $\sigma = \sqrt{\mathrm{var}}$
     • However, such concepts are hard to communicate
     [Figure: distribution curve (axes X and P) with markers at µ − 1.96σ, µ − σ, µ, µ + σ and µ + 1.96σ; 68.26% (~70%) of the data lie within µ ± σ and 95% within µ ± 1.96σ]
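Applied to the two warehouses from the earlier slides, the formulas above give very different results even though the means coincide. The sketch below assumes the population versions (division by N, as on the slide), which in Python's standard library are `statistics.pvariance` and `statistics.pstdev`.

```python
from statistics import mean, pstdev, pvariance

# Delivery times in days, copied from the "Distribution" slide.
warehouse_a = [3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5]
warehouse_b = [1, 1, 1, 3, 3, 4, 5, 5, 5, 6, 7, 10]

for name, days in (("A", warehouse_a), ("B", warehouse_b)):
    # pvariance and pstdev divide by N, matching the formulas on the slide.
    print(name, mean(days), pvariance(days), round(pstdev(days), 3))
# A 4.25 0.6875 0.829
# B 4.25 6.6875 2.586
```

Warehouse B's standard deviation is roughly three times that of warehouse A, which quantifies the difference that the mean and median hide.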

  10. Standard deviation
     • These bar charts compare values with the mean, providing a simpler way of communicating standard deviation

  11. Measures of ratio
     • A ratio is a simple numerical relationship between two values
     • It can be used to summarize data as well

  12. Money (but also college grades)
     • It is one of the few measures whose scale changes over time
       – inflation / deflation
       – exchange rates
     • In comparisons you have to take that into account
     http://www.gapminder.org/
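One common way to take inflation into account is to express every amount in the money of a single base year using a price index. The sketch below is only an illustration: the index values, the sales figures and the helper name `to_base_year_money` are all invented, not taken from the slides.

```python
def to_base_year_money(amount, index_then, index_base):
    """Convert a nominal amount into base-year money using a price index."""
    return amount * index_base / index_then


# Hypothetical price index and nominal sales per year (invented for illustration).
price_index = {2000: 100.0, 2010: 125.0, 2020: 150.0}
nominal_sales = {2000: 1_000.0, 2010: 1_200.0, 2020: 1_500.0}

base_year = 2020
real_sales = {
    year: to_base_year_money(amount, price_index[year], price_index[base_year])
    for year, amount in nominal_sales.items()
}
print(real_sales)  # {2000: 1500.0, 2010: 1440.0, 2020: 1500.0}
```

In this made-up series the nominal figures grow by 50%, while the adjusted figures show no growth at all, so a chart of the raw amounts would tell a misleading story.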

  13. Numbers that summarize

  14. Outline
     • New ideas about good and bad graphs
     • Meaning of numbers
     • Tables and graphs
     • Basic table variations
     • Basic graph variations
     • Relationships in graphs

  15. Tables and graphs
     • Tables and graphs are widely used to communicate quantitative information
     • Sometimes it is better to just show the (few) numbers
     • The goals of presenting quantitative data are
       – Analyzing
       – Monitoring
       – Planning
       – Communicating
     • Remember that we are dealing with data that is
       – Quantitative
       – Categorical
     • Not all numbers carry quantitative information
       – Categorical intervals
       – IDs (e.g., order number)

  16. A very bad table…

  17. Quantitative or categorical?
     • X axis?
     • Y axis?
     • Legend?
     • Bars?
     • Title?

  18. A table without quantitative values

     | Day | Course |
     |---|---|
     | Monday | Fondamenti di Informatica |
     | Tuesday | Fondamenti di Informatica |
     | Wednesday | Fondamenti di Informatica + Inf. Visualization |
     | Friday | Inf. Visualization |

  19. Table
     • Data are arranged in columns and rows
     • Data are (usually) encoded as text
     • Tables are also used for non-quantitative information (just spatial arrangement)
     1. Tables make it easy to look up values
     2. Tables can display simple relationships between quantitative values and categorical subdivisions
     3. Tables allow local comparisons
     4. Tables provide high precision
     5. Tables make it easy to handle different units of measure

  20. Choose a table when...
     • If one of the following is true, a table could be a good choice
     1. The report you produce will be used to look up single values
     2. It will be used to compare individual values
     3. Precise values are required
     4. Different units of measure are involved

  21. A table with non-numerical values

  22. Graphs
     • A graph is a visual display of quantitative information
     • Quantitative information is encoded visually
     • More precisely, values are represented and presented on one or more axes
     • Axes provide scales (quantitative or categorical)

  23. Graphs
     • A graph provides the overall shape of the data
     • Trends
     • Outliers
     • Similarities and differences
     • Low precision
     • Values are not easy to look up
     • Local comparisons are not easy
     • Different units are not easy to handle

  24. Outline
     • New ideas about good and bad graphs
     • Meaning of numbers
     • Tables and graphs
     • Basic table variations
     • Basic graph variations
     • Relationships in graphs

  25. Fundamental variations in table design
     • Relationships in tables
       – Quantitative to categorical
       – Quantitative to quantitative
     • Variations in table design
       – Unidirectional
       – Bidirectional
       – Table design solutions

  26. Quantitative to categorical relationships
     1. 1:1 - One set of quantitative values and one set of categorical subdivisions
     2. 1:n - One set of quantitative values and the intersection of multiple categories
     3. 1:hn - One set of quantitative values and the intersection of hierarchical categories

  27. 1:1 - One set of quantitative values and one set of categorical subdivisions (nominal)

  28. 1:n - One set of quantitative values (sales) and the intersection of multiple categories (salespersons & months): nominal + interval (time)

  29. 1:hn - One set of quantitative values (sales) and the intersection of hierarchical categories (Product Line -> Family -> Product)
     • Interaction could be a key issue. Interaction? No interaction!

  30. Quantitative to quantitative relationships
     1. Among one set of quantitative values associated with multiple categorical subdivisions
     2. Among distinct sets of quantitative values associated with the same categorical subdivision

  31. Among one set of quantitative values (sales) associated with multiple categorical subdivisions (sales by several salespersons in different months)
     • Here the focus is the comparison among homogeneous values

  32. Among distinct sets of quantitative values (sales, returns, net) associated with the same categorical subdivision (a salesperson)
     • Here the focus is the comparison among NON-homogeneous values (they differ in category, not in unit)

  33. Variation - Unidirectional
     • Categories are arranged across columns or rows but not in both directions

  34. Variation - Unidirectional
     • Categories are arranged across columns or rows but not in both directions (here we have two categories)

  35. Variation - Bidirectional
     • Categories are on both axes
     • Such tables are called crosstabs or pivot tables

  36. Variation - Bidirectional
     • They save space
     [Figure: the same data shown as a unidirectional table and as a bidirectional table]
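The space saving can be seen by pivoting a small unidirectional table into a crosstab. The sketch below uses plain Python dictionaries; the sales records and the helper name `crosstab` are invented for illustration.

```python
# Hypothetical sales records in unidirectional form: one row per
# (salesperson, month) pair, so both categories are repeated down the rows.
rows = [
    ("Ann", "Jan", 120), ("Ann", "Feb", 135),
    ("Bob", "Jan", 90), ("Bob", "Feb", 110),
]


def crosstab(records):
    """Pivot (row_key, column_key, value) records into a nested dict."""
    table = {}
    for row_key, col_key, value in records:
        table.setdefault(row_key, {})[col_key] = value
    return table


table = crosstab(rows)
months = ["Jan", "Feb"]

# Bidirectional layout: salespersons down the rows, months across the columns.
print("Salesperson", *months, sep="\t")
for person, cells in table.items():
    print(person, *(cells.get(m, "") for m in months), sep="\t")
```

The unidirectional form needs four data rows of three cells each, while the crosstab needs two rows, because each category label is written only once.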

  37. Graphs
     • Several components
       – scales on axes
       – grid lines
       – bars
       – legends
       – ...
     • Quantitative values
     • Categorical subdivisions

  38. Graph variations
     • The primary source of variation is the choice (or combination) of the components used to encode quantitative values:
       – points
       – lines
       – bars
       – shapes with 2D area

  39. Points
     • Scatter plot
     • Points vs lines or bars

  40. Points vs lines
     • Points and lines
     • Only lines
     • Use lines only when both axes are numerical or when the categories have an order (e.g., intervals)

  41. Trend line (correlation)
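The slide does not say how the trend line is obtained. A common choice, assumed here, is an ordinary least-squares line, with Pearson's r as the measure of correlation. The data points and the helper name `trend_line` are invented for illustration.

```python
from math import sqrt


def trend_line(xs, ys):
    """Least-squares slope and intercept plus Pearson's r for paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical paired observations (invented for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r = trend_line(xs, ys)
print(round(slope, 3), round(intercept, 3), round(r, 3))  # 0.6 2.2 0.775
```

The sign of the slope gives the direction of the relationship, while r (between -1 and 1) indicates how tightly the points follow the line.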

  42. Bars
     • Bar width carries no information
     • Bar width must be constant

  43. Bars
     • Do not lie!

  44. Bars
     • Start the scale at zero!
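The reason is that a bar encodes its value by length, so moving the baseline changes the ratio the reader sees. The sketch below compares the true ratio of two values with the ratio of their drawn bar lengths; the values and the helper name `drawn_ratio` are invented for illustration.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the bar lengths drawn for b and a when the axis starts at baseline."""
    return (b - baseline) / (a - baseline)


# Hypothetical values (invented for illustration): b is 10% larger than a.
a, b = 100.0, 110.0

print(b / a)                            # 1.1, the true ratio
print(drawn_ratio(a, b))                # 1.1, bars starting at zero show it faithfully
print(drawn_ratio(a, b, baseline=90))   # 2.0, a truncated axis makes b look twice as long
```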

  45. Shapes with 2D area
     • Classical pie chart
     • Part of a larger family of area graphs
     • Remember its limitations
     • Where is the scale?
     • Our visual perception is not good at accurately assessing and comparing quantitative values using areas (or worse, slices)
     So, simply, do not use them at all!!

  46. Bargrams (not used in business)

  47. Categorical subdivision
     • Position
     • Color
     • Point shape
     • Fill pattern
     • Line style

  48. Position
     • X axis

  49. Color
     • We will see perceptual issues about colors...

  50. Point shape
     • Only applicable when points represent quantitative values

  51. Position, Color, Point shape

  52. Fill pattern
     • Hmm, hard to see, and it causes moiré vibration

  53. Moiré vibration
     • Use fill patterns only as a last resort

  54. Line style

  55. Outline
     • New ideas about good and bad graphs
     • Meaning of numbers
     • Tables and graphs
     • Basic table variations
     • Basic graph variations
     • Relationships in graphs

  56. Relationships in graphs
     • Nominal comparison
     • Time series
     • Ranking
     • Part-to-whole
     • Deviation
     • Distribution
     • Correlation

  57. Nominal comparison
     • Nominal categorical attribute
     • Quantitative values that are compared with each other

  58. Nominal comparison
     • If the bars are quite similar, it is possible to narrow the quantitative scale, removing the zero and focusing on the range between the lowest and highest values
     • In this case it is better to use points (do not lie)
