

I590 Interactive Visual Analytics Week 13 | Nov 16, 2016 Filtering and Aggregation Models in Visual Analytics Khairi Reda | redak@iu.edu School of Informatics & Computing, IUPUI


  1. I590 Interactive Visual Analytics Week 13 | Nov 16, 2016 Filtering and Aggregation Models in Visual Analytics Khairi Reda | redak@iu.edu School of Informatics & Computing, IUPUI

  2. http://www.michelecoscia.com/wp-content/uploads/2012/08/demon2.png

  3. Filtering & Aggregation • Too much data can overwhelm the visualization • Sometimes we need to show fewer data points • Filter: eliminate irrelevant items • Aggregate: group similar items

  4. Filter • Any function that partitions the data into two sets based on attributes • Larger / smaller than X • Within a specified geographic extent • Noisy / significant readings • Filtering can also be applied to attributes, as opposed to the data points themselves Based on a slide by Alex Lex
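
A minimal sketch of the filter idea above, assuming items are plain dictionaries. The record fields (`id`, `value`, `lat`) and the helper names `partition` and `select_attributes` are hypothetical and not from the slides. It shows a predicate splitting the data into two sets, and the separate case of filtering attributes instead of items.

```python
def partition(items, keep):
    """Split items into (kept, rejected) using a predicate on their attributes."""
    kept, rejected = [], []
    for item in items:
        (kept if keep(item) else rejected).append(item)
    return kept, rejected


def select_attributes(items, names):
    """Attribute filter: keep only the named attributes of every item."""
    return [{name: item[name] for name in names} for item in items]


# Hypothetical readings; the field names are illustrative only.
readings = [
    {"id": "a", "value": 3.2, "lat": 39.77},
    {"id": "b", "value": 11.5, "lat": 41.88},
    {"id": "c", "value": 7.9, "lat": 39.10},
]

# Item filter: "larger than X" with X = 5.0.
above, below = partition(readings, lambda r: r["value"] > 5.0)

# Attribute filter: drop the "lat" column but keep every item.
slim = select_attributes(readings, ["id", "value"])
```

Returning both halves of the partition, rather than only the kept items, makes it easy to show filtered-out items as de-emphasized context instead of removing them entirely.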

  5. Filtering with Dynamic Queries Shneiderman

  6. Filtering with menus

  7. Scented Widgets • Provide cues (scent) to the users to aid in filtering and exploration • Usually come in the form of small visual representations that bind to interface elements Willett 2007, Via Alex Lex

  8. Interactive Legends • Provide filtering controls from the legend Riche 2010, Via Alex Lex

  9. Aggregation

  10. Histogram • Aggregate items into bins • Display the number of items (i.e., frequency) in each bin

  11. Histogram • Number of bins can affect the shape of the histogram [Figure: distribution of passengers by age, drawn with 10 bins and with 20 bins] Based on a slide by Alex Lex
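
The binning step behind a histogram can be sketched as follows, assuming equal-width bins. The `ages` list is made-up illustrative data, not the passenger data from the slide, and `histogram` is a hypothetical helper name. Running it with two bin counts shows how the choice changes the apparent shape.

```python
def histogram(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so all land in the first bin.
        return [len(values)] + [0] * (num_bins - 1)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical ages (illustrative only).
ages = [2, 4, 19, 21, 22, 24, 25, 29, 31, 35, 38, 40, 41, 58, 62]

coarse = histogram(ages, 3)  # [4, 9, 2]: looks like a single central peak
fine = histogram(ages, 6)    # [2, 2, 5, 4, 0, 2]: reveals a gap the coarse view hides
```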

  12. Density plots http://web.stanford.edu/~mwaskom/software/seaborn/tutorial/plotting_distributions.html

  13. Box plots (aka box-and-whisker plots) • First quartile: splits off the lowest 25% of the data • Median: splits the data in half • Third quartile: splits off the highest 25% of the data http://image.mathcaptain.com/cms/images/106/box-plot.png

  14. Box plots (aka box-and-whisker plots) • An alternative to drawing the whiskers at the min/max is to scale them by the interquartile range (Q3 - Q1) Wikipedia
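
A sketch of the numbers a box plot summarizes, using Python's `statistics.quantiles`. The 1.5 multiplier on the interquartile range is the common Tukey convention and is an assumption here, since the slides only say the whiskers are scaled by the IQR. The sample data and the name `box_plot_stats` are hypothetical.

```python
import statistics


def box_plot_stats(values, whisker_scale=1.5):
    """Quartiles, IQR-scaled whiskers, and outliers for one box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_scale * iqr
    high_fence = q3 + whisker_scale * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme data points still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are usually drawn as individual marks.
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
stats = box_plot_stats(sample)
```

With min/max whiskers the upper whisker would stretch to 40. With IQR-scaled whiskers it stops at 9 and 40 is reported as an outlier.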

  15. One box plot, four distributions http://stat.mq.edu.au/wp-content/uploads/2014/05/Can_the_Box_Plot_be_Improved.pdf

  16. Distribution, error bars, and box plots Streit & Gehlenborg, PoV, Nature Methods, 2014 Via Alex Lex

  17. Violin plots http://web.stanford.edu/~mwaskom/software/seaborn/tutorial/plotting_distributions.html

  18. Heatmaps • Aggregate 2D points into 2D bins

  19. Heatmaps (for scatterplots)
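
Aggregating scatterplot points into a heatmap grid can be sketched like this, assuming a rectangular extent divided into equal-size cells. The function name `bin_2d` and the sample points are hypothetical. Each cell count would then be mapped to a color.

```python
def bin_2d(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid; the result is indexed [row][col]."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi):
            continue  # points outside the extent are skipped
        # Points on the upper edge are placed in the last cell.
        col = min(int((x - x_lo) / (x_hi - x_lo) * nx), nx - 1)
        row = min(int((y - y_lo) / (y_hi - y_lo) * ny), ny - 1)
        grid[row][col] += 1
    return grid


# Hypothetical scatterplot points in the unit square.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (0.9, 0.1),
          (0.6, 0.8), (0.7, 0.9), (1.0, 1.0)]

grid = bin_2d(points, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
```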

  20. Spatial Aggregation • Changing the boundaries / structure of the aggregation bins yields different results Based on a slide by Alex Lex

  21. Spatial Aggregation • Gerrymandering Based on a slide by Alex Lex
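
A toy illustration of why aggregation boundaries matter, in the spirit of the gerrymandering example. The nine voters and both districting plans are invented for illustration, and `district_winners` is a hypothetical helper. The underlying data never changes; only the grouping does.

```python
from collections import Counter


def district_winners(votes, districts):
    """Return the majority party in each district (a list of voter indices)."""
    winners = []
    for members in districts:
        tally = Counter(votes[i] for i in members)
        winners.append(tally.most_common(1)[0][0])
    return winners


# Hypothetical electorate: 4 voters for party A, 5 for party B.
votes = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

# Two ways of drawing three districts over the same nine voters.
plan_one = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
plan_two = [[0, 1, 4], [2, 3, 5], [6, 7, 8]]

result_one = district_winners(votes, plan_one)  # ["A", "B", "B"]
result_two = district_winners(votes, plan_two)  # ["A", "A", "B"]
```

Party B holds the overall majority in both cases, yet the second plan gives party A two of the three districts.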

  22. Clustering • Classification of items into “similar” bins • Typically based on a similarity measure • Euclidean distance, Pearson correlation, etc. • Many different clustering algorithms, with weaknesses and strengths • K-Means • Hierarchical clustering
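
The two similarity measures named above can be sketched as follows. The vectors `p` and `q` are hypothetical. The example is chosen to show that the measures can disagree: the two items are far apart in Euclidean terms but perfectly correlated.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Pearson correlation: +1 for identical trends, -1 for opposite trends."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(spread_a * spread_b)


# Hypothetical items: q has the same shape as p at ten times the magnitude.
p = [1.0, 2.0, 3.0, 4.0]
q = [10.0, 20.0, 30.0, 40.0]

distance = euclidean_distance(p, q)      # large: the magnitudes differ
correlation = pearson_correlation(p, q)  # close to 1: the trends match
```

Which measure is appropriate depends on whether magnitude or shape should count as "similar" for the data at hand.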

  23. K-Means • Pick K starting points as centroids. These will eventually define the clusters • Calculate the distance of every point to each centroid, assigning the point to the closest centroid • Update each centroid to the average of its cluster’s members • Repeat until the assignments stop changing
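
The steps above can be sketched in a few lines, assuming Euclidean distance and caller-supplied starting centroids so the run is deterministic. The sample points and the name `k_means` are hypothetical. This is an illustrative sketch, not a tuned implementation.

```python
import math


def k_means(points, centroids, max_iterations=100):
    """Basic k-means; `centroids` holds the K starting points."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iterations):
        # Step 1: assign every point to its closest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
        # Step 2: move each centroid to the average of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical 2D points forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# K = 2, starting from the first two points (a deliberately poor start).
centroids, assignments = k_means(points, [points[0], points[1]])
```

Even with both starting centroids in the same group, this run recovers the two groups after a couple of iterations. In general the result depends on the starting points.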

  24. K-Means Limitations • Have to pick K • Assumptions about the data: roughly “circular” clusters of equal size http://stats.stackexchange.com/questions/133656/how-to-understand-the-drawbacks-of-k-means

  25. K-Means Limitations http://stats.stackexchange.com/questions/133656/how-to-understand-the-drawbacks-of-k-means

  26. Dimensionality Reduction • High-dimensional data: large number of attributes • Dimensionality reduction: Reduce number of dimensions (attributes) while keeping as much variation as possible [Table: items A, B, C, … as rows; attributes Attr 1 to Attr 11 as columns]

  27. Dimensionality Reduction • Principal component analysis • Multidimensional scaling • And other techniques… [Table: items A, B, C, … as rows; attributes Attr 1 to Attr 11 as columns]

  28. Principal Component Analysis (PCA) • Find a new set of dimensions (axes) that explains the majority of the variance in the data • Order the new dimensions by variance • The first principal component accounts for the most variance
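
A small pure-Python sketch of the PCA idea, assuming power iteration with deflation as the way to find the axes (one of several possible methods, chosen here only to avoid external libraries). The data values and function names are hypothetical. Each component is returned with the variance it explains, in decreasing order.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of the attributes (columns) after mean-centering."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    d = len(matrix)
    # Fixed start vector for determinism; it must not be orthogonal to the answer.
    vec = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
    product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
    value = sum(vec[i] * product[i] for i in range(d))
    return value, vec


def principal_components(rows):
    """Return (variance, axis) pairs, ordered from most to least variance."""
    cov = covariance_matrix(rows)
    d = len(cov)
    components = []
    for _ in range(d):
        value, vec = dominant_eigenpair(cov)
        components.append((value, vec))
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return components


# Hypothetical items with two strongly correlated attributes.
rows = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

components = principal_components(rows)
total_variance = sum(value for value, _ in components)
explained_by_first = components[0][0] / total_variance
```

For data like this, where the two attributes move together, the first component captures most of the variance, so the second dimension could be dropped with little loss.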

  29. Principal Component Analysis (PCA) http://setosa.io/ev/principal-component-analysis/

  30. Multidimensional scaling (MDS) • Project the high-dimensional space onto a much lower-dimensional space (e.g., 2D) • Relies on similarity between points (usually have to compute the similarity between every pair of points) • Non-linear transformation: More difficult to interpret than PCA, but can maintain structures better in some cases

  31. Models in Visual Analytics Adapted from: http://slideplayer.com/slide/4659134/ and from Remo Chang, 2010

  32. Models in Visual Analytics • Abstractions of how visualization works: • Provide a way of talking about how humans interact with visualizations • Language for describing different parts of the visual analytic process • Every model is (overly) simplified: beware!

  33. Terminology / Assumptions • Sensemaking: The act of processing incomplete information in order to improve one’s understanding of a situation and/or to make decisions • A person’s decision making is bounded by [1] • incomplete information • the amount of time they have to decide • the finite processing power of their brain • Mental model: An abstracted version of the real world that is more tractable [1] H. Simon 1957. “A Behavioral Model of Rational Choice”

  34. Models in Visual Analytics

  35. Information Visualization Reference Model Card, Mackinlay, and Shneiderman. Readings in Information Visualization: Using Vision to Think, Morgan Kaufmann, 1999, p. 17

  36. Van Wijk’s Model • D = Data • V = Visualization • S = Specification • I = Image • P = Perception • K = Knowledge • E = Exploration Van Wijk, J. “The value of visualization”, 2005

  37. Keim’s Visual Analytics Model Keim, D et al. “Visual Analytics: Definition, process, and challenges”, 2008

  38. Pirolli and Card Sensemaking model Pirolli, P and Card, S. “The sense making process and leverage points for analyst technology as identified through cognitive task analysis”, 2005

  39. Next week • Class canceled: Happy Thanksgiving! • Week 15 - Nov 30 • Time series and temporal data • Inference and uncertainty in visualization
