Lectures 1&2: Manipulate & Interact
Tamara Munzner
Department of Computer Science
University of British Columbia
DSCI 532, Data Visualization 2
Week 1, Jan 2 / Jan 4 2018
www.cs.ubc.ca/~tmm/courses/mds-viz2-17
@tamaramunzner

Visualization (vis) defined & motivated
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.
Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods.
• human in the loop needs the details & no trusted automatic solution exists
  – doesn't know exactly what questions to ask in advance
  – exploratory data analysis
• speed up through human-in-the-loop visual data analysis
  – present known results to others
  – stepping stone towards automation
  – before model creation to provide understanding
  – during algorithm creation to refine, debug, set parameters
  – before or during deployment to build trust and monitor

Why use an external representation?
• external representation: replace cognition with perception
[Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE TVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]

Why represent all the data?
• summaries lose information, details matter
  – confirm expected and find unexpected patterns
  – assess validity of statistical model
Anscombe’s Quartet: identical statistics
  x mean            9
  x variance        10
  y mean            7.5
  y variance        3.75
  x/y correlation   0.816
Same Stats, Different Graphs: https://www.youtube.com/watch?v=DbJyPELmhJc

Why focus on tasks and effectiveness?
• effectiveness requires match between data/task and representation
  – set of representations is huge
  – many are ineffective mismatch for specific data/task combo
  – increases chance of finding good solutions if you understand full space of possibilities
• what counts as effective?
  – novel: enable entirely new kinds of analysis
  – faster: speed up existing workflows
• how to validate effectiveness
  – many methods, must pick appropriate one for your context

What resource limitations are we faced with?
Vis designers must take into account three very different kinds of resource limitations: those of computers, of humans, and of displays.
• computational limits
  – processing time
  – system memory
• human limits
  – human attention and memory
• display limits
  – pixels are precious resource, the most constrained resource
  – information density: ratio of space used to encode info vs unused whitespace
    • tradeoff between clutter and wasting space, find sweet spot between dense and sparse

Nested model: Four levels of vis design
• domain situation
  – who are the target users?
• abstraction
  – translate from specifics of domain to vocabulary of vis
    • what is shown? data abstraction
    • why is the user looking at it? task abstraction
• idiom
  – how is it shown?
    • visual encoding idiom: how to draw
    • interaction idiom: how to manipulate
• algorithm
  – efficient computation
[Figure: nested levels, outermost to innermost: domain, abstraction, idiom, algorithm]
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]
[A Multi-Level Typology of Abstract Visualization Tasks. Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013).]

Why is validation difficult?
• different ways to get it wrong at each level
  Domain situation: You misunderstood their needs
  Data/task abstraction: You’re showing them the wrong thing
  Visual encoding/interaction idiom: The way you show it doesn’t work
  Algorithm: Your code is too slow

Why is validation difficult?
• solution: use methods from different fields at each level
  Domain situation (anthropology/ethnography): Observe target users using existing tools
  Data/task abstraction
  Visual encoding/interaction idiom (design): Justify design with respect to alternatives
  Algorithm (computer science): Measure system time/memory; Analyze computational complexity
  (cognitive psychology): Analyze results qualitatively; Measure human time with lab experiment (lab study)
  (anthropology/ethnography): Observe target users after deployment
  Measure adoption
[Figure labels: problem-driven work, technique-driven work]
[A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009).]

What?
[Figure: overview of datasets and attributes]
• Data Types: Items, Attributes, Links, Positions, Grids
• Data and Dataset Types
  – Tables: Items, Attributes
  – Networks & Trees: Items (nodes), Links, Attributes
  – Fields: Grids, Positions, Attributes
  – Geometry: Items, Positions
  – Clusters, Sets, Lists: Items
• Attribute Types: Categorical; Ordered (Ordinal, Quantitative)
• Ordering Direction: Sequential, Diverging, Cyclic
• Dataset Availability: Static, Dynamic

Types: Datasets and data
• Dataset Types
  – Tables: Attributes (columns), Items (rows), Cell containing value
  – Multidimensional Table: Value in cell
  – Networks: Node (item), Link
  – Trees
  – Fields (Continuous): Grid of positions, Cell, Attributes (columns), Value in cell
  – Geometry (Spatial): Position
• Attribute Types
  – Categorical
  – Ordered: Ordinal, Quantitative

Why?
[Figure: overview of actions and targets]
• Actions
  – Analyze: Consume (Discover, Present, Enjoy); Produce (Annotate, Record, Derive)
  – Search:
                       Target known   Target unknown
    Location known     Lookup         Browse
    Location unknown   Locate         Explore
  – Query: Identify, Compare, Summarize
• Targets
  – All Data: Trends, Outliers, Features
  – Attributes: One (Distribution, Extremes); Many (Dependency, Correlation, Similarity)
  – Network Data: Topology, Paths
  – Spatial Data: Shape
• {action, target} pairs
  – discover distribution
  – compare trends
  – locate outliers
  – browse topology

Actions: Analyze, Query
• analyze
  – consume
    • discover vs present
      – aka explore vs explain
    • enjoy
      – aka casual, social
  – produce
    • annotate, record, derive
• query
  – how much data matters?
    • one, some, all
• independent choices
  – analyze, query, (search)
[Figure: Analyze: Consume (Discover, Present, Enjoy), Produce (Annotate, Record, Derive); Query: Identify, Compare, Summarize]

Derive
• don’t just draw what you’re given!
  – decide what the right thing to show is
  – create it with a series of transformations from the original dataset
  – draw that
• one of the four major strategies for handling complexity
[Figure: Original Data (exports, imports) and Derived Data (trade balance); trade balance = exports − imports]

Analysis example: Derive one attribute
• Strahler number
  – centrality metric for trees/networks
  – derived quantitative attribute
  – draw top 5K of 500K for good skeleton
[Using Strahler numbers for real time visual exploration of huge graphs. Auber. Proc. Intl. Conf. Computer Vision and Graphics, pp. 56–69, 2002.]
[Figure: two chained tasks.
 Task 1: What? In: Tree; Out: Quantitative attribute on nodes. Why? Derive.
 Task 2: What? In: Tree + Quantitative attribute on nodes; Out: Filtered Tree (removed unimportant parts). Why? Summarize Topology. How? Reduce: Filter.]

Why: Targets
• All Data: Trends, Outliers, Features
• Attributes
  – One: Distribution, Extremes
  – Many: Dependency, Correlation, Similarity
• Network Data: Topology, Paths
• Spatial Data: Shape
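The two Derive slides (trade balance, and Strahler number followed by filtering) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. It uses the textbook Strahler definition for rooted trees, not Auber's extension to general graphs. The function names, the toy tree, and the trade figures are all hypothetical.

```python
def trade_balance(exports, imports):
    """Derived attribute from the Derive slide: exports minus imports."""
    return exports - imports


def strahler_numbers(children, root):
    """Task 1 (derive): map each node of a rooted tree to its Strahler number.

    Textbook tree definition (an assumption; Auber's paper generalizes it):
    a leaf gets 1; an internal node gets max(child values), plus 1 if that
    maximum is reached by at least two children.
    """
    numbers = {}
    stack = [(root, False)]  # iterative post-order traversal
    while stack:
        node, children_done = stack.pop()
        kids = children.get(node, [])
        if not kids:
            numbers[node] = 1
        elif children_done:
            values = [numbers[k] for k in kids]
            top = max(values)
            numbers[node] = top + 1 if values.count(top) >= 2 else top
        else:
            stack.append((node, True))
            stack.extend((k, False) for k in kids)
    return numbers


def filter_tree(children, numbers, threshold):
    """Task 2 (filter): keep only nodes whose derived value is >= threshold."""
    keep = {n for n, value in numbers.items() if value >= threshold}
    return {n: [k for k in children.get(n, []) if k in keep] for n in keep}


# Hypothetical toy tree, written as parent -> list of children.
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

numbers = strahler_numbers(tree, "root")
skeleton = filter_tree(tree, numbers, threshold=2)
balance = trade_balance(exports=120, imports=100)  # made-up figures
```

A parent's Strahler number is never smaller than any of its children's, so thresholding on it removes whole outer branches and leaves a connected skeleton around the root. That is the idea behind "draw top 5K of 500K".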
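Returning to the Anscombe's Quartet slide, the "identical statistics" claim can be checked directly. The sketch below is plain Python with no plotting. The eleven points per dataset are not in the slides; they are the commonly published values from Anscombe's 1973 paper and are included here as an assumption. The slide's variances (10 and 3.75) correspond to population variance (dividing by n); sample variance would be somewhat larger.

```python
from math import sqrt

# Anscombe's quartet (values as commonly published; not part of the slides).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pvariance(values):
    """Population variance (divide by n), which matches the slide's numbers."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


summary = {
    name: {
        "x_mean": mean(xs),
        "x_var": pvariance(xs),
        "y_mean": mean(ys),
        "y_var": pvariance(ys),
        "corr": correlation(xs, ys),
    }
    for name, (xs, ys) in QUARTET.items()
}

for name, stats in summary.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four rows agree to about two decimal places, yet the scatterplots look nothing alike. The summary alone would hide that, which is the slide's case for showing all the data.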