User Centered Design and Evaluation

Overview
• My evaluation experience
• Why involve users at all?
• What is a user-centered approach?
• Evaluation strategies
• Examples from “Snap-Together Visualization” paper

My evaluation experience:
• Empirical comparison of 2D, 3D, and 2D/3D combinations for spatial data
• Development and evaluation of a volume visualization interface
• Collaborative visualization on a tabletop

Why involve users?
Why involve users?
• Understand the users and their problems
  – Visualization users are experts
  – We do not understand their tasks and information needs
  – Intuition is not good enough
• Expectation management & ownership
  – Ensure users have realistic expectations
  – Make the users active stakeholders

What is a user-centered approach?
• Early focus on users and tasks
• Empirical measurement: users’ reactions and performance with prototypes
• Iterative design

Focus on Tasks
• Users’ tasks / goals are the driving force
  – Different tasks require very different visualizations
  – Lists of common visualization tasks can help
    • Shneiderman’s “Task by Data Type Taxonomy”
    • Amar, Eagan, and Stasko (InfoVis05)
  – But user-specific tasks are still the best

Focus on Users
• Users’ characteristics and context of use need to be supported
• Users have varied needs and experience
  – E.g. radiologists vs. GPs vs. patients

Understanding users’ work
• Field studies
  – May involve observation, interviewing
  – At the user’s workplace
• Surveys
• Meetings / collaboration

Design cycle
• Design should be iterative
  – Prototype, test, prototype, test, …
  – Test with users!
• Design may be participatory
Key point
• Visualizations must support specific users doing specific tasks
• “Showing the data” is not enough!

Evaluation

How to evaluate with users?
• Quantitative experiments
  – Clear conclusions, but limited realism
• Qualitative methods
  – Observations
  – Contextual inquiry
  – Field studies
  – More realistic, but conclusions less precise

How to evaluate without users?
• Heuristic evaluation
• Cognitive walkthrough
  – Hard – tasks are ill-defined & may be accomplished many ways
  – Allendoerfer et al. (InfoVis05) address this issue
• GOMS / user modeling?
  – Hard – designed to test repetitive behaviour

Types of Evaluation (Plaisant)
• Compare design elements
  – E.g., coordination vs. no coordination
• Compare systems
  – E.g., Spotfire vs. TableLens
• Usability evaluation of a system
  – E.g., Snap system (North & Shneiderman)
• Case studies
  – Real users in real settings, e.g., bioinformatics, e-commerce, security

Snap-Together Vis (North & Shneiderman)
• Custom coordinated views
Usability testing vs. Experiment

Usability testing:
• Aim: improve products
• Few participants
• Results inform design
• Not perfectly replicable
• Partially controlled conditions
• Results reported to developers

Quantitative experiment:
• Aim: discover knowledge
• Many participants
• Results validated statistically
• Replicable
• Strongly controlled conditions
• Scientific paper reports results to community

Questions
• Is this system usable?
  – Usability testing
• Is coordination important? Does it improve performance?
  – Experiment to compare coordination vs. no coordination

Usability of Snap-Together Vis
• Can people use the Snap system to construct a coordinated visualization?
• Not really a research question
• But necessary if we want to use the system to answer research questions
• How would you test this?

Critique of Snap-Together Vis Usability Testing
+ Focus on qualitative results
+ Report problems in detail
+ Suggest design changes
– Did not evaluate how much training is needed (one of their objectives)
• Results useful mainly to developers

Summary: Usability testing
• Goals focus on how well users perform tasks with the prototype
• May compare products or prototypes
• Techniques:
  – Time to complete task & number and type of errors (quantitative performance data)
  – Qualitative methods (questionnaires, observations, interviews)
  – Video/audio for record keeping

Controlled experiments
• Strives for:
  – Testable hypothesis
  – Control of variables and conditions
  – Generalizable results
  – Confidence in results (statistics)
Testable hypothesis
• State a testable hypothesis
  – This is a precise problem statement
• Example:
  – (BAD) 2D is better than 3D
  – (GOOD) Searching for a graphic item among 100 randomly placed similar items will take longer with a 3D perspective display than with a 2D display.

Controlled conditions
• Purpose: knowing the cause of a difference found in an experiment
  – No difference between conditions except the ideas being studied
• Trade-off between control and generalizable results

Confounding Factors (1)
• Group 1: Visualization A in a room with windows
• Group 2: Visualization B in a room without windows
• What can you conclude if Group 2 performs the task faster?

Confounding Factors (2)
• Participants perform tasks with Visualization A followed by Visualization B
• What can we conclude if task time is faster with Visualization A?

Confounding Factors (3)
• Do people remember information better with 3D or 2D displays?
• Participants randomly assigned to 2D or 3D
• Instructions and experimental conditions the same for all participants
• [Figures: 2D visualization and 3D visualization, from Tavanti and Lind (InfoVis 2001)]
• What are the confounding factors?
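The random assignment mentioned above (participants randomly assigned to 2D or 3D, with everything else held constant) can be sketched in a few lines. This is only an illustrative sketch: the participant IDs and condition names are hypothetical, and the balanced round-robin split is one common way to keep group sizes equal.

```python
import random

def assign_conditions(participants, conditions=("2D", "3D"), seed=None):
    """Balanced random assignment: shuffle participants, then deal them
    round-robin into the condition groups so group sizes stay equal."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

# Hypothetical pool of 12 participants
groups = assign_conditions([f"P{i:02d}" for i in range(1, 13)], seed=42)
```

Balanced assignment avoids the unequal group sizes that independent coin flips can produce with small samples, while the shuffle still ensures that who lands in which group is left to chance.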
What is controlled
• Who gets what condition
  – Subjects randomly assigned to groups
• When & where each condition is given
• How the condition is given
  – Consistent instructions
  – Avoid actions that bias results (e.g., “Here is the system I developed. I think you’ll find it much better than the one you just tried.”)
• Order effects

Order Effects
• Example: search for circles among squares and triangles in Visualizations A and B
1. Randomization
  – E.g., number of distractors: 3, 15, 6, 12, 9, 6, 3, 15, 9, 12…
2. Counter-balancing
  – E.g., half use Vis A first, half use Vis B first

Experimental Designs
• Between-subjects: no order effects (+), but participants cannot compare conditions (–), and many participants are needed
• Within-subjects: order effects possible (–), but participants can compare conditions (+), and few participants are needed

Statistical analysis
• Apply statistical methods to data analysis
  – Confidence limits: the confidence that your conclusion is correct
  – “p = 0.05” means: if there were no true difference between conditions, a difference this large would occur by chance only 5% of the time

Types of statistical tests
• T-tests (compare 2 conditions)
• ANOVA (compare >2 conditions)
• Correlation and regression
• Many others

Snap-Together Vis Experiment
• Are both coordination AND visual overview important in overview + detail displays?
• How would you test this?
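The two order-effect remedies above, randomization and counter-balancing, can be sketched directly. The distractor counts mirror the slide's example; the trial repeats, participant IDs, and visualization names are illustrative assumptions:

```python
import random

def randomized_trials(levels, repeats, seed=None):
    """Randomization: repeat each distractor count several times and
    shuffle the whole sequence, e.g. 3, 15, 6, 12, 9, ..."""
    rng = random.Random(seed)
    trials = list(levels) * repeats
    rng.shuffle(trials)
    return trials

def counterbalanced_orders(participants):
    """Counter-balancing: half the participants see Vis A first,
    the other half see Vis B first."""
    orders = {}
    for i, p in enumerate(participants):
        orders[p] = ("Vis A", "Vis B") if i % 2 == 0 else ("Vis B", "Vis A")
    return orders

trials = randomized_trials([3, 6, 9, 12, 15], repeats=2, seed=1)
orders = counterbalanced_orders([f"P{i}" for i in range(8)])
```

This sketch assigns orders by alternating down the participant list for simplicity; a real study would typically also randomize which participants fall into each half.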
Critique of Snap-Together Vis Experiment
+ Carefully designed to focus on factors of interest
– Limited generalizability: would we get the same result with non-text data? Expert users? Other types of coordination? Complex displays?
– Unexciting hypothesis – we were fairly sure what the answer would be

How should evaluation change?
• Better experimental design
  – Especially more meaningful tasks
• Fewer “compare time on two systems” experiments
• Qualitative methods
• Field studies with real users

Take home messages
• Talk to real users!
• Learn more about HCI!