Machine Learning for Precision Public Health: Visualizing Data for Analysis and Communication Anamaria Crisan Vanier Canada Scholar & UBC Public Scholar PhD Candidate, Computer Science University of British Columbia @amcrisan acrisan@cs.ubc.ca http://cs.ubc.ca/~acrisan
Master of Science (Bioinformatics) and PhD (Computer Science). Timeline markers: 2008, 2010, 2013, 2015. GenomeDX Biosciences; British Columbia Centre for Disease Control; PhD Candidate, Computer Science, University of British Columbia
What we’ll talk about
Why should we visualize data? How should we visualize data? What datavis tools are available?
Why should we visualize data?
Translating Numbers to Words It is not always easy to reason consistently with numbers http://bit.ly/1FxtT2z
Data Visualization is a Powerful Medium. From least to most understandable: Probability (60%) < Frequency (6 in 10) < Visualization. Whiting (2015) “How well do health professionals interpret diagnostic information? A systematic review”
Role of data visualization in the current paradigm = Communication of scientific research
Flowchart (built up over several slides; source: https://www.ratbotcomics.com/comics/pgrc_2014/1/1.html): Do you have a research problem? (Duh. / No. But eventually you'll have a problem, right? Yes.) → Do all the Science! → Inform the public! → Maybe data visualization? (Infographics are pretty) → Did it work? → No :( → Different infographics? / Yes! (maybe?) → Declare victory
Limitation #1 : Missed Opportunity in Exploration. Do all the Science! → Missed opportunity for exploration (Data Visualization!) → Inform the public! § Exploration is looking at your data, trying different analysis methods, assessing whether there are outliers or missing data, etc.
Limitation #1 : Missed Opportunity in Exploration Same stats, different graphs Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats
Limitation #1 : Missed Opportunity in Exploration Same stats, different graphs (Datasaurus) Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats
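The "same stats, different graphs" point can be checked numerically. The sketch below does not use the Datasaurus data from the slide; it swaps in Anscombe's quartet, a smaller classic dataset that makes the same argument. The helper names (`pearson`, `summarize`, `SUMMARIES`) are illustrative and not from the talk.

```python
import math
from statistics import mean, variance

# Anscombe's quartet (1973): four small datasets with near-identical summaries.
# Assumption: used here as a stand-in for the Datasaurus Dozen on the slide.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics that a table would typically report."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four summaries agree to about two decimal places, yet scatter plots of the four datasets look completely different, which is why plotting belongs in the exploration step.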
Limitation #1 : Missed Opportunity in Exploration Opening up the machine learning black box
Limitation #1 : Missed Opportunity in Exploration Chihuahua or muffin? Mop or sheepdog?
Limitation #1 : Missed Opportunity in Exploration Goodfellow (2014). “Explaining and Harnessing Adversarial Examples”
Example : Trying to understand the black box Made with : JavaScript Olah (2018). “Building blocks of interpretability” (https://distill.pub/2018/building-blocks/)
Health data are complex to analyze and visualize
Limitation #2 : Identifying the Appropriate Vis § Selecting the appropriate data visualization is challenging. True for exploration & communication applications. Data Visualization!
Visualization Design ALSO matters
Example: Communicating Survival Benefit of Cancer Therapy Baseline Visualization Alternative 1 Alternative 2 Zikmund-Fisher (2013). A demonstration of “less can be more” in risk graphics.
Example: Visualizing Arteries of the Heart for Surgery Planning Made with : Processing Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis”
Example: Visualizing Arteries of the Heart for Surgery Planning. EXISTING STANDARD, Accuracy: 39%. REVISED VISUALIZATION, Accuracy: 91%. Made with : Processing Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis”
There are two aspects of visualizations to think about: Is it the appropriate visualization? (How should we visualize data?) How do you make a visualization? (What datavis tools are available?)
How should we visualize data?
Cross Cutting Disciplines in Information Visualization Human Perception & Cognition Computer Graphics Data Analysis Visualization Design & Analysis
Encoding and Decoding Information R. Kosara (EagerEyes) – https://eagereyes.org/basics/encoding-vs-decoding
Putting it all Together for Visualization Design & Analysis § Non-trivial to condense knowledge across all these areas § Still an ongoing area of research § I will try to convey a simpler intuition about design & analysis
Guiding Principles for Visualizing your Data Image Source: Valentin Antonucci via Pexels
Breaking Down a Visualization in Three Questions Why? (Motivation) Why do you need to visualize data? How will you, or others, use the visualization?
Breaking Down a Visualization in Three Questions Why? (Motivation) Why do you need to visualize data? How will you, or others, use the visualization? What? (Data & Tasks) What kind of data is being visualized? What tasks are performed with the data?
Breaking Down a Visualization in Three Questions Why? (Motivation) Why do you need to visualize data? How will you, or others, use the visualization? What? (Data & Tasks) What kind of data is being visualized? What tasks are performed with the data? How? (Visual & Interactive Design) How do you make the visualization? Is it the right visualization? (People tend to jump to this level and ignore why and what.)
Design & Evaluation with Three Questions (Design / Evaluation) Why? Does the visualization address the intended need? What? Are you using the right data, or deriving the right data? Does the visualization support the tasks using that data? How? Are the visual & interactive choices appropriate for the data and tasks? If interactive / computer based, is the visualization easy to use and reliable (i.e., doesn’t crash all the time)?
Ideas from the research literature : the nested model. Why? What? How? (Design / Evaluation) T. Munzner (2014) – Visualization Design and Analysis
Steps to Systematic Thinking in Data Visualization Image Source: Valentin Antonucci via Pexels
Thinking Systematically about Data Visualization Infovis (Information Visualization) research advocates an iterative process. Design: Domain Problem* → Data + Task → Visual + Interaction Design Choices → Algorithm. Evaluation. *Domain Problem = Motivation T. Munzner (2014) – Visualization Design and Analysis
An Iterative Process An iterative approach to development allows us to get feedback before committing to ineffective design choices
Thinking Systematically about Data Visualization. Domain Problem → Data + Task → Visual + Interaction Design Choices → Algorithm. 1. Identify a relevant problem that affects you or a group of stakeholders T. Munzner (2014) – Visualization Design and Analysis
Public Health Stakeholders § Multidisciplinary decision making teams § More data & diverse data types = more informed decision making § BUT – different stakeholder abilities to interpret data & different needs. Stakeholders: Medical Health Officers, Community Leaders, Clinicians, Policy Makers, Nurses, Researchers, Patients
Thinking Systematically about Data Visualization. Domain Problem → Data + Task → Visual + Interaction Design Choices → Algorithm. 2. Ask what data stakeholders use (is it available?) 3. Ask what stakeholders do with the data [tasks] T. Munzner (2014) – Visualization Design and Analysis
Data - Many Different Types of Data! T. Munzner (2014) – Visualization Design and Analysis
Data - Don’t Just Visualize the Raw Data! Example: Original (Raw) Data → Derived Data (T. Munzner (2014) – Visualization Design and Analysis). Example when this advice is ignored: XKCD
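One way to read the raw-versus-derived advice is sketched below, assuming a public health setting in which raw case counts are turned into a rate per 100,000 residents before anything is drawn. The regions, counts, and populations are hypothetical values made up for illustration, and `Region` and `rate_per_100k` are names introduced here, not from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    cases: int        # raw data: number of reported cases
    population: int   # raw data: number of residents


def rate_per_100k(region: Region) -> float:
    """Derived data: cases per 100,000 residents."""
    return region.cases / region.population * 100_000


# Hypothetical values, chosen only to show how the ranking can flip.
REGIONS = [
    Region("Region A", cases=1200, population=2_400_000),
    Region("Region B", cases=90, population=60_000),
    Region("Region C", cases=300, population=1_000_000),
]

# Ranking by the raw count mostly reflects where people live.
by_raw_count = sorted(REGIONS, key=lambda r: r.cases, reverse=True)

# Ranking by the derived rate reflects risk relative to population size.
by_rate = sorted(REGIONS, key=rate_per_100k, reverse=True)

if __name__ == "__main__":
    print("By raw count:", [r.name for r in by_raw_count])
    print("By rate per 100k:", [(r.name, round(rate_per_100k(r), 1)) for r in by_rate])
```

In this made-up example the smallest region has the fewest cases but the highest rate, so a map of raw counts and a map of rates would direct attention to different places.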
Tasks - How People Use the Data. Geographic Overview of Prostate Cancer: § Useful for epidemiologists and policy makers § Supports surveillance tasks (Source : Atlanta CDC). Individual Prostate Cancer Risk: § Good for patients and doctors § Supports treatment decision making tasks (Source : http://riskcalc.org/PCPTRC/ (UT San Antonio))
Tasks - How People Use the Data • Tasks can also change how the same data should be visualized • Example: representing US Electoral College results (Standard Map vs. Cartogram)