1 FacetAtlas: Multifaceted Visualization for Rich Text Corpora. Nan Cao, Jimeng Sun, Yu-Ru Lin, David Gotz, Shixia Liu, Huamin Qu. InfoVis 2010
2 Introduction
3 Multiple facets: Symptoms, Treatments, Causes, Tests & Diagnosis, Prognosis, Prevention, Complications
5 Diabetes
6 Type 2 Diabetes, Metabolic Syndrome, Type 1, Gestational Diabetes. How to visualize the relations of multifaceted document contents?
8 Questions raised by the example:
(Q1) How to model the document contents as multifaceted relational data?
(Q2) How to intuitively visualize multifaceted document contents and their relations?
(Q3) How to find insightful patterns visually, driven by users' interests?
9 Solution
• Goal:
– Visualize both the global (clusters) and local (relations) patterns in rich text corpora with multiple facets.
• Approach:
– Multifaceted entity-relational data model
– Intuitive visual encoding and automatic layout
– Interaction driven by users' interests for pattern detection
10 Demo
11 Key Challenges
(Q1) How to model the document contents as multifaceted relational data?
(Q2) How to intuitively visualize multifaceted document contents and their relations?
(Q3) How to find insightful patterns visually, driven by users' interests?
12 (Q1) How to model the document contents into multifaceted relational data?
Pipeline: document set, facet segmentation, entity extraction, entity set, multifaceted entity-relational data model.
Example: internal relations connect entities of the disease facet (type 1 diabetes, type 2 diabetes); external relations connect them to the symptom facet (thirst, blurred vision) and the treatment facet (take medications, blood sugar control).
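The data model on this slide can be sketched as a small data structure. This is only an illustration, not the authors' implementation: the names `Entity`, `FacetModel`, and `relate` are hypothetical, relations are assumed to be undirected, and the specific disease-to-symptom links are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A named entity that belongs to exactly one facet."""
    name: str
    facet: str


@dataclass
class FacetModel:
    """Entities plus two kinds of undirected relations (hypothetical sketch)."""
    entities: set = field(default_factory=set)
    internal: set = field(default_factory=set)  # pairs within one facet
    external: set = field(default_factory=set)  # pairs across facets

    def relate(self, a: Entity, b: Entity) -> str:
        """Store the pair and report which kind of relation it is."""
        self.entities.update((a, b))
        pair = frozenset((a, b))
        if a.facet == b.facet:
            self.internal.add(pair)
            return "internal"
        self.external.add(pair)
        return "external"


model = FacetModel()
t1 = Entity("type 1 diabetes", "disease")
t2 = Entity("type 2 diabetes", "disease")
thirst = Entity("thirst", "symptom")
blurred = Entity("blurred vision", "symptom")

model.relate(t1, t2)       # two diseases: internal relation
model.relate(t1, thirst)   # disease to symptom: external relation
model.relate(t2, thirst)
model.relate(t2, blurred)
```

Storing pairs as `frozenset` makes each relation order-independent, so relating the same two entities twice does not create a duplicate.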
14 (Q2) How to visualize multifaceted document contents and their relations? Stages: data model, encoding, layout
16 Encoding: Multifaceted Entity Relational Model. Multifaceted entities belong to facets (disease, symptoms, treatments). Internal relations connect entities of the same facet (Type 1 Diabetes, Type 2 Diabetes); external relations connect entities of different facets. Entities are grouped by their external relations, and internal relations are grouped accordingly.
20 Encoding: the external relation between the disease facet and the symptom facet, encoded as a facet node
1. Encode external relations by neighborhood
2. Split overlapped entities into multiple replicas (overlapped entities have multiple replicas)
3. Group related entities and their replicas into the facet node
4. Group the related internal linkages in the symptom facet (grouped internal relations)
21 Encoding: symptom facet node and treatment facet node around each disease
1. Similarly, group the treatment entities into the treatment facet node
2. Then encode the data model into visual form
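The grouping steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `build_facet_nodes` and `replica_counts` are hypothetical helper names, and the symptom identifiers `s1` to `s6` are placeholders.

```python
from collections import defaultdict


def build_facet_nodes(external_links):
    """Group secondary-facet entities around each primary entity.

    external_links: iterable of (primary_entity, secondary_entity) pairs.
    Returns {primary_entity: sorted list of secondary entities}. An entity
    linked to several primary entities appears once per primary entity,
    that is, as one replica in each facet node.
    """
    nodes = defaultdict(set)
    for primary, secondary in external_links:
        nodes[primary].add(secondary)
    return {p: sorted(members) for p, members in nodes.items()}


def replica_counts(facet_nodes):
    """Count how many facet nodes hold a replica of each entity."""
    counts = defaultdict(int)
    for members in facet_nodes.values():
        for entity in members:
            counts[entity] += 1
    return dict(counts)


# Hypothetical links between a disease facet and a symptom facet.
links = [
    ("Type 1 Diabetes", "s1"), ("Type 1 Diabetes", "s2"),
    ("Type 1 Diabetes", "s3"), ("Type 1 Diabetes", "s4"),
    ("Type 2 Diabetes", "s3"), ("Type 2 Diabetes", "s4"),
    ("Type 2 Diabetes", "s5"), ("Type 2 Diabetes", "s6"),
]
nodes = build_facet_nodes(links)
overlapped = sorted(e for e, n in replica_counts(nodes).items() if n > 1)
```

Here `s3` and `s4` are linked to both diseases, so each gets one replica per facet node.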
23 Layout: 10,000 entities and 30,000 external relations
24 Layout steps: sampling, entity layout, density estimation, link layout
25 Layout (sampling): Sampling by DOI
Offline: document set, facet segmentation, entity extraction (disease, symptom, treatment), build indices
Online: query, related samples
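The offline/online split can be sketched as an index that is built once and then queried. The slide does not say how the indices or the DOI score are defined, so everything here is an assumption: `build_index` and `query` are hypothetical names, the index is a plain word-to-entity map, and the number of matched keywords stands in for a DOI score.

```python
from collections import defaultdict


def build_index(entities):
    """Offline step: map each lowercase word to the entity names containing it."""
    index = defaultdict(set)
    for name in entities:
        for word in name.lower().split():
            index[word].add(name)
    return index


def query(index, keywords, limit=10):
    """Online step: return up to `limit` entities, most matched keywords first.

    The match count is only a stand-in for a degree-of-interest score.
    Ties are broken alphabetically so the result is deterministic.
    """
    scores = defaultdict(int)
    for word in keywords.lower().split():
        for name in index.get(word, ()):
            scores[name] += 1
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return ranked[:limit]


index = build_index(
    ["type 1 diabetes", "type 2 diabetes", "gestational diabetes", "fever"]
)
samples = query(index, "type diabetes", limit=2)
```

Capping the result with `limit` is what keeps the online step cheap: only the sampled entities need to be laid out.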
26 Layout (entity layout): Stabilized Layout
Based on the hidden internal relations of the primary facet; keeps users' mental map while the data changes.
$$\min_X \; \sum_{i<j} \frac{1}{d_{ij}^2}\left(\lVert X_i - X_j \rVert - d_{ij}\right)^2 + \sum_i \lVert X_i - \mathrm{pre}(X_i) \rVert^2$$
Effects: related entities cluster together; the layout changes more smoothly.
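The objective on this slide appears to combine a distance-preserving stress term with a penalty for moving entities away from their previous positions. A minimal sketch under that assumption follows; the `weight` parameter and the plain gradient-descent solver are additions for illustration, not something the slide specifies, and the function names are hypothetical.

```python
import math


def stabilized_stress(pos, prev, dist, weight=1.0):
    """Stress of `pos` plus a penalty for drifting away from `prev`.

    pos, prev: lists of (x, y) points; dist[i][j]: target distance (> 0).
    """
    n = len(pos)
    stress = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            gap = math.dist(pos[i], pos[j]) - dist[i][j]
            stress += gap * gap / dist[i][j] ** 2
    drift = sum(math.dist(p, q) ** 2 for p, q in zip(pos, prev))
    return stress + weight * drift


def descend(pos, prev, dist, weight=1.0, step=0.05, iters=200):
    """Plain gradient descent on stabilized_stress (a simple stand-in solver)."""
    pos = [list(p) for p in pos]
    n = len(pos)
    for _ in range(iters):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                cur = math.dist(pos[i], pos[j]) or 1e-9  # avoid dividing by 0
                coeff = 2 * (cur - dist[i][j]) / (dist[i][j] ** 2 * cur)
                for k in range(2):
                    g = coeff * (pos[i][k] - pos[j][k])
                    grad[i][k] += g
                    grad[j][k] -= g
            for k in range(2):
                # Gradient of the stabilization term for point i.
                grad[i][k] += 2 * weight * (pos[i][k] - prev[i][k])
        for i in range(n):
            for k in range(2):
                pos[i][k] -= step * grad[i][k]
    return [tuple(p) for p in pos]


previous = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
targets = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
updated = descend(previous, previous, targets)
```

A larger `weight` keeps points closer to their previous positions at the cost of matching the target distances less well, which is the trade-off a stabilized layout has to make.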
27 Layout (density estimation): Cluster Layout. Kernel Density Estimation, RNN
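Kernel density estimation over entity positions can be sketched as below. This is a generic Gaussian-kernel version, not necessarily the variant used in FacetAtlas; the function name and the fixed `bandwidth` parameter are assumptions.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a function estimating density at (x, y) with Gaussian kernels."""
    # Normalization so that the estimate integrates to 1 over the plane.
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Three nearby points form a cluster; one point lies far away.
density = gaussian_kde_2d([(0, 0), (0.1, 0), (0, 0.1), (5, 5)], bandwidth=0.5)
```

Evaluating `density` on a grid and drawing contours at a chosen level is one common way to turn such an estimate into cluster regions.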
28 Layout (link layout): Link Layout (1). Layout of external relations: rotating, swapping
29 Layout (link layout): Link Layout (2). Graph partition, edge bundling
30 Fever
31 Diabetes
32 HIV
33 HIV Where are our patterns? What can we find ?
34 Key Challenges
(Q1) How to model the document contents as multifaceted relational data?
(Q2) How to visualize multifaceted information to reveal both global and local patterns?
(Q3) How to find insightful patterns visually, driven by users' interests?
35 (Q3) How to find insights via user interactions? A set of interactions is designed to address users' interests: keyword query, context switch (for example, between the symptom view and the disease view), filtering, and highlighting.
36 Visual Patterns (example: symptoms of HIV)
• Global cluster patterns
• Local multifaceted relational patterns
– Co-occurrence pattern
– Outlier pattern
Figure labels: Headache, Fatigue, Fever, Shortness of Breath; Outlier, Co-occurrence
37 (Q3) Interviews with domain experts. What did the domain experts (3 physicians) say?
“enhance the current thought process of physicians, and help create the subtle associations between different concepts.”
“this will be very helpful for nurses who run the self-care education activities to better engage patients.”
“this tool has great potential as an education tool for interns and residents who have just started their medical career”
“extremely creative and has great potential for clinical therapeutic usage and diagnosis decision support”
38 Summary
• Problem: How to visualize relations of multifaceted document contents? Global / local patterns
• Approach:
• Result:
40 Related Work
• Visualizing Global Content Patterns: S. Havre, et al. (InfoVis 2000); Tag Cloud; H. Strobelt, et al. (InfoVis 09)
• Visualizing Local Relational Patterns: F. van Ham, et al. (InfoVis 2009); M. W. Christopher, et al. (Vast 2009); A. Pere, et al. (InfoVis 2006)
• Search Interface: F. van Ham, et al. (InfoVis 2009); G. Smith (TVCG 2006); Grokker
Our Focus: Extract complex relations from document contents by considering different aspects
42 Evaluations
43 User study
• Participants
– 3 domain experts (2 physicians with 30 years of experience in the healthcare domain, and 1 young medical professional)
– 20 general users without a medical background (2 groups of 10)
• 6 study tasks based on the Google Health online documents
– T4: identify the facet with the most cross-cluster connections.
– T6: identify the facet with the most overall connections across entities.
• Baseline
– Enhanced traditional graph visualization
– Built on the same framework, with similar interactions, on the same dataset
44 Evaluation: results from non-experts
Measures: completion time, task success rate, surveys
Results (based on a two-tailed t-test)
• Significant efficiency improvement in
– Visualizing the clusters
– Showing an overview of multiple connections across clusters
– Representing the details of multifaceted connections between entities
• Slight improvement in
– Finding the most connected facet within a cluster