
Visual Analytics and Information Retrieval, Giuseppe Santucci


  1. Visual Analytics and Information Retrieval
     Giuseppe Santucci
     Dipartimento di Informatica e Sistemistica, Sapienza Università di Roma
     santucci@dis.uniroma1.it

  2. Who am I? (University of Rome is so big…)
     • VisDis and the Database & User Interface groups are two tightly connected research groups at the Department of Computer and System Science (32 full professors, 19 associate professors, and 13 assistant professors) of the Rome Faculty of Engineering & ICT
     • The background of the VisDis and Database/Interface groups covers:
       – Visual Information Access
       – Data quality
       – Data integration
       – User Centered Design
       – Usability and Accessibility
       – Infovis evaluation
       – Visual quality metrics
       – Visual Analytics
         • Data sampling
         • Density map optimization
       – Information Retrieval (& VA)
     Fire 2012, Kolkata | VA & IR - Giuseppe Santucci | 19 December 2012

  3. Outline
     • Information Visualization
       – Definitions
       – Main issues
         • Data overloading
       – Visual Analytics
       – Visual Analytics challenges
     • One methodological example
     • VA and Information Retrieval
     • Demo

  4. Information Visualization?
     • Old stuff…

  5. Visualization for Problem Solving
     • Mystery: what was causing the cholera epidemic in London in 1854?

  6. Visualization for Problem Solving
     Illustration by Dr. John Snow (1854)
     Dots indicate the locations of deaths
     X marks indicate the locations of water pumps
     [From Visual Explanations by Edward Tufte, Graphics Press, 1997]

  7. Visualization for Problem Solving
     Dr. Snow deduced that the cholera epidemic was caused by a contaminated water pump!
     Closing that pump quickly solved the problem
     B.T.W., workers at the nearby brewery were noted to be relatively free of cholera…
     [Photo: the actual John Snow pub in London, close to the water pump]

  8. Visualization for Explaining
     What happened during Napoleon’s Russian campaign?

  9. Charles Joseph Minard’s map (1861)

  10. Visualization for Making Decisions
      Traveling in London by underground
      How can I get to Queen’s Park from Victoria station?

  11. London Underground Map 1927

  12. Harry Beck’s idea
      • Real position (when traveling underground) does not matter
      • Only station sequences matter, together with their connections
      • Beck proposed a “distorted” map
      • Today, all the underground maps in the world follow Beck’s approach
      • He received only a small payment (London Underground was not sure about the idea)
      • Still true right now: infovis people do not become rich…
      • That likely holds for VA and IR as well :-)

  13. London Underground Map 1990s

  14. Moving to the present time
      • What is modern Information Visualization?
      • First of all, what is Visualization?
      • Visualize: to form a mental model or mental image of something
      • It is a cognitive activity and it has nothing to do with computers

  15. What is Information Visualization?
      Information visualization is the use of computer-supported, interactive, visual representations of abstract data to amplify cognition. [Card et al. ’99]

  16. Information visualization!
      1. Infovis is perfect for exploration, when we don’t know exactly what to look at. It supports vague goals
      2. Infovis is perfect to explain complex data and to support decisions
      • Other approaches to data analysis
        – Statistics: strong verification, but does not support exploration and vague goals
        – Data mining: actionable and reliable, but a black box, not interactive, question-response style
        – Visual Analytics (formerly Visual Data Mining) is trying to join the two worlds

  17. …computer-supported and interactive
      • Computer-supported
        – Yes, we use computers, but we must always remember that a cognitive activity is involved in the process
      • Interactive
        – To exploit the full power of Infovis techniques, interaction is mandatory

  18. Interaction example
      • Agronomists are experimenting with 7 treatments (anti-parasite, fertilizer, etc.) on 10 different crops (corn, tomatoes, etc.)
      • A black square indicates success
      • Does this visualization help?
      [Figure: matrix with crops 1 to 10 as rows and treatments A to G as columns]

  19. Interaction example
      • Let’s rearrange the rows
      [Figure: the same matrix before and after rearranging. Before: crops 1 to 10, treatments A B C D E F G. After: crops 1, 3, 8, 2, 6, 10, 4, 7, 9, 5, treatments A D C E G B F]
      (10! possible row orderings, VA can help…)
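
The rearrangement shown on slide 19 can be automated in several ways. What follows is a minimal Python sketch of one common heuristic, barycenter ordering, and is not necessarily the method used for the slide. The cell values of the slide's matrix are not available, so the 4×4 `MATRIX` and the function names (`barycenter`, `reorder`, `apply_order`) are hypothetical.

```python
from math import factorial


def barycenter(cells):
    """Mean position of the filled (truthy) cells; empty lines sort last."""
    filled = [i for i, cell in enumerate(cells) if cell]
    return sum(filled) / len(filled) if filled else float("inf")


def reorder(matrix, rounds=5):
    """Return (row_order, col_order) after alternating barycenter sorts."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(rounds):
        # Sort rows by where their filled cells sit in the current column order,
        # then sort columns by where their filled cells sit in the new row order.
        rows = sorted(rows, key=lambda r: barycenter([matrix[r][c] for c in cols]))
        cols = sorted(cols, key=lambda c: barycenter([matrix[r][c] for r in rows]))
    return rows, cols


def apply_order(matrix, rows, cols):
    """Build the matrix as it looks with the given row and column orders."""
    return [[matrix[r][c] for c in cols] for r in rows]


# Hypothetical data: 1 = success (black square), 0 = failure.
MATRIX = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

if __name__ == "__main__":
    rows, cols = reorder(MATRIX)
    for line in apply_order(MATRIX, rows, cols):
        print("".join("#" if cell else "." for cell in line))
    # Size of the search space a brute-force approach would face for 10 crops.
    print("row orderings for 10 crops:", factorial(10))
```

On this toy input the rows and columns that share successes end up adjacent, forming two blocks. A heuristic like this only samples a tiny part of the 10! row orderings, which is why interactive or analytic support is useful.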

  20. …it is about abstract data
      • Abstract data
        – Information visualization deals with images that do not refer to a physical situation. In other words, it is NOT scientific visualization/geographic visualization
      • Scientific visualization primarily relates to and represents something physical or geometric
      • Examples
        – Air flow over a wing
        – Weather over USA
        – Torrents inside a tornado
        – Organs in the human body
        – Molecular bonding…

  21. Scientific/geographic visualization
      Earthquake intensity

  22. …abstract data
      • Items that do not have a direct physical/visual correspondence
      • Examples: sport statistics, stock trends, query results, software data, IR metrics, etc.
      • Items are represented in a 2D/3D physical space using their numerical characteristics (attributes)
      • The visualization is useful for analysis and decision-making (not just for fun or colors)
      • E.g.: Postal parcels
        – Shipping date
        – Volume
        – Weight
        – Sender country
        – Receiver country
        – …

  23. Abstract data
      A 2D scatterplot showing about 200,000 postal parcels
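
To make the parcel example concrete, here is a minimal Python sketch of how abstract items could be placed in a 2D space using two of their numerical attributes. The `Parcel` class, the choice of weight and volume as axes, the units, and the 800×600 canvas are all assumptions for illustration; the slides do not say which attributes the scatterplot uses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Parcel:
    """One abstract data item; field names follow the slide, units are assumed."""
    shipping_date: date
    volume: float  # assumed unit: litres
    weight: float  # assumed unit: kilograms
    sender_country: str
    receiver_country: str


def scale(value, lo, hi, size):
    """Linearly map value from [lo, hi] onto [0, size]."""
    if hi == lo:
        return size / 2  # all items share the value: place them mid-axis
    return (value - lo) / (hi - lo) * size


def to_points(parcels, width=800, height=600):
    """Map weight to x and volume to y (an arbitrary choice for this sketch)."""
    weights = [p.weight for p in parcels]
    volumes = [p.volume for p in parcels]
    return [
        (scale(p.weight, min(weights), max(weights), width),
         scale(p.volume, min(volumes), max(volumes), height))
        for p in parcels
    ]


# Hypothetical sample data.
PARCELS = [
    Parcel(date(2012, 12, 1), 10.0, 1.0, "IT", "IN"),
    Parcel(date(2012, 12, 2), 20.0, 3.0, "IN", "IT"),
    Parcel(date(2012, 12, 3), 30.0, 5.0, "IT", "FR"),
]

if __name__ == "__main__":
    for point in to_points(PARCELS):
        print(point)
```

The remaining attributes (dates, countries) would need other visual channels such as color or shape, which this sketch leaves out.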

  24. Mixed visualization Byte traffic into the ANS/NSFNET T3 backbone in 1993

  25. Amplify cognition using human vision
      • Highest bandwidth human sense
      • Fast, parallel
      • Pattern recognition
      • Extends memory and cognitive capacity
      • People think visually (“I see…” also means “I understand” in many languages)
      • Amplify cognition
      • Pre-attentive (we use only the eyes, not the brain)
      • Two quick examples (4 seconds each)

  26. Three simple questions

  27. The quick answers

  28. One (very) simple question
      • How many 3s are here?
      • You have 4 seconds…
      458757626808609928083982698028
      747976296262867897187743671947
      746588786758967329667287682085

  29. So?
      • Was the time not enough?
      • You can do that in less than 0.2 seconds!
      • Let’s try a different visualization…

  30. • Color is pre-attentive (it pops out)
      • No cognitive effort is required
      • Many of these issues are already well understood
      • Most people ignore them...
      • It is not enough to use bells and whistles
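
The "different visualization" of the digit rows from slide 28 can be approximated in a terminal. The Python sketch below colors every 3 in red using ANSI escape codes, assuming a terminal that supports them; the function names `highlight` and `count_target` are made up for this example, and the slide itself presumably used a colored image rather than code.

```python
# Digit rows as they appear on slide 28.
ROWS = [
    "458757626808609928083982698028",
    "747976296262867897187743671947",
    "746588786758967329667287682085",
]

RED = "\033[1;31m"  # ANSI escape: bold red foreground
RESET = "\033[0m"   # ANSI escape: back to default style


def highlight(text, target="3"):
    """Wrap every occurrence of target in red so that it pops out."""
    return text.replace(target, f"{RED}{target}{RESET}")


def count_target(rows, target="3"):
    """Count occurrences of target across all rows (the slow, serial way)."""
    return sum(row.count(target) for row in rows)


if __name__ == "__main__":
    for row in ROWS:
        print(highlight(row))
    print("number of 3s:", count_target(ROWS))
```

With the color applied, the count can be read off at a glance instead of scanning digit by digit.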

  31. Canonical steps in Infovis – STEP 1
      [Diagram: DATA (Mathematics, Sport, Physics, Chemistry, Literature, History, Art, Geography) → Internal Representation]
      • Encoding of values: Univariate data, Bivariate data, Trivariate data, Multidimensional data
      • Encoding of relationships: Temporal data, Map & Diagrams, Graphs/Trees, Data streams

  32. Canonical steps in Infovis – STEP 2
      [Diagram: Internal Representation → Presentation]
      • Space limitations: Scrolling, Overview + details, Distortion, Suppression, Zoom & pan, Semantic zoom
      • Time limitation
      • Perceptual issues
      • Cognitive issues

  33. Problem solved!
      We have (∼) agreed and (∼) mature solutions for the Representation and Presentation of a large variety of data
      So I’m done! Questions?

  34. Data size and complexity!
      • 100 million FedEx transactions per day
      • 150 million VISA credit card transactions per day
      • 300 million long distance AT&T calls per day
      • 50 billion e-mails per day
      • 600 billion IP packets per day
      • 1 trillion (10^12) web pages (according to Google), corresponding to about 3 petabytes of data
      • Google processes 20 petabytes of data per day
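
To get a feel for these daily volumes, they can be converted to per-second rates. The Python sketch below does only that arithmetic on the event counts quoted on the slide; the dictionary name and the helper `per_second` are hypothetical, and a uniform load over the day is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily event counts as quoted on slide 34.
DAILY_VOLUMES = {
    "FedEx transactions": 100_000_000,
    "VISA credit card transactions": 150_000_000,
    "AT&T long distance calls": 300_000_000,
    "e-mails": 50_000_000_000,
    "IP packets": 600_000_000_000,
}


def per_second(per_day):
    """Average rate per second, assuming the load is spread evenly over a day."""
    return per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for name, per_day in DAILY_VOLUMES.items():
        print(f"{name}: about {per_second(per_day):,.0f} per second")
```

Real traffic is bursty, so peak rates would be higher than these averages.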
