Querying and Creating Visualizations by Analogy

  1. Querying and Creating Visualizations by Analogy • Carlos E. Scheidegger, Huy T. Vo, David Koop, Juliana Freire, Cláudio T. Silva • SCI Institute, School of Computing, University of Utah

  2. Outline • Provenance reuse • We have all this rich metadata - let’s use it • Query-by-example • Visualization by Analogy • (VisTrails intro) • Transparent provenance tracking www.vistrails.org

  3. Related Work • Visualization Systems and Libraries • AVS, DX, SCIRun, VTK • History tracking and formalisms • Jankun-Kelly et al.’s pset-calculus • Kreuseler et al.’s VDM history • Brodlie et al.’s GRASPARC • VisTrails

  4. Provenance • The “pedigree” of an artifact • Where did it come from? Who held it?

  5. Provenance in VisTrails • Process provenance • How was this visualization created?

  6. Version Tree • Persistent • Transparent • Reuse • Can we do better than just presenting?

  7. Why not query languages?

  8. Why not query languages? • wf{*}: upstream(x) union x where x.module = “SoftMean” and executed(x) and y in upstream(x) and y.module = “AlignWarp” and y.parameter(“model”) = “12”

  9. Why not query languages? • This is still only mildly better than straight SQL... • Does not expose mapping to relational schema • wf{*}: upstream(x) union x where x.module = “SoftMean” and executed(x) and y in upstream(x) and y.module = “AlignWarp” and y.parameter(“model”) = “12”

  10. Query-by-Example • Do not teach the user new forms of interaction!

  11. Visualization by Analogy • Create new visualizations by saying “do as they did” • Specify what, not how

  12. Query-by-Example • Trivially reducible from MAX-CLIQUE • ... and MAX-CLIQUE is NP-Complete • ... and MAX-CLIQUE is fundamentally hard to approximate • Solution: algorithm tailored to problem domain

  13. Query-by-Example • Split every subgraph into topologically sorted layers • OK, since all pipelines are DAGs in VisTrails [Figure: a pipeline split into layers 1, 2, 3]
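
The layer split on slide 13 can be sketched in a few lines. This is a minimal illustration, not the VisTrails implementation: the function name `topological_layers`, the edge-list representation, and the module names in the usage note are all assumptions made for the example. It peels off modules with no remaining incoming connections, one layer at a time.

```python
from collections import defaultdict


def topological_layers(modules, connections):
    """Split a DAG into layers (hypothetical helper, not a VisTrails API).

    Layer 0 holds the modules with no incoming connection; a module lands in
    the layer after the last of its predecessors.
    """
    indegree = {m: 0 for m in modules}
    successors = defaultdict(list)
    for src, dst in connections:
        successors[src].append(dst)
        indegree[dst] += 1

    layer = [m for m in modules if indegree[m] == 0]
    layers = []
    while layer:
        layers.append(sorted(layer))
        next_layer = []
        for m in layer:
            for n in successors[m]:
                indegree[n] -= 1
                if indegree[n] == 0:  # all predecessors already placed
                    next_layer.append(n)
        layer = next_layer

    # Any module left unplaced sits on a cycle, so the input was not a DAG.
    if sum(len(l) for l in layers) != len(modules):
        raise ValueError("pipeline is not a DAG")
    return layers
```

For example, with the made-up modules `["reader", "isosurface", "colormap", "renderer"]`, where `reader` feeds `isosurface` and `colormap` and both feed `renderer`, the function returns three layers: the reader, the two middle modules, and the renderer.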

  14-17. Query-by-Example • Now search for layers that are connected in the same way in the database [Figure, built up over four slides: the query's layers 1, 2, 3 compared against three database pipelines; the first is a match, the second and third are not]

  18-21. Query-by-Example • Might return false positives - it ignores the particular connectivity between topological layers • Not too harmful - most modules cannot connect to one another [Figure, built up over four slides: the query's layers compared against the layers of three database pipelines]
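
The slides do not spell out the layer comparison, so the following is only one possible reading, written to show where false positives come from. It assumes each layer is reduced to the list of module types it contains, and that a query matches when its layers fit into consecutive database layers. The name `layers_match` and the module types in the usage note are hypothetical.

```python
from collections import Counter


def layers_match(query_layers, db_layers):
    """Layer-wise containment test (an assumed simplification).

    Returns True if, at some offset, every query layer's module types are
    contained in the corresponding database layer. Individual connections
    are never inspected, which is why false positives are possible.
    """
    q = [Counter(layer) for layer in query_layers]
    d = [Counter(layer) for layer in db_layers]
    for offset in range(len(d) - len(q) + 1):
        # Counter subtraction keeps only positive counts, so an empty
        # result means the query layer is fully covered.
        if all(not (q[i] - d[offset + i]) for i in range(len(q))):
            return True
    return False
```

A query with layers `[["Reader"], ["Isosurface"], ["Renderer"]]` matches a database pipeline with layers `[["Reader", "Reader"], ["Isosurface", "Colormap"], ["Renderer"]]` even if, in that pipeline, the `Isosurface` module is not actually wired to the `Renderer`; the test only sees which types sit in which layer.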

  22. QBE Demo

  23. Vistrail diffs • A version tree stores a set of actions • Each action is a function on the set of all possible visualizations: $V \to V$ • $a_n \circ a_{n-1} \circ a_{n-2} \circ \cdots \circ a_0$ • We can use those to determine the difference between visualizations • Moving up, then down the version tree

  24. Vistrail diffs • Action to go from A to B is $a_3 \circ a_2 \circ a_0^{-1} \circ a_1^{-1}$ [Figure: version tree in which actions $a_0$, then $a_1$ lead to A, and actions $a_2$, then $a_3$ lead to B]
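
The "up, then down" walk of slides 23 and 24 can be sketched as follows. The tree encoding (a dict from version to a `(parent, action)` pair), the function names, and the intermediate version names `x` and `y` are assumptions for illustration; only the action names come from the slide.

```python
def path_to_root(tree, version):
    """List the versions from `version` up to the root, inclusive."""
    path = [version]
    while tree[version][0] is not None:
        version = tree[version][0]
        path.append(version)
    return path


def diff_actions(tree, a, b):
    """Actions that turn version a into version b, in application order.

    Walk up from a to the closest common ancestor, inverting each action,
    then walk down to b, applying each action.
    """
    up = path_to_root(tree, a)
    down = path_to_root(tree, b)
    on_path_of_a = set(up)
    common = next(v for v in down if v in on_path_of_a)
    undo = [(tree[v][1], "inverse") for v in up[:up.index(common)]]
    redo = [(tree[v][1], "apply") for v in reversed(down[:down.index(common)])]
    return undo + redo


# The tree of slide 24; "x" and "y" are made-up names for the inner versions.
tree = {
    "root": (None, None),
    "x": ("root", "a0"), "A": ("x", "a1"),
    "y": ("root", "a2"), "B": ("y", "a3"),
}
steps = diff_actions(tree, "A", "B")
```

Here `steps` lists the inverse of `a1`, the inverse of `a0`, then `a2` and `a3`. Read right to left as a composition, that is the slide's $a_3 \circ a_2 \circ a_0^{-1} \circ a_1^{-1}$.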

  25. Visualization by Analogy • A diff is a template: reapply it elsewhere • How do we match two pipelines?

  26. Algorithm Overview • Compute the difference $\delta_{ab} = \Delta(p_a, p_b)$ • Compute the map $\mathrm{map}_{ac} = \mathrm{map}(p_a, p_c)$ • Apply $\mathrm{map}_{ac}$ to $\delta_{ab}$: $\delta^{*}_{cb} = \mathrm{map}_{ac}(\delta_{ab})$ • Compute the new pipeline $p_d = \delta^{*}_{cb}(p_c)$
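
The four steps of slide 26 compose as in the toy sketch below. It is deliberately oversimplified: a pipeline is modeled as a bare set of module names (no connections or parameters), the map is given by hand rather than computed, and every function and module name is hypothetical.

```python
def pipeline_diff(p_a, p_b):
    """delta_ab: modules to remove from and add to p_a to reach p_b."""
    return {"remove": p_a - p_b, "add": p_b - p_a}


def map_diff(delta, map_ac):
    """Re-express delta_ab in terms of p_c's modules.

    Removals are translated through map_ac; additions are new modules and
    need no translation in this toy model.
    """
    return {
        "remove": {map_ac[m] for m in delta["remove"] if m in map_ac},
        "add": set(delta["add"]),
    }


def apply_diff(delta, pipeline):
    """Apply a (mapped) diff to a pipeline."""
    return (pipeline - delta["remove"]) | delta["add"]


# Made-up example: p_b replaces the isosurface of p_a with volume rendering.
p_a = {"Reader", "Isosurface", "Renderer"}
p_b = {"Reader", "VolumeRender", "Renderer"}
p_c = {"HTTPReader", "Contour", "Renderer"}
map_ac = {"Reader": "HTTPReader", "Isosurface": "Contour", "Renderer": "Renderer"}

delta_ab = pipeline_diff(p_a, p_b)
delta_cb = map_diff(delta_ab, map_ac)
p_d = apply_diff(delta_cb, p_c)
```

In this example `p_d` keeps `HTTPReader` and `Renderer` from `p_c` and swaps `Contour` for `VolumeRender`, which is the change from `p_a` to `p_b` replayed on `p_c`.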

  27. Visualization by Analogy • Simplest version is again reducible from MAX-CLIQUE • We will now use a probabilistic argument to create a Markov chain

  28. How does it work? • Module compatibility: prior $f : M^2 \to [0, 1]$ • Independent of graph topology • Probability of match between a pair • Dependent on graph topology • Linear combination of probability of match in the neighborhood pairs and data • This is a Markov chain!

  29. How does it work? • Graph product $G$ of the two input graphs • Each vertex in $G$ represents a possible match • Similarity is then defined as $\pi = \alpha A(G)\,\pi + (1 - \alpha)\, c(G) = M_G\, \pi$ • $\pi$ is an eigenvector of $M_G$ • It is the limit distribution of the transition matrix

  30-34. How does it work? [Figure, built up over five slides: the input graphs $G_A$ and $G_B$ and their product $G_A \times G_B$]

  35. How does it work? Each node is assigned some initial value. (It doesn’t matter which, as long as the values sum to one!)

  36. How does it work? [Figure: product-graph nodes carrying the values $p_k(a_0 \to b_0)$, $p_k(a_0 \to b_1)$, $p_k(a_0 \to b_2)$, $p_k(a_0 \to b_3)$, $p_k(a_1 \to b_0)$]

  37-39. How does it work? $p_{k+1}(a_0 \to b_0) = (1 - \alpha)\, c(a_0, b_0) + \frac{\alpha}{3} \bigl( p_k(a_0 \to b_3) + p_k(a_0 \to b_1) + p_k(a_1 \to b_0) \bigr)$ [Figure, built up over three slides: product-graph node $a_0 \to b_0$ with its prior $c(a_0, b_0)$ and its neighbors $a_0 \to b_1$, $a_0 \to b_3$, $a_1 \to b_0$]

  40. How does it work? $p_{k+1}(a_0 \to b_0) = (1 - \alpha)\, c(a_0, b_0) + \frac{\alpha}{3} \bigl( p_k(a_0 \to b_3) + p_k(a_0 \to b_1) + p_k(a_1 \to b_0) \bigr)$ • Do it for all nodes, until convergence
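
The per-node update on slides 37 to 40 can be run as a simple fixed-point iteration. The sketch below assumes the product graph is already available as a neighbor list; the function name, the default `alpha`, the tolerance, and the two-node usage example are all assumptions, since the slides give no concrete values.

```python
def iterate_similarity(neighbors, c, alpha=0.5, tol=1e-12, max_iter=10_000):
    """Iterate p <- (1 - alpha) * c + alpha * (mean of p over neighbors).

    `neighbors` maps each product-graph node to its neighbor nodes and `c`
    maps each node to its prior compatibility. Both are inputs; building
    the product graph is outside this sketch.
    """
    nodes = list(c)
    # Uniform start; the values sum to one, as slide 35 asks.
    p = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(max_iter):
        p_next = {}
        for v in nodes:
            nbrs = neighbors.get(v, [])
            spread = sum(p[u] for u in nbrs) / len(nbrs) if nbrs else 0.0
            p_next[v] = (1.0 - alpha) * c[v] + alpha * spread
        # Stop once no node changed by more than the tolerance.
        if max(abs(p_next[v] - p[v]) for v in nodes) < tol:
            return p_next
        p = p_next
    return p


# Tiny made-up product graph with two mutually adjacent nodes.
nbrs = {("a0", "b0"): [("a1", "b1")], ("a1", "b1"): [("a0", "b0")]}
prior = {("a0", "b0"): 1.0, ("a1", "b1"): 0.0}
scores = iterate_similarity(nbrs, prior, alpha=0.5)
```

For a node with three neighbors the averaging term is exactly the slide's $\alpha/3$ factor. In the two-node example the iteration settles at the solution of $x = 0.5 + 0.5y$, $y = 0.5x$, that is $x = 2/3$ and $y = 1/3$.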

  41. How does it work? • $\pi$ is defined over the graph product • For each module in the second pipeline, pick the maximal value of $\pi$ on the first pipeline: this is the match • Many others possible
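
The selection rule on slide 41 amounts to an argmax per module. The sketch assumes $\pi$ is stored as a dict keyed by `(a, b)` pairs, with `a` from the first pipeline and `b` from the second; that layout, the function name, and the scores below are illustrative assumptions.

```python
def best_matches(pi):
    """For each module b of the second pipeline, pick the first-pipeline
    module a with the highest score pi[(a, b)]."""
    best = {}
    for (a, b), score in pi.items():
        if b not in best or score > best[b][1]:
            best[b] = (a, score)
    return {b: a for b, (a, _) in best.items()}


# Made-up scores over a 2 x 2 product graph.
pi = {
    ("a0", "b0"): 0.4, ("a1", "b0"): 0.1,
    ("a0", "b1"): 0.2, ("a1", "b1"): 0.3,
}
matches = best_matches(pi)
```

With these scores `b0` is matched to `a0` and `b1` to `a1`. Nothing in this rule prevents two modules of the second pipeline from choosing the same partner, which is one reason the slide notes that other selection schemes are possible.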

  42-45. The matching algorithm

  46. Failure Modes • Analogies are not fool-proof

  47. Case study • Creating a complex visualization out of simple ones • (demo)

  48. Discussion • If your system can encode actions as functions on the space of objects of interest, store these explicitly • That will be your “version tree” - everything else is just the same • Easy to incorporate domain-specific knowledge in analogies: change $A(G)$ and $c(G)$

  49. Acknowledgments • Sarang Joshi, Suresh Venkatasubramanian, Erik Anderson, João Comba • VisTrails dev team • Many open source packages and devs: VTK, SciPy, teem, matplotlib • VisTrails is open source! http://www.vistrails.org • Shameless plug: Visit the SCI booth! • NSF, DOE, IBM Faculty Award

  50. Thank you! • Questions?

  51. Too much data • We are better off with visualization systems than without - but it’s still pretty messy

  52. Video
