Querying and Creating Visualizations by Analogy Carlos E. Scheidegger, Huy T. Vo, David Koop, Juliana Freire, Cláudio T. Silva SCI Institute, School of Computing University of Utah

Outline • Provenance reuse • We have all this rich metadata - let’s use it • Query-by-example • Visualization by Analogy • (VisTrails intro) • Transparent provenance tracking www.vistrails.org

Related Work • Visualization Systems and Libraries • AVS, DX, SCIRun, VTK • History tracking and formalisms • Jankun-Kelly et al’s pset-calculus • Kreuseler et al, VDM history • Brodlie et al’s GRASPARC • VisTrails

Provenance • The “pedigree” of an artifact • Where did it come from? Who held it?

Provenance in VisTrails • Process provenance • How was this visualization created?

Version Tree • Persistent • Transparent • Reuse • Can we do better than just presenting?

Why not query languages?
wf{*}: upstream(x) union x where x.module = “SoftMean” and executed(x) and y in upstream(x) and y.module = “AlignWarp” and y.parameter(“model”) = “12”
• This is still only mildly better than straight SQL... • Does not expose mapping to relational schema

Query-by-Example • Do not teach the user new forms of interaction!

Visualization by Analogy • Create new visualizations by saying “do as they did” • Specify what, not how

Query-by-Example • Trivially reducible from MAX-CLIQUE • ... and MAX-CLIQUE is NP-Complete • ... and MAX-CLIQUE is fundamentally hard to approximate • Solution: algorithm tailored to problem domain

Query-by-Example • Split every subgraph into topologically sorted layers • OK, since all pipelines are DAGs in VisTrails [Figure: a subgraph split into layers 1, 2, 3]
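The layering step can be sketched as follows. This is a minimal illustration, not the VisTrails implementation: it assumes a pipeline is given as a list of node names plus a list of directed `(source, target)` edges, and the function name `topological_layers` is made up for this example. A node lands in the first layer after all of its predecessors.

```python
from collections import defaultdict

def topological_layers(edges, nodes):
    """Group DAG nodes into layers; each node sits one layer below its deepest predecessor."""
    succs = defaultdict(set)
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        if dst not in succs[src]:
            succs[src].add(dst)
            indegree[dst] += 1

    # Layer 1: nodes with no incoming edges.
    current = sorted(n for n in nodes if indegree[n] == 0)
    layers, seen = [], 0
    while current:
        layers.append(current)
        seen += len(current)
        nxt = []
        for n in current:
            for m in succs[n]:
                indegree[m] -= 1
                if indegree[m] == 0:  # all predecessors already placed
                    nxt.append(m)
        current = sorted(nxt)

    if seen != len(nodes):
        raise ValueError("graph has a cycle; pipelines are expected to be DAGs")
    return layers

# Hypothetical diamond-shaped pipeline: a feeds b and c, which both feed d.
example_layers = topological_layers(
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
    ["a", "b", "c", "d"],
)
```

The cycle check is there only because the layering is undefined for cyclic graphs; the slide notes that VisTrails pipelines are always DAGs.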

Query-by-Example • Now search for layers that are connected in the same way in the database [Figure: the query's layers 1-3 compared against three database pipelines; the first is a match, the other two are not]

Query-by-Example • Might return false positives - it ignores the particular connectivity between topological layers • Not too harmful - most modules cannot connect to one another [Figure: query layers compared against the layers of several database pipelines]
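One possible reading of the layer-based search, as a sketch only. The slides do not spell out the matching rule, so the rule used here is an assumption: each layer is a list of module type names, and a query matches wherever consecutive database layers contain at least the module types of the corresponding query layers. The names `layer_contains` and `layered_match` and the module names are hypothetical.

```python
from collections import Counter

def layer_contains(db_layer, query_layer):
    """True if db_layer holds at least as many modules of each type as query_layer."""
    need = Counter(query_layer)
    have = Counter(db_layer)
    return all(have[t] >= n for t, n in need.items())

def layered_match(query_layers, db_layers):
    """Start indices where the query layers line up with consecutive database layers."""
    span = len(query_layers)
    hits = []
    for start in range(len(db_layers) - span + 1):
        window = db_layers[start:start + span]
        if all(layer_contains(d, q) for d, q in zip(window, query_layers)):
            hits.append(start)
    return hits

# Hypothetical module names, for illustration only.
query = [["Reader"], ["Contour"], ["Mapper"]]
db_hit = [["Reader"], ["Contour", "Outline"], ["Mapper", "Mapper"], ["Renderer"]]
db_miss = [["Reader"], ["Outline"], ["Mapper"]]
hit_positions = layered_match(query, db_hit)
miss_positions = layered_match(query, db_miss)
```

Because only layer contents are compared, a database pipeline whose `Contour` module happens to sit in the second layer without being wired to the `Reader` would still be reported. That is the false-positive case the slide describes.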

QBE Demo

Vistrail diffs • A version tree stores a set of actions • Each action is a function on the set of all possible visualizations: $V \to V$ • $a_n \circ a_{n-1} \circ a_{n-2} \circ \cdots \circ a_0$ • We can use those to determine the difference between visualizations • Moving up, then down the version tree

Vistrail diffs • Action to go from A to B is $a_3 \circ a_2 \circ a_0^{-1} \circ a_1^{-1}$ [Figure: version tree with actions $a_0$, $a_1$, $a_2$, $a_3$ leading to versions A and B]
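A minimal sketch of "moving up, then down the version tree". It assumes a toy model in which a pipeline is a set of module names and an action either adds or deletes one module; real VisTrails actions are richer than this. The tree layout assumes A is reached by $a_0$ then $a_1$ and B by $a_2$ then $a_3$. All function and module names are hypothetical.

```python
def invert(action):
    """Inverse of an add/delete action."""
    kind, module = action
    return ("delete" if kind == "add" else "add", module)

def apply_action(pipeline, action):
    """Apply one action to a pipeline (a frozenset of module names)."""
    kind, module = action
    return pipeline | {module} if kind == "add" else pipeline - {module}

def path_to_root(tree, version):
    """Versions from `version` up to the root, inclusive."""
    path = [version]
    while tree[path[-1]] is not None:
        path.append(tree[path[-1]][0])
    return path

def diff(tree, a, b):
    """Actions, in application order, that turn version a into version b."""
    up = path_to_root(tree, a)
    down = path_to_root(tree, b)
    ancestors_of_b = set(down)
    common = next(v for v in up if v in ancestors_of_b)
    # Undo actions from a up to the common ancestor, then replay down to b.
    undo = [invert(tree[v][1]) for v in up[:up.index(common)]]
    redo = [tree[v][1] for v in reversed(down[:down.index(common)])]
    return undo + redo

# Hypothetical version tree: version -> (parent, action); the root maps to None.
tree = {
    "root": None,
    "v1": ("root", ("add", "FileReader")),     # a_0
    "A": ("v1", ("add", "Isosurface")),        # a_1
    "v2": ("root", ("add", "SphereSource")),   # a_2
    "B": ("v2", ("add", "VolumeRender")),      # a_3
}
steps = diff(tree, "A", "B")
```

Here `steps` lists the inverse of $a_1$, the inverse of $a_0$, then $a_2$ and $a_3$, which is the composition $a_3 \circ a_2 \circ a_0^{-1} \circ a_1^{-1}$ read right to left.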

Visualization by Analogy • A diff is a template: reapply it elsewhere • How do we match two pipelines?

Algorithm Overview • Compute the difference $\delta_{ab} = \Delta(p_a, p_b)$ • Compute the map $\mathrm{map}_{ac} = \mathrm{map}(p_a, p_c)$ • Apply $\mathrm{map}_{ac}$ to $\delta_{ab}$: $\delta'_{cb} = \mathrm{map}_{ac}(\delta_{ab})$ • Compute the new pipeline $p_d = \delta'_{cb}(p_c)$
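The last two steps can be sketched in a toy setting. Assumptions, none of which come from the slides: a pipeline is a set of module names, the difference is already available as a list of add/delete actions, and the map is a plain dictionary from modules of $p_a$ to modules of $p_c$. Function and module names are hypothetical.

```python
def remap(delta, module_map):
    """Rewrite each action's module through module_map; unmapped modules pass through."""
    return [(kind, module_map.get(module, module)) for kind, module in delta]

def apply_delta(pipeline, delta):
    """Apply add/delete actions, in order, to a copy of the pipeline."""
    result = set(pipeline)
    for kind, module in delta:
        if kind == "add":
            result.add(module)
        else:
            result.discard(module)
    return result

def analogy(delta_ab, map_ac, p_c):
    """Transfer the change a -> b onto pipeline c, giving pipeline d."""
    delta_cb = remap(delta_ab, map_ac)
    return apply_delta(p_c, delta_cb)

# Hypothetical example: b replaced a's isosurface with volume rendering.
delta_ab = [("delete", "Isosurface"), ("add", "VolumeRender")]
map_ac = {"VtkReader": "NetcdfReader", "Isosurface": "Contour"}
p_c = {"NetcdfReader", "Contour"}
p_d = analogy(delta_ab, map_ac, p_c)
```

Modules introduced by the difference (here `VolumeRender`) have no counterpart in $p_c$, so the sketch passes them through the map unchanged.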

Visualization by Analogy • Simplest version is again reducible from MAX-CLIQUE • We will now use a probabilistic argument to create a Markov chain

How does it work? • Module compatibility: prior $f : M^2 \to [0, 1]$ • Independent of graph topology • Probability of match between a pair • Dependent on graph topology • Linear combination of probability of match in the neighborhood pairs and data • This is a Markov chain!

How does it work? • Graph product $G$ of the two input graphs • Each vertex in $G$ represents a possible match • Similarity is then defined as $\pi = \alpha A(G)\pi + (1 - \alpha)c(G) = M_G \pi$ • $\pi$ is an eigenvector of $M_G$ • It is the limit distribution of the transition matrix
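One way to read the last equality, assuming $\pi$ is normalized so that its entries sum to one. The explicit form of $M_G$ below is an assumption in the style of PageRank; the slides do not define it.

```latex
\begin{aligned}
\mathbf{1}^{\top}\pi = 1
  \;&\Rightarrow\; c(G) = c(G)\,\mathbf{1}^{\top}\pi \\
\pi &= \alpha A(G)\,\pi + (1-\alpha)\,c(G)\,\mathbf{1}^{\top}\pi \\
    &= \underbrace{\bigl(\alpha A(G) + (1-\alpha)\,c(G)\,\mathbf{1}^{\top}\bigr)}_{M_G}\,\pi
\end{aligned}
```

Under this reading, $\pi$ is an eigenvector of $M_G$ with eigenvalue 1. If in addition the columns of $A(G)$ and the entries of $c(G)$ each sum to one, $M_G$ is column-stochastic and $\pi$ is its stationary distribution.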

How does it work? [Figure: the input graphs $G_A$ and $G_B$ and their product $G_A \times G_B$]

How does it work? Each node is assigned some initial value. (It doesn’t matter which, as long as the values sum to one!)

How does it work? [Figure: the product-graph vertex $(a_0, b_0)$ and its neighbors, each labeled with its current value $p_k$]
$p_{k+1}(a_0 \to b_0) = (1 - \alpha)\,c(a_0, b_0) + \frac{\alpha}{3}\bigl(p_k(a_0 \to b_3) + p_k(a_0 \to b_1) + p_k(a_1 \to b_0)\bigr)$
• Do it for all nodes, until convergence
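The per-node update can be sketched as a plain fixed-point iteration. Assumptions: both graphs are given as undirected adjacency dictionaries, the compatibility $c$ is a dictionary over pairs, and the neighbors of a pair are taken from the Cartesian product of the two graphs, which is how the three neighbors in the slide's example appear to arise. The names `product_neighbors` and `match_scores` are made up for this example.

```python
def product_neighbors(adj_a, adj_b, a, b):
    """Neighbors of (a, b) in the Cartesian product: step along one edge in either graph."""
    return [(a2, b) for a2 in adj_a[a]] + [(a, b2) for b2 in adj_b[b]]

def match_scores(adj_a, adj_b, compat, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate p <- (1 - alpha) * c + alpha * (mean of neighbor values) until it settles."""
    pairs = [(a, b) for a in adj_a for b in adj_b]
    p = {pair: 1.0 / len(pairs) for pair in pairs}  # initial values sum to one
    for _ in range(max_iter):
        new_p = {}
        for a, b in pairs:
            nbrs = product_neighbors(adj_a, adj_b, a, b)
            spread = sum(p[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
            new_p[(a, b)] = (1 - alpha) * compat[(a, b)] + alpha * spread
        change = max(abs(new_p[k] - p[k]) for k in pairs)
        p = new_p
        if change < tol:
            break
    return p

# Two hypothetical two-module pipelines; a0 is compatible with b0 and a1 with b1.
adj_a = {"a0": ["a1"], "a1": ["a0"]}
adj_b = {"b0": ["b1"], "b1": ["b0"]}
compat = {("a0", "b0"): 1.0, ("a1", "b1"): 1.0, ("a0", "b1"): 0.0, ("a1", "b0"): 0.0}
scores = match_scores(adj_a, adj_b, compat)
```

With `alpha < 1` each sweep shrinks the largest difference between successive iterates by at least a factor of `alpha`, so the loop converges regardless of the starting values.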

How does it work? • $\pi$ is defined over the graph product • For each module in the second pipeline, pick the maximal value of $\pi$ on the first pipeline: this is the match • Many others possible
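The selection rule on this slide, sketched under the assumption that the converged values are stored in a dictionary keyed by `(first_pipeline_module, second_pipeline_module)`. The function name and the numbers are illustrative only.

```python
def best_matches(scores):
    """For each module b of the second pipeline, pick the first-pipeline module with the highest score."""
    best = {}
    for (a, b), value in scores.items():
        if b not in best or value > best[b][1]:
            best[b] = (a, value)
    return {b: a for b, (a, _) in best.items()}

# Made-up scores for two modules per pipeline.
example_scores = {
    ("a0", "b0"): 0.6, ("a1", "b0"): 0.3,
    ("a0", "b1"): 0.2, ("a1", "b1"): 0.5,
}
matching = best_matches(example_scores)
```

This greedy rule does not force a one-to-one matching: two modules of the second pipeline may pick the same module of the first. Stricter assignment schemes are among the alternatives the slide alludes to.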

The matching algorithm

Failure Modes • Analogies are not fool-proof

Case study • Creating a complex visualization out of simple ones • (demo)

Discussion • If your system can encode actions as functions on the space of objects of interest, store these explicitly • That will be your “version tree” - everything else is just the same • Easy to incorporate domain-specific knowledge in analogies: change $c(G)$ and $A(G)$

Acknowledgments • Sarang Joshi, Suresh Venkatasubramanian, Erik Anderson, João Comba • VisTrails dev team • Many open source packages and devs: VTK, SciPy, teem, matplotlib • VisTrails is open source! http://www.vistrails.org • Shameless plug: Visit the SCI booth! • NSF, DOE, IBM Faculty Award

Thank you! • Questions?

Too much data • We are better off with visualization systems than without - but it’s still pretty messy

Video
