Graph Exploration w/ Neo4j 1 https://s3.amazonaws.com/dev.assets.neo4j.com/wp-content/uploads/graph-data-technologies-graph-databases-for-beginners.png
Our Project Partners 2
Efficiently extracting knowledge from graph data even if we do not know exactly what we are looking for Graph Exploration: From Users to Large Graphs. CIKM 2016, SIGMOD 2017, KDD 2018 GRAPH EXPLORATION 3
Graph Exploration Stack Users Interactive algorithms Intuitive queries Adaptive Databases Graph 4 GRAPH EXPLORATION
Graph Exploration Stack Users Interactive algorithms [This project is about ...] Intuitive queries Adaptive Databases Graph 5 GRAPH EXPLORATION
Graph Exploration in Biology - Complex Graphs 3.27 × 10^9 base pairs 6 http://jcs.biologists.org/content/joces/118/21/4947/F3.large.jpg
Graph Exploration in Biology - Status Quo

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1'
  AND p2.name = 'foo2'
  AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp
MATCH (snp)-[:IN]-(pw:PositionWindow)<-[:IN]-(l:Locus)--(g:Gene)
WHERE l.feature = 'gene'
RETURN collect(DISTINCT g.name)

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1'
  AND p2.name = 'foo2'
  AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp
MATCH (snp)-[:IN]-(pw:PositionWindow)<-[:IN]-(l:Locus)--(g:Gene)
WHERE l.feature = 'gene'
WITH DISTINCT g ORDER BY g.name
MATCH (g)-[:CODES]-(:Transcript)-[:CODES]-(p:Protein)-[:MEMBER]-(go:Goterm)
WHERE go.namespace = 'biological_process'
WITH DISTINCT go, p
RETURN go.name, count(p) ORDER BY count(p) DESC LIMIT 10
7
Can we do better? 8
Problem
● Given two node sets: how similar are they in my understanding?
● Example
○ → Set of movies I like
○ → Set of movies I don't know
○ → Will I like the movies I don't know?
Movies shown: The Matrix, Skyfall, Pulp Fiction, Hell or High Water, As Good As It Gets, Avatar 9
What is a Knowledge Graph?
● A (directed) graph G : ⟨V, E, φ, ψ⟩, where
○ V is a set of nodes,
○ E ⊆ V × V is a set of edges,
○ φ : V → L_V is a node labeling function,
○ and ψ : E → L_E is an edge labeling function.
● We refer to the elements of L_V and L_E as node labels and edge labels. 10
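The definition above can be sketched as a small data structure. This is a minimal illustration, not the project's implementation: the class name `KnowledgeGraph`, its methods, and the toy nodes and labels are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Directed labeled graph G : <V, E, phi, psi> (hypothetical helper class)."""
    node_labels: dict = field(default_factory=dict)  # phi: node -> node label
    edge_labels: dict = field(default_factory=dict)  # psi: (source, target) -> edge label

    def add_node(self, node, label):
        self.node_labels[node] = label

    def add_edge(self, source, target, label):
        # E is a subset of V x V, so both endpoints must already be nodes.
        if source not in self.node_labels or target not in self.node_labels:
            raise KeyError("both endpoints must be added as nodes first")
        self.edge_labels[(source, target)] = label

    def phi(self, node):
        """Node labeling function."""
        return self.node_labels[node]

    def psi(self, source, target):
        """Edge labeling function."""
        return self.edge_labels[(source, target)]


# Toy data, invented for illustration only.
graph = KnowledgeGraph()
graph.add_node("me", "Person")
graph.add_node("The Matrix", "Movie")
graph.add_edge("me", "The Matrix", "LIKES")
```

Keying edges by `(source, target)` mirrors E ⊆ V × V, which allows at most one edge per ordered node pair; a multigraph would need a different key.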
What are Meta-Paths?
A meta-path for a path ⟨n_1, ..., n_t⟩, n_i ∈ V, 1 ≤ i ≤ t, is a sequence P : ⟨φ(n_1), ψ(n_1, n_2), ..., ψ(n_{t−1}, n_t), φ(n_t)⟩ that alternates node and edge types along the path.
Figure labels: Graph, Path, Graph Schema, Meta-Path 11
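As a worked example of the definition, the sketch below derives the meta-path of a concrete node path. The function name `meta_path`, the dictionary encoding of φ and ψ, and the toy graph are assumptions for illustration.

```python
def meta_path(path, phi, psi):
    """Return the alternating sequence of node and edge labels along `path`.

    `phi` maps node -> node label, `psi` maps (source, target) -> edge label.
    """
    if not path:
        raise ValueError("path must contain at least one node")
    labels = [phi[path[0]]]
    for source, target in zip(path, path[1:]):
        labels.append(psi[(source, target)])  # edge type between consecutive nodes
        labels.append(phi[target])            # node type of the next node
    return tuple(labels)


# Toy graph, invented for illustration only.
phi = {"me": "Person", "The Matrix": "Movie", "friend": "Person"}
psi = {("me", "The Matrix"): "LIKES", ("The Matrix", "friend"): "LIKED_BY"}

example = meta_path(["me", "The Matrix", "friend"], phi, psi)
# example == ("Person", "LIKES", "Movie", "LIKED_BY", "Person")
```

Many different node paths map to the same meta-path, which is why meta-paths describe a graph at the level of its schema.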
Motivating Example
Q: How famous is Diane Kruger in America?
MATCH (n:Person) WHERE n.name = "Diane Kruger" RETURN n
MATCH (m:Movie) WHERE m.location = "America" RETURN m
Figure nodes: Diane Kruger; The Matrix, Stand By Me, Top Gun, Pulp Fiction, A Few Good Men, As Good As It Gets, Up 12
How similar are they?
● Similarity depends on
○ expert knowledge
○ connections among nodes 13
Overview
What does the system do and how?
● Compute meta-paths
● Learn representation for meta-paths
● Extract ratings (✗ ✓ ◎)
● Calculate similarity
● Individualized exploration 14
Meta-Paths Computation
Approximate Meta-Paths
Problem: How to compute all meta-paths fast?
Approx. solution: Compute the graph's schema, mine meta-paths from it, and learn a classifier on real meta-paths. 15
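The schema-mining step can be sketched as enumerating walks in the schema graph up to a maximum length. This is an assumption-laden illustration: the function name `schema_meta_paths`, the triple encoding of schema edges, the directed traversal, and the toy schema are hypothetical, and the classifier that filters candidates against real meta-paths is not shown.

```python
def schema_meta_paths(schema_edges, max_edges):
    """Enumerate candidate meta-paths as walks in the schema graph.

    `schema_edges` is a list of (source_label, edge_label, target_label) triples.
    Returns all alternating label sequences with 1..max_edges edges.
    """
    outgoing = {}
    for source, edge, target in schema_edges:
        outgoing.setdefault(source, []).append((edge, target))
        outgoing.setdefault(target, [])

    candidates = []
    frontier = [(label,) for label in sorted(outgoing)]  # walks with zero edges
    for _ in range(max_edges):
        next_frontier = []
        for walk in frontier:
            # walk[-1] is always a node label because walks alternate node/edge/node.
            for edge, target in outgoing[walk[-1]]:
                next_frontier.append(walk + (edge, target))
        candidates.extend(next_frontier)
        frontier = next_frontier
    return candidates


# Toy schema, invented for illustration only.
toy_schema = [("Person", "ACTED_IN", "Movie"), ("Movie", "HAS_GENRE", "Genre")]
candidates = schema_meta_paths(toy_schema, max_edges=2)
```

The schema is usually far smaller than the data graph, so this enumeration is cheap, but it can propose meta-paths that never occur in the data, which is where a classifier trained on real meta-paths would come in.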
Meta-Paths Embedding
Learning a Meta-Path Embedding
Problem: Vector representation required for active learning and preference prediction.
Solution: Embed meta-paths → similar meta-paths should have similar vectors.
Our method: Transfer a text embedding method to meta-paths, e.g. meta-path → (3 5 1)^T. 17
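To illustrate the idea of treating a meta-path like a sentence whose words are node and edge types, the sketch below uses a plain bag-of-tokens count vector with cosine similarity. This is a deliberately simple stand-in, not the text embedding method used in the project (which the slides do not name); the function names and toy meta-paths are assumptions.

```python
import math
from collections import Counter


def embed(meta_path, vocabulary):
    """Map a meta-path to a count vector over the type vocabulary."""
    counts = Counter(meta_path)
    return [counts[token] for token in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy meta-paths, invented for illustration only.
meta_paths = [
    ("Person", "ACTED_IN", "Movie"),
    ("Person", "DIRECTED", "Movie"),
    ("Movie", "HAS_GENRE", "Genre"),
]
vocabulary = sorted({token for path in meta_paths for token in path})
vectors = [embed(path, vocabulary) for path in meta_paths]

similar = cosine(vectors[0], vectors[1])     # shares Person and Movie
dissimilar = cosine(vectors[0], vectors[2])  # shares only Movie
```

Count vectors ignore token order; a learned text embedding would capture more structure, but the property the slide asks for (similar meta-paths get similar vectors) is already visible here.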
Active Learning
Learn the Domain Value of all Meta-Paths
- Problem: Users don't want to rate all meta-paths → too many → time-consuming → tedious and boring
- Solution: Label only a few, but very informative, paths (✗ ✓ ◎) 18
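One way to pick "informative" paths is to ask the user about the meta-path that is least covered by what has already been rated. The sketch below uses a farthest-first heuristic over meta-path vectors; the slides do not state which selection criterion the system uses, so the heuristic, the function name `next_to_label`, and the toy vectors are all assumptions.

```python
def next_to_label(vectors, labeled):
    """Return the index of the unlabeled vector farthest from all labeled ones.

    `vectors` is a list of equally long numeric lists; `labeled` is a set of
    indices the user has already rated. Returns None when nothing is left.
    """
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    best_index, best_score = None, -1.0
    for index, vector in enumerate(vectors):
        if index in labeled:
            continue
        # Distance to the closest rated meta-path; infinite if nothing is rated yet.
        score = min(
            (squared_distance(vector, vectors[j]) for j in labeled),
            default=float("inf"),
        )
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Toy meta-path vectors, invented for illustration only.
toy_vectors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
first_query = next_to_label(toy_vectors, labeled={0})
```

With index 0 already rated, the heuristic skips its near-duplicate at index 1 and asks about index 2, so each rating covers a new region of the vector space.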
Result Explanation
Use Learned Preferences for Graph Exploration
Graph (with meta-paths), Domain Knowledge, Personalized Exploration Tool
What is important in the graph? Similarity Measure, Stats, Related Nodes 19
Icons made by Eucalyp from www.flaticon.com, licensed under CC 3.0 BY
Result Explanation
Personalized Node Embedding
● Precomputed: Transform Nodes to Vectors (Graph-Embedding)
● Adapt Vectors Using Domain-Knowledge
● Personalized Vector Space 20
Applications
Personalized Exploration Tool
● What nodes are close to my selection?
● How close are my sets?
● What are outliers?
● Find clusters!
Personalized Vector Space 21
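Two of the questions above ("What nodes are close to my selection?" and "How close are my sets?") reduce to distance computations once nodes live in a vector space. The sketch below uses centroids and Euclidean distance as one plausible choice; the function names, the distance measure, and the toy embedding are assumptions, not the tool's actual implementation.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equally long vectors."""
    dimension = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dimension)]


def nearest_nodes(embedding, selection, k):
    """Return the k nodes outside `selection` closest to the selection's centroid."""
    center = centroid([embedding[node] for node in selection])
    others = [node for node in embedding if node not in selection]
    return sorted(others, key=lambda node: math.dist(embedding[node], center))[:k]


def set_distance(embedding, first, second):
    """Euclidean distance between the centroids of two node sets."""
    return math.dist(
        centroid([embedding[node] for node in first]),
        centroid([embedding[node] for node in second]),
    )


# Toy node embedding, invented for illustration only.
toy_embedding = {"a": [0.0, 0.0], "b": [0.2, 0.1], "c": [0.1, 0.0], "d": [1.0, 1.0]}
closest = nearest_nodes(toy_embedding, selection={"a", "b"}, k=1)
gap = set_distance(toy_embedding, {"a"}, {"d"})
```

Outlier detection and clustering can be built on the same distances, for example by flagging nodes far from every centroid.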
System Architecture - How does it work with Neo4j?
● ReactJS Frontend: Meta-Path ordering, Result visualization, Node selection
● Python Backend Server: Meta-Path Embedding, Active Learning, Explanation
● Neo4j Graph Algorithm Procedures Containing Meta-Paths Computation
● Neo4j Graph Database 22
Meta-Paths Computation
What about Neo4j?
● Easy to get your code running in Neo4j.
● Neo4j-graph-algorithms: efficiency vs convenience.
● Sometimes no stack trace when an error occurs.
● Great support and community. Always available.
● Cypher: Easy to begin with, hard to master.
(hpi)-[:LIKES]->(neo4j) 23
Trending: #tweetyourthesis 24