learning to rank and compare graph layouts


  1. Learning to rank and compare graph layouts
     Toby Dylan Hocking, toby@sg.cs.titech.ac.jp
     http://sugiyama-www.cs.titech.ac.jp/~toby/
     Joint work with Supaporn Spanurattana and Masashi Sugiyama
     6 Aug 2013

  2. ◮ Introduction: what makes a graph layout good or bad?
     ◮ Learning to rank and compare graph layouts

  3. Biology is full of networks (graphs) Source: Kyoto encyclopedia of genes and genomes (KEGG).

  4. Biology is full of networks (graphs) Source: Wikipedia “Citric acid cycle.”

  5. Goal: find a good layout for a particular graph
     Two categories of methods for graph layout:
     ◮ Heuristic layout algorithms:
       ◮ Force-directed
       ◮ Hierarchical clustering (trees/dendrograms)
       ◮ Hive plots
       ◮ ...
     ◮ Manual layout using programs such as:
       ◮ Cytoscape/cytoscape.js
       ◮ Gephi
       ◮ Image processing: GIMP/Inkscape
       ◮ ...

  6. Force-directed layout has many tuning parameters
     Source: Data-Driven Documents (D3) JavaScript visualization library (Bostock 2011).

     | parameter     | min | default | max |
     |---------------|-----|---------|-----|
     | size          | ?   | 1 × 1   | ?   |
     | link distance | 0   | 20      | ∞   |
     | link strength | 0   | 1       | 1   |
     | friction      | 0   | 0.9     | 1   |
     | charge        | −∞  | −30     | ∞   |
     | theta         | 0   | 0.8     | ∞   |
     | gravity       | 0   | 0.1     | ∞   |

     Question: how to tune these parameters for a specific graph?

  7. Manual layout using a GUI is time-consuming
     ◮ Try default parameters of several different algorithms.
     ◮ Play with tuning parameters, select a combination that looks good.
     ◮ Finally, refine the algorithm’s layout by dragging nodes to positions that look better.
     Goal: learn from a database of manually labeled graphs.

  9. Pairwise comparison in the graph layout literature Source: Holten and van Wijk, “Force-Directed Edge Bundling for Graph Visualization,” EuroVis 2009.

  10. Pairwise comparison in the graph layout literature Source: Muelder and Ma, “Rapid Graph Layout Using Space Filling Curves,” InfoVis 2008.

  11. Pairwise comparison in the graph layout literature Source: Gorochowski et al., “Using Aging to Visually Uncover Evolutionary Processes on Networks,” IEEE Trans. Viz 2012.

  12. ◮ Introduction: what makes a graph layout good or bad?
      ◮ Learning to rank and compare graph layouts

  13. Learning a comparison function
      We are given $n$ training pairs $(G_i, x_i, x'_i, y_i)$ where we have
      ◮ a graph $G_i$,
      ◮ two layouts $x_i, x'_i \in \mathbb{R}^p$ of that graph (feature vectors),
      ◮ a comparison
        $$y_i = \begin{cases} -1 & \text{if } x_i \text{ is better} \\ 0 & \text{if } x_i \text{ is as good as } x'_i \\ 1 & \text{if } x'_i \text{ is better.} \end{cases}$$
      Goal: find a comparison function $g : \mathbb{R}^p \times \mathbb{R}^p \to \{-1, 0, 1\}$
      ◮ Symmetry: $g(x, x') = -g(x', x)$.
      ◮ Good prediction with respect to the zero-one loss $E$:
        $$\operatorname*{minimize}_{g} \sum_{i \in \text{test}} E\left[ y_i, g(x_i, x'_i) \right]$$
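A minimal Python sketch of the evaluation criterion on slide 13, assuming each layout has already been converted to a tuple of features. The names `zero_one_loss`, `total_zero_one_loss`, `is_antisymmetric`, and the toy comparison function `compare_first_feature` are illustrative assumptions and do not come from the talk.

```python
from typing import Callable, Sequence, Tuple

Features = Sequence[float]
Pair = Tuple[Features, Features, int]  # (x, x_prime, y) with y in {-1, 0, 1}
Comparison = Callable[[Features, Features], int]


def zero_one_loss(y: int, prediction: int) -> int:
    """Return 0 for a correct prediction and 1 for any mistake."""
    return 0 if prediction == y else 1


def total_zero_one_loss(g: Comparison, pairs: Sequence[Pair]) -> int:
    """Sum the zero-one loss of g over labeled pairs (for example a test set)."""
    return sum(zero_one_loss(y, g(x, x_prime)) for x, x_prime, y in pairs)


def is_antisymmetric(g: Comparison, pairs: Sequence[Pair]) -> bool:
    """Check the symmetry property g(x, x') == -g(x', x) on the given pairs."""
    return all(g(x, xp) == -g(xp, x) for x, xp, _ in pairs)


def compare_first_feature(x: Features, x_prime: Features) -> int:
    """Hypothetical g: the layout with the larger first feature is better."""
    diff = x_prime[0] - x[0]
    return (diff > 0) - (diff < 0)  # sign of the difference


# Hypothetical held-out pairs; the last label disagrees with the toy g.
held_out = [((0.0,), (1.0,), 1), ((2.0,), (1.0,), -1),
            ((1.0,), (1.0,), 0), ((0.0,), (3.0,), -1)]
print(total_zero_one_loss(compare_first_feature, held_out))
print(is_antisymmetric(compare_first_feature, held_out))
```

Any function that returns a value in {-1, 0, 1} can be passed as `g`, so the same helper can score different learned comparison functions on the same pairs.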

  14. Learning to rank and compare
      We will learn a
      ◮ Ranking function $f : \mathbb{R}^p \to \mathbb{R}$. Bigger means a better layout.
      ◮ Threshold $t \in \mathbb{R}_+$. A small difference $|f(x') - f(x)| \le t$ is not significant.
      ◮ Comparison function
        $$g_t(x, x') = \begin{cases} -1 & \text{if } f(x') - f(x) < -t \\ 0 & \text{if } |f(x') - f(x)| \le t \\ 1 & \text{if } f(x') - f(x) > t. \end{cases}$$
      The problem becomes
        $$\operatorname*{minimize}_{f,\, t} \sum_{i=1}^{n} E\left[ y_i, g_t(x_i, x'_i) \right]$$
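The thresholded comparison function on slide 14 can be sketched in a few lines of Python. This is only an illustration: `linear_rank`, `make_comparison`, and the weight vector are hypothetical, and a linear ranking function is assumed here for concreteness.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def linear_rank(w: Sequence[float]) -> Callable[[Features], float]:
    """Return f(x) = w . x, one simple choice of ranking function."""
    return lambda x: sum(wk * xk for wk, xk in zip(w, x))


def make_comparison(f: Callable[[Features], float],
                    t: float) -> Callable[[Features, Features], int]:
    """Build g_t from a ranking function f and a threshold t >= 0."""
    if t < 0:
        raise ValueError("threshold t must be non-negative")

    def g_t(x: Features, x_prime: Features) -> int:
        diff = f(x_prime) - f(x)
        if diff < -t:
            return -1   # x ranks significantly higher
        if diff > t:
            return 1    # x_prime ranks significantly higher
        return 0        # difference within the threshold: not significant

    return g_t


f = linear_rank([2.0, -1.0])        # hypothetical weights
g = make_comparison(f, t=1.0)
print(g((0.0, 0.0), (1.0, 0.0)))    # rank difference 2.0 exceeds t
print(g((0.0, 0.0), (0.25, 0.0)))   # rank difference 0.5 is within t
```

Because `g_t` depends only on the difference f(x') - f(x), swapping the arguments flips the sign of the output, which gives the symmetry property required of g.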

  15. Some labeled layouts of a 2-node graph
      [Figure: six panels (good 1, good 2, good 3, bad 11, bad 12, bad 13), each showing one layout in the x-y plane.]

  16. Map 20 layouts $x_i \in \mathbb{R}^2$ to a feature space
      [Figure: scatter plot of angle versus distance, with each layout labeled good or bad.]

  17. Generate 10 pairwise constraints $x'_i - x_i \in \mathbb{R}^2$
      [Figure: scatter plot of angle versus distance, with each layout labeled good or bad.]

  18. 10 labeled difference vectors $x'_i - x_i \in \mathbb{R}^2$
      [Figure: difference vectors plotted as angle versus distance, marked by comparison $y_i \in \{-1, 0, 1\}$.]

  19. All 190 labeled difference vectors $x'_i - x_i \in \mathbb{R}^2$
      [Figure: difference vectors plotted as angle versus distance, marked by comparison $y_i \in \{-1, 0, 1\}$.]
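The pipeline on slides 15 to 19 (layouts, then features, then labeled difference vectors) can be sketched as follows. Several details are assumptions, since the slides only show plots: the feature definitions in `two_node_features` (Euclidean distance between the two nodes, and the edge angle folded into [0, π/2]), the rule that a good layout beats a bad one while two layouts with the same label are tied, and all function names.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Features = Tuple[float, float]


def two_node_features(p: Point, q: Point) -> Features:
    """Assumed features of a 2-node layout: (distance, angle in [0, pi/2])."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return (math.hypot(dx, dy), math.atan2(abs(dy), abs(dx)))


def labeled_differences(features: Sequence[Features],
                        labels: Sequence[str]) -> List[Tuple[Features, int]]:
    """Return (x' - x, y) for every unordered pair of layouts.

    Assumed labeling: y = 1 if x' is good and x is bad, y = -1 for the
    reverse, and y = 0 when both layouts carry the same label.
    """
    score = {"good": 1, "bad": 0}
    out = []
    for i, j in combinations(range(len(features)), 2):
        diff = (features[j][0] - features[i][0],
                features[j][1] - features[i][1])
        out.append((diff, score[labels[j]] - score[labels[i]]))
    return out


# Hypothetical layouts: node positions for four drawings of the same graph.
layouts = [((0.0, 0.0), (3.0, 4.0)), ((0.0, 0.0), (100.0, 0.0)),
           ((0.0, 0.0), (0.0, 250.0)), ((0.0, 0.0), (30.0, 40.0))]
labels = ["bad", "good", "bad", "good"]
feats = [two_node_features(p, q) for p, q in layouts]
for diff, y in labeled_differences(feats, labels):
    print(diff, y)
```

With 20 layouts, enumerating all unordered pairs gives 20 · 19 / 2 = 190 difference vectors, which matches the count on slide 19.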

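  20. Max margin comparison function
      [Figure: difference vectors plotted as angle versus distance, marked by comparison $y_i \in \{-1, 0, 1\}$ and by constraint status (active or inactive), with decision and margin lines; two of the lines are labeled $f(x') - f(x) = -1$ and $f(x') - f(x) = 1$.]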

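  21. Invariance of $\hat{g}$ when switching train direction $x_i, x'_i$
      [Figure: difference vectors plotted as angle versus distance, marked by comparison $y_i \in \{-1, 0, 1\}$ and by constraint status (active or inactive), with decision and margin lines; two of the lines are labeled $f(x') - f(x) = -1$ and $f(x') - f(x) = 1$.]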

  22. Defining the margin
      Recall: for all pairs $i \in \{1, \dots, n\}$ we have
      ◮ features $x_i, x'_i \in \mathbb{R}^p$ and
      ◮ comparisons $y_i \in \{-1, 0, 1\}$.
      We define
      ◮ Ranking function $f(x) = w^\intercal x \in \mathbb{R}$.
      ◮ Threshold $t = 1$.
      ◮ Comparison function $g_1(x, x') \in \{-1, 0, 1\}$.
      [Figure: margin $\mu$ versus predicted rank difference $f(x'_i) - f(x_i)$, one panel for each of $y_i = -1$, $y_i = 0$, $y_i = 1$.]
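As a rough illustration of the margin, the sketch below computes the per-pair quantity that appears on the right-hand side of the constraints of the max margin LP on slide 23: 1 − |wᵀ(x′ − x)| for equality pairs and −1 + wᵀ(x′ − x)·y for the others. The function names and the toy data are assumptions made for this example.

```python
from typing import Sequence, Tuple

Features = Sequence[float]
Pair = Tuple[Features, Features, int]  # (x, x_prime, y) with y in {-1, 0, 1}


def rank_difference(w: Sequence[float], x: Features, x_prime: Features) -> float:
    """Predicted rank difference f(x') - f(x) for the linear f(x) = w . x."""
    return sum(wk * (b - a) for wk, a, b in zip(w, x, x_prime))


def pair_margin(w: Sequence[float], x: Features, x_prime: Features, y: int) -> float:
    """Margin of one labeled pair under threshold t = 1."""
    d = rank_difference(w, x, x_prime)
    if y == 0:
        return 1.0 - abs(d)   # equality pairs should satisfy |d| < 1
    return -1.0 + d * y       # inequality pairs should satisfy d * y > 1


def data_margin(w: Sequence[float], pairs: Sequence[Pair]) -> float:
    """Smallest pair margin; a positive value means w separates the pairs."""
    return min(pair_margin(w, x, xp, y) for x, xp, y in pairs)


# Hypothetical one-feature training pairs.
train = [((0.0,), (3.0,), 1), ((0.0,), (0.5,), 0), ((2.0,), (0.0,), -1)]
print(data_margin([1.0], train))
```

When `data_margin` is positive, the comparison function g_1 built from the same w labels every training pair correctly, which is the separability condition noted on slide 23.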

  23. Max margin comparison is a linear program (LP)
      For $y \in \{-1, 0, 1\}$, let $I_y = \{ i \mid y_i = y \}$ be the corresponding training indices.
      $$\begin{aligned} \operatorname*{maximize}_{\mu \in \mathbb{R},\, w \in \mathbb{R}^p} \quad & \mu \\ \text{subject to} \quad & \mu \le 1 - |w^\intercal (x'_i - x_i)|, && \forall i \in I_0, \\ & \mu \le -1 + w^\intercal (x'_i - x_i)\, y_i, && \forall i \in I_1 \cup I_{-1}. \end{aligned}$$
      Note: if the optimal $\mu > 0$ then the data are separable.
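One way to see why the problem on slide 23 is a linear program despite the absolute value: each constraint for an equality pair is equivalent to two linear inequalities. The following is the standard reformulation, written with the slide's symbols as a sketch rather than as the form used in the talk.

```latex
\begin{aligned}
\operatorname*{maximize}_{\mu \in \mathbb{R},\, w \in \mathbb{R}^p} \quad & \mu \\
\text{subject to} \quad
  & \mu \le 1 - w^\intercal (x'_i - x_i), && \forall i \in I_0, \\
  & \mu \le 1 + w^\intercal (x'_i - x_i), && \forall i \in I_0, \\
  & \mu \le -1 + w^\intercal (x'_i - x_i)\, y_i, && \forall i \in I_1 \cup I_{-1}.
\end{aligned}
```

The objective and every constraint are linear in $(\mu, w)$, and the first two families together say $\mu \le 1 - |w^\intercal (x'_i - x_i)|$, so the two formulations have the same feasible set.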

  24. Related work: reject, rank, and rate

      | Outputs \ Inputs      | single items $x$ | pairs of items $x, x'$ |
      |-----------------------|------------------|------------------------|
      | $y \in \{-1, 1\}$     | SVM              | SVMrank                |
      | $y \in \{-1, 0, 1\}$  | Reject option    | this work              |

      ◮ PL Bartlett and MH Wegkamp. Classification with a reject option using a hinge loss. JMLR, 9:1823–1840, 2008. (statistical properties of the hinge loss)
      ◮ T Joachims. Optimizing search engines using clickthrough data. KDD 2002. (SVMrank)
      ◮ K Zhou et al. Learning to rank with ties. SIGIR 2008. (boosting, ties are more effective with more output values)
      ◮ R Herbrich et al. TrueSkill: a Bayesian skill rating system. NIPS 2006. (generalization of Elo for chess)

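  25. SVMrank is a quadratic program (QP)
      $$\begin{aligned} \operatorname*{minimize}_{w \in \mathbb{R}^p} \quad & w^\intercal w \\ \text{subject to} \quad & w^\intercal (x'_i - x_i)\, y_i \ge 1, && \forall i \in I_1 \cup I_{-1}. \end{aligned}$$
      [Figure: difference vectors plotted as angle versus distance, marked by comparison $y_i \in \{-1, 0, 1\}$ and by constraint status (active or inactive), with decision and margin lines labeled $f(x') - f(x) = -1$, $f(x') - f(x) = 0$, and $f(x') - f(x) = 1$.]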

  26. Conclusions and future work
      Learned a function $f(x)$ for ranking a graph layout $x$.
      ◮ Features for good performance on real graphs?
      ◮ Tune layout algorithm parameters to maximize $f$.
      ◮ SVMrank is sufficient under what assumption?

  27. Thank you! Supplementary slides appear after this one.

  28. Layout evaluation metrics (features $x_i, x'_i$)
      ◮ Number of crossing edges (smaller is better)
      ◮ Aspect ratio (closer to 1:1 is better?)
      ◮ Symmetry (more is better when the graph has symmetries)
      ◮ Edge length (small and less variable is better?)
      ◮ Angle between edge pairs (big is better?)
      ◮ Area of smallest bounding box (smaller is better to let small features be more legible)
      Source: http://en.wikipedia.org/wiki/Graph_drawing#Quality_measures
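A few of the metrics on slide 28 are straightforward to compute from node positions and an edge list. The sketch below is one possible implementation under stated assumptions: edges are drawn as straight segments, the bounding box is axis-aligned, only edge pairs with no shared endpoint are tested for crossings, and the function and key names are made up for this example. Symmetry and edge-pair angles are left out.

```python
import math
from itertools import combinations
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]
Edge = Tuple[Hashable, Hashable]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: positive if a, b, c turn counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if the two segments cross at a single interior point."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)


def layout_features(pos: Dict[Hashable, Point], edges: List[Edge]) -> Dict[str, float]:
    """Compute crossings, edge length statistics, and bounding box features."""
    crossings = sum(
        1 for (a, b), (c, d) in combinations(edges, 2)
        if len({a, b, c, d}) == 4
        and segments_cross(pos[a], pos[b], pos[c], pos[d])
    )
    lengths = [math.hypot(pos[b][0] - pos[a][0], pos[b][1] - pos[a][1])
               for a, b in edges]
    mean_len = sum(lengths) / len(lengths)
    sd_len = math.sqrt(sum((v - mean_len) ** 2 for v in lengths) / len(lengths))
    xs = [p[0] for p in pos.values()]
    ys = [p[1] for p in pos.values()]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return {
        "crossings": float(crossings),
        "mean_edge_length": mean_len,
        "sd_edge_length": sd_len,
        "aspect_ratio": width / height if height > 0 else math.inf,
        "bbox_area": width * height,
    }


# Hypothetical 4-node layout whose two edges are the diagonals of a unit square.
pos = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (0.0, 1.0), "d": (1.0, 0.0)}
print(layout_features(pos, [("a", "b"), ("c", "d")]))
```

The returned values could be arranged in a fixed order to form the feature vector of a layout, so that two layouts of the same graph become a pair of feature vectors for the comparison function.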
