Scalable and Robust Management of Dynamic Graph Data

  1. Scalable and Robust Management of Dynamic Graph Data. Alan G. Labouseur, Paul W. Olsen Jr., Jeong-Hyon Hwang, {alan, polsen, jhh}@cs.albany.edu. BD3 2013, Sunday, September 22, 2013.

  8. Large, Dynamic Networks
     • Social Networks
     • Consumer Commerce Networks
     • Financial Networks
     • Road Networks
     • Internet / WWW
     • DNA Interactions

  15. Analysis of Large, Dynamic Networks
      • Transportation (figure: distance and travel time at three times of day)
        - 5:00 AM: 9.1 mi, 20 mins
        - 6:00 AM: 15 mi, 25 mins
        - 7:00 AM: 20 mi, 30 mins
      • Social and Political Studies / Marketing / National Security
        - How do communities or the centrality of an entity change over time?
        - Who are rising stars?

  24. The G* System (1/2)
      • distributed, deduplicated storage of graph snapshots
      [Figure: snapshots G1, G2, and G3 over vertices a to f, stored across workers α, β, γ, ... Each stored vertex version is labeled with the snapshots that share it: G1 ∩ G2 ∩ G3, (G2 ∩ G3) − G1, (G1 ∩ G2) − G3, G3 − G1 − G2, or G1 − G2 − G3.]
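A minimal Python sketch of the deduplication idea on this slide: each distinct version of a vertex is stored once and tagged with the set of snapshots that share it. The edge lists are hypothetical, since the slide's figure does not survive in this transcript, and `deduplicate` is an illustrative name rather than part of G*.

```python
from collections import defaultdict

# Hypothetical snapshots (vertex -> out-neighbors); the edges are assumed,
# chosen only so that some vertex versions are shared between snapshots.
SNAPSHOTS = {
    "G1": {"a": ("b", "c"), "b": ("d",), "c": (), "d": ()},
    "G2": {"a": ("b", "c"), "b": ("d",), "c": ("e",), "d": (), "e": ()},
    "G3": {"a": ("b", "c"), "b": ("d",), "c": ("e",), "d": ("f",),
           "e": (), "f": ()},
}

def deduplicate(snapshots):
    """Store each distinct (vertex, edges) version once, tagged with the
    names of all snapshots that contain exactly that version."""
    store = defaultdict(set)
    for name, graph in snapshots.items():
        for vertex, edges in graph.items():
            store[(vertex, edges)].add(name)
    return dict(store)

store = deduplicate(SNAPSHOTS)
stored = len(store)                                # versions actually kept
naive = sum(len(g) for g in SNAPSHOTS.values())    # one copy per snapshot
print(f"{stored} stored versions instead of {naive}")
```

Under these assumed edges, vertex a is stored once for all three snapshots, while c and d each have two versions because their edges change between snapshots.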

  31. The G* System (2/2)
      • sophisticated queries / sharing across graph snapshots
      [Figure: average-degree query plan evaluated bottom-up across workers α, β, γ, ...; tuples from different workers are separated by semicolons.]
      - stored vertex versions, labeled {G2, G3}, {G1, G2, G3}, {G1, G2, G3}, {G1, G2}, {G3}, {G1}
      - vertex: (a, ♢, {G1, G2}); (b, ♢, {G1, G2}); (c, ♢, {G1}), (d, ♢, {G1, G2}), (c, ♢, {G2}), (e, ♢, {G2})
      - degree: (a, 2, {G1, G2}); (b, 1, {G1, G2}); (c, 0, {G1}), (d, 0, {G1, G2}), (c, 1, {G2}), (e, 0, {G2})
      - count, sum: (1, 2, {G1, G2}); (1, 1, {G1, G2}); (2, 0, {G1}), (3, 1, {G2})
      - union: (1, 2, {G1, G2}), (1, 1, {G1, G2}), (2, 0, {G1}), (3, 1, {G2})
      - average: (3/4, G1), (4/5, G2)
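The query plan on this slide (degree, then count/sum, then union, then average) can be sketched in Python as follows. The degree tuples are copied from the slide. The worker keys `w1` to `w3`, the function names, and the rule for grouping snapshots in `count_sum` are assumptions made for illustration, not G*'s actual operators.

```python
from collections import defaultdict
from fractions import Fraction

# Output of the degree operator, one list per worker (worker names are
# placeholders): (vertex, degree, snapshots containing this vertex version).
DEGREES = {
    "w1": [("a", 2, {"G1", "G2"})],
    "w2": [("b", 1, {"G1", "G2"})],
    "w3": [("c", 0, {"G1"}), ("d", 0, {"G1", "G2"}),
           ("c", 1, {"G2"}), ("e", 0, {"G2"})],
}

def count_sum(tuples):
    """Per-worker partial aggregate. Snapshots that contain exactly the same
    local tuples share one (count, sum, snapshots) result (assumed rule)."""
    members = defaultdict(set)            # snapshot -> indices of its tuples
    for i, (_, _, snaps) in enumerate(tuples):
        for g in snaps:
            members[g].add(i)
    groups = defaultdict(set)             # tuple-index set -> snapshots
    for g, idx in members.items():
        groups[frozenset(idx)].add(g)
    return [(len(idx), sum(tuples[i][1] for i in idx), frozenset(snaps))
            for idx, snaps in groups.items()]

def average(partials):
    """Combine partial (count, sum) pairs into one average per snapshot."""
    totals = defaultdict(lambda: [0, 0])  # snapshot -> [count, sum]
    for count, total, snaps in partials:
        for g in snaps:
            totals[g][0] += count
            totals[g][1] += total
    return {g: Fraction(s, c) for g, (c, s) in totals.items()}

# "union" step: gather the partial results of all workers.
partials = [p for worker in DEGREES.values() for p in count_sum(worker)]
result = average(partials)
print(sorted(result.items()))
```

With the slide's data, the first two workers each compute a single partial result that serves both G1 and G2, which is the sharing the slide highlights; only the third worker needs separate partial results per snapshot.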

  32. Problem Statements
      • How to distribute graph snapshots on G* workers?
        - new graph snapshots generated continuously
        - must be efficient, scalable, and optimized for queries
      • How to replicate graph snapshots?
        - aim to maximize both availability and performance

  33. Impact of Snapshot Distribution (Example)
      • 100 similarly-sized graph snapshots
      • 100 G* workers
      • PageRank on one snapshot or all snapshots

      query         | 1 worker/snapshot | 100 workers/snapshot
      one snapshot  | 300 seconds       | 20 seconds
      all snapshots | 300 seconds       | 2,000 seconds
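The four timings on this slide are consistent with a simple cost model, sketched below in Python. The two base timings come from the slide. The model itself is an assumption: snapshots held by different workers are processed in parallel, while snapshots spread over all workers are processed one after another. `query_time` is a hypothetical helper name.

```python
T_SINGLE_WORKER = 300  # seconds: PageRank on one snapshot held by 1 worker (slide)
T_ALL_WORKERS = 20     # seconds: PageRank on one snapshot spread over 100 workers (slide)

def query_time(snapshots_queried, workers_per_snapshot):
    """Estimated PageRank time in seconds under the assumed model."""
    if workers_per_snapshot == 1:
        # Each snapshot lives on its own worker, so all queried snapshots
        # are processed at the same time.
        return T_SINGLE_WORKER
    # Every snapshot occupies all workers, so snapshots are processed
    # one after another.
    return snapshots_queried * T_ALL_WORKERS

for layout in (1, 100):
    print(layout, query_time(1, layout), query_time(100, layout))
```

Neither layout is best for both queries under this model: spreading a snapshot over all workers speeds up a single-snapshot query, while keeping each snapshot on one worker is faster when every snapshot is queried.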
