
Raphtory: Streaming Analysis of Distributed Temporal Graphs



  1. Raphtory: Streaming Analysis of Distributed Temporal Graphs. Benjamin Steer, Felix Cuadrado & Richard G. Clegg

  2. Motivation: Traditional Graph Processing Systems. [Diagram: Graph → Chosen Computation → Output.] Snapshot Processing. [Diagram: Snapshot 1, Snapshot 2 → Chosen Computation → Output for each snapshot.]

  3. Motivation: Stream-Based Graph Processing Platform. [Diagram labels: Event source, Graph maintained in-memory, User.]
     ■ Analysis on the most recent graph
     ■ Near real-time updates to metrics
     ■ Compare new updates to previous state
     ■ Temporal graph analysis

  4. Raphtory features
     • Temporal graph model
     • Formalisation of model and update semantics
     • Distributed graph management
     • Stream ingestion and near real-time maintenance
     • Pregel-like temporal graph analysis
     • Live, view and temporal range analysis

  5. Raphtory Design. Implemented in Scala using the Akka actor model. [Raphtory: Streaming analysis of distributed temporal graphs, Future Generation Computer Systems 2020, Vol 102, pp 453-464]

  6. Partition Manager Ingestion. [Diagram labels: Partition 1 (Vertex 1), Partition 2 (Vertex 2), Edge 1 → 2; history entries Created: t8, Created: t14, Deleted: t15.]

  7. Correct update order
     { "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }
     [Diagram labels: Partition Manager 1 (Vertex 1), Partition Manager 2 (Vertex 2), Edge 1 → 2; history entries Created: t8, Created: t9, Created: t14.]

  8. Edge Added Before Vertex
     { "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }
     { "Vertex Add": { "Message Time": 8, "Source ID": 1 } }
     [Diagram labels: Partition Manager 1 (Vertex 1), Partition Manager 2 (Vertex 2), Edge 1 → 2; history entries Created: t8, Created: t14.]

  9. Vertex Deletion Before Edge Addition
     { "Vertex Removal": { "Message Time": 15, "Source ID": 2 } }
     { "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }
     [Diagram labels: Partition Manager 1 (Vertex 1), Partition Manager 2 (Vertex 2), Edge 1 → 2; history entries Created: t8, Created: t14, Deleted: t15.]
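
The three ingestion slides suggest that the stored history should not depend on the order in which updates arrive. The following is a minimal single-process Python sketch of that idea, not Raphtory's Scala/Akka implementation. The class `TemporalGraph`, the helpers `apply`, `final_state` and `alive_at`, and the tuple encoding of updates are all hypothetical. It assumes that an edge addition also records its endpoints as created at the same time, that a vertex removal marks incident edges as deleted, and that no creation and deletion share a timestamp.

```python
from itertools import permutations


class TemporalGraph:
    """Toy temporal graph: every entity keeps a {time: alive} history."""

    def __init__(self):
        self.vertices = {}  # vertex id -> {time: alive}
        self.edges = {}     # (src, dst) -> {time: alive}

    def add_vertex(self, time, vid):
        self.vertices.setdefault(vid, {})[time] = True

    def add_edge(self, time, src, dst):
        # Assumption: adding an edge implicitly creates both endpoints.
        self.add_vertex(time, src)
        self.add_vertex(time, dst)
        history = self.edges.setdefault((src, dst), {})
        history[time] = True
        # Inherit deletions that reached the endpoints before this edge did.
        for vid in (src, dst):
            for t, alive in self.vertices[vid].items():
                if not alive:
                    history[t] = False

    def remove_vertex(self, time, vid):
        self.vertices.setdefault(vid, {})[time] = False
        # Assumption: removing a vertex also deletes its incident edges.
        for (src, dst), history in self.edges.items():
            if vid in (src, dst):
                history[time] = False


def apply(graph, update):
    """Dispatch one (kind, time, *ids) tuple to the matching method."""
    kind, time, *ids = update
    if kind == "vertex_add":
        graph.add_vertex(time, *ids)
    elif kind == "edge_add":
        graph.add_edge(time, *ids)
    elif kind == "vertex_remove":
        graph.remove_vertex(time, *ids)
    else:
        raise ValueError(f"unknown update kind: {kind}")


def final_state(updates):
    graph = TemporalGraph()
    for update in updates:
        apply(graph, update)
    return graph.vertices, graph.edges


def alive_at(history, time):
    """An entity is alive if its latest event at or before `time` is a creation."""
    past = [t for t in history if t <= time]
    return bool(past) and history[max(past)]


# Updates modelled on the slides: vertex 1 at t8, edge 1 -> 2 at t14,
# removal of vertex 2 at t15.
UPDATES = [("vertex_add", 8, 1), ("edge_add", 14, 1, 2), ("vertex_remove", 15, 2)]

# Every arrival order should produce the same histories.
states = [final_state(order) for order in permutations(UPDATES)]
edge_history = states[0][1][(1, 2)]
```

Under these assumptions all six arrival orders yield the same state, and `alive_at(edge_history, 14)` is true while `alive_at(edge_history, 15)` is false.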

  10. Analysis. [Diagram labels: Analysis Manager, Routers, Partition Managers, Analysis Request, Individual Responses.]

  11. Live Graph, Views & Snapshots

  12. Views & Windowing. Window (Left Hand Filter); View (Right Hand Filter). [Timeline: Full History of the Graph, marked t0, t5, t10, ..., tn.] Window Size = 5
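
One way to read the view and window filters is as two bounds on update timestamps. The Python sketch below illustrates that reading and is not Raphtory's API: the function `view_with_window`, the `(time, update)` tuple format, and the choice of inclusive bounds are assumptions made for illustration.

```python
def view_with_window(events, view_time, window_size=None):
    """Keep the (time, update) events visible in a view, optionally windowed.

    The view is the right-hand filter: drop anything after `view_time`.
    The window is the left-hand filter: drop anything older than
    `view_time - window_size`. Both bounds are treated as inclusive here.
    """
    kept = []
    for time, update in events:
        if time > view_time:
            continue  # outside the view
        if window_size is not None and time < view_time - window_size:
            continue  # outside the window
        kept.append((time, update))
    return kept


events = [(0, "a"), (3, "b"), (5, "c"), (7, "d"), (10, "e"), (12, "f")]

view = view_with_window(events, view_time=10)
windowed = view_with_window(events, view_time=10, window_size=5)

print([t for t, _ in view])      # [0, 3, 5, 7, 10]
print([t for t, _ in windowed])  # [5, 7, 10]
```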

  13. Windowing Batches. Batch of Windows (Decreasing in size). [Timeline: Full History of the Graph, marked t0, t5, t7, t9, t10, ..., tn.] Window Sizes = [5,3,1]
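
Because the windows in a batch shrink, each smaller window is a subset of the previous one, so a batch can be answered by filtering progressively instead of rescanning the full history. The Python sketch below shows that idea under stated assumptions: `window_batch` is a hypothetical helper, events are `(time, update)` tuples, and bounds are inclusive. Whether Raphtory evaluates batches this way is not stated on the slide.

```python
def window_batch(events, view_time, window_sizes):
    """Return {window_size: events} for one view, largest window first."""
    # Apply the view (right-hand filter) once.
    current = [(t, u) for t, u in events if t <= view_time]
    results = {}
    for size in sorted(window_sizes, reverse=True):
        # Each smaller window only needs to filter the previous result.
        current = [(t, u) for t, u in current if t >= view_time - size]
        results[size] = current
    return results


batch_events = [(0, "a"), (4, "b"), (5, "c"), (7, "d"), (9, "e"), (10, "f"), (11, "g")]
batch = window_batch(batch_events, view_time=10, window_sizes=[5, 3, 1])

for size, kept in batch.items():
    print(size, [t for t, _ in kept])
# 5 [5, 7, 9, 10]
# 3 [7, 9, 10]
# 1 [9, 10]
```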

  14. Temporal Range Analysis. Range of Interest = t4 -> t10; Interval = 2. [Timeline: Full History of the Graph, marked t0, t4, t6, t8, t10, ..., tn.]
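
A range analysis can be thought of as running the same analysis on a sequence of views spaced by the interval. The Python sketch below illustrates this with the slide's values (range t4 to t10, interval 2). The helpers `range_view_times` and `analyse_range` are hypothetical, both endpoints are assumed inclusive, and the "analysis" is simply a count of the updates visible in each view.

```python
def range_view_times(start, end, interval):
    """View timestamps from start to end (inclusive), stepping by interval."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return list(range(start, end + 1, interval))


def analyse_range(events, start, end, interval, analysis):
    """Run `analysis` on the events visible in each view of the range."""
    results = {}
    for view_time in range_view_times(start, end, interval):
        visible = [(t, u) for t, u in events if t <= view_time]
        results[view_time] = analysis(visible)
    return results


range_events = [(0, "a"), (3, "b"), (5, "c"), (7, "d"), (10, "e"), (12, "f")]

view_times = range_view_times(4, 10, 2)
counts = analyse_range(range_events, 4, 10, 2, len)

print(view_times)  # [4, 6, 8, 10]
print(counts)      # {4: 2, 6: 3, 8: 4, 10: 5}
```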

  15. Gab.ai Connected Components Every Hour Across Lifetime

  16. Gab.ai Connected Components Every Hour Across Lifetime: Largest Connected Component
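
For readers unfamiliar with the analysis behind these two charts, the Python sketch below shows connected components computed in a Pregel-like style: each vertex repeatedly adopts the smallest label among itself and its neighbours until nothing changes. It is a single-machine illustration on a toy graph, not Raphtory's distributed implementation and not the Gab.ai data; `connected_components` and `largest_component_size` are hypothetical names.

```python
from collections import Counter


def connected_components(vertices, edges):
    """Label every vertex with the smallest vertex id in its component."""
    neighbours = {v: set() for v in vertices}
    for src, dst in edges:
        # Treat edges as undirected for component membership.
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    labels = {v: v for v in vertices}
    changed = True
    while changed:  # one loop iteration plays the role of a superstep
        changed = False
        new_labels = {}
        for v in vertices:
            best = min([labels[v]] + [labels[n] for n in neighbours[v]])
            new_labels[v] = best
            if best != labels[v]:
                changed = True
        labels = new_labels
    return labels


def largest_component_size(labels):
    """Size of the component that contains the most vertices."""
    return max(Counter(labels.values()).values())


toy_vertices = [1, 2, 3, 4, 5]
toy_edges = [(1, 2), (2, 3), (4, 5)]

toy_labels = connected_components(toy_vertices, toy_edges)
print(toy_labels)                          # {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
print(largest_component_size(toy_labels))  # 3
```

Running such a function on one view per hour would give a time series comparable in shape to the charts, though the plotted results themselves come from the authors' system.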

  17. Using Raphtory
     • Available on GitHub: https://github.com/miratepuffin/raphtory
     • Includes starting documentation and tutorials
     • The README walks through a single-machine dockerised version that runs connected components over the Gab graph
     • Multiple spouts (parsing data from Gab, Twitter, Bitcoin, Ethereum)
     • Multiple analysis functions implemented (on views, ranges, windows)
        • Connected Components
        • Information Diffusion
        • Top Degree vertex rankings

  18. Future Roadmap and Getting Involved. Drop me a line at b.a.steer@qmul.ac.uk. Raise PRs/queries on GitHub: https://github.com/miratepuffin/raphtory
