X-Stream: Edge-centric Graph Processing using Streaming Partitions
Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel
Presented by: Marek Strelec
Motivation
- Large graphs: billions of vertices and edges
- Process on large clusters
  - Pregel, GraphLab, PowerGraph, Naiad
  - Complexity and cost
- Process on a single machine
  - GraphChi, X-Stream
  - 64 GB RAM, 32 cores, 2 x 200 GB SSD, 3 x 3 TB drives
Vertex-centric processing model
- "Think like a vertex"
- Popularized by the Pregel and GraphLab projects
- Mutable state stored in vertices
- Scatter-Gather model
  - Scatter updates along outgoing edges
  - Gather updates from incoming edges
Vertex-centric BFS [figure: example traversal, shown over several animation steps]
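As an illustration of the scatter-gather model above, here is a minimal sketch of vertex-centric BFS in Python. The names are illustrative, not the API of X-Stream, Pregel, or GraphLab; the point is that the per-vertex state is updated by following the adjacency structure, which is what makes the access pattern random.

```python
# Minimal sketch of vertex-centric BFS in the scatter-gather style.
# Names are illustrative; this is not the API of X-Stream, Pregel, or GraphLab.

def vertex_centric_bfs(graph, root):
    """graph maps each vertex to the list of its outgoing neighbours."""
    INF = float("inf")
    dist = {v: INF for v in graph}       # mutable state stored in the vertices
    dist[root] = 0
    active = {root}

    while active:
        # Scatter: every active vertex sends an update along its outgoing edges.
        updates = {}
        for v in active:
            for w in graph[v]:           # follows the graph structure -> random access
                updates.setdefault(w, []).append(dist[v] + 1)

        # Gather: every vertex folds its incoming updates into its state.
        active = set()
        for w, candidates in updates.items():
            best = min(candidates)
            if best < dist[w]:
                dist[w] = best
                active.add(w)
    return dist


if __name__ == "__main__":
    g = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(vertex_centric_bfs(g, 0))      # {0: 0, 1: 1, 2: 1, 3: 2}
```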
Sequential vs. Random access
- Graph traversal = random access
- For all storage media (RAM, SSD, and HDD), sequential bandwidth >> random-access bandwidth
  - HDD: 300x higher
  - SSD: 30x higher
  - RAM (1 core): 4.6x higher
  - RAM (16 cores): 1.8x higher
X-Stream processing model: edge-centric
- Input to X-Stream is an unordered set of directed edges
- Undirected graphs are represented as a pair of directed edges
- Scatter and Gather phases iterate over edges rather than vertices
- X-Stream makes graph access sequential
Edge-centric BFS [figure: example traversal, shown over several animation steps]
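The same traversal in edge-centric form, as a minimal Python sketch (illustrative names, not X-Stream's implementation): each pass streams the unordered edge list from start to finish, so edge access is purely sequential, at the cost of also scanning edges whose source carries no useful state yet.

```python
# Minimal sketch of edge-centric BFS: each scatter pass streams the unordered
# edge list sequentially. Illustrative only; not X-Stream's actual code.

def edge_centric_bfs(edges, num_vertices, root):
    INF = float("inf")
    dist = [INF] * num_vertices              # vertex state (still random access)
    dist[root] = 0
    changed = True
    while changed:                           # one scatter/gather pass per BFS level
        changed = False
        # Scatter: stream every edge; emit an update when the source is reachable.
        updates = [(dst, dist[src] + 1)
                   for src, dst in edges if dist[src] != INF]
        # Gather: apply the updates to the destination vertices.
        for dst, d in updates:
            if d < dist[dst]:
                dist[dst] = d
                changed = True
    return dist


if __name__ == "__main__":
    e = [(0, 1), (0, 2), (1, 3), (2, 3)]     # the order of edges is irrelevant
    print(edge_centric_bfs(e, 4, 0))         # [0, 1, 1, 2]
```

Note that every pass touches every edge; that extra work is what the edge-centric model trades for sequential bandwidth.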
Edge-centric properties
- Many sequential scans of the edge list
- The order of edges is irrelevant
- Tradeoff
  - Sequential access is faster
  - More Scatter/Gather iterations
  - The tradeoff pays off when the edge set is much larger than the vertex set
- Problem: still random access to the vertex set
Streaming partitions
- Partition the graph into streaming partitions (see the sketch below)
  - Vertex set: a subset of vertices that fits into RAM
  - Edge list: all edges whose source vertex is in the partition's vertex set
  - Update list: all updates whose destination vertex is in the partition's vertex set
- Streaming partitions can be processed in parallel
- Vertices (random access) => fast storage; edges (sequential access) => slow storage
- The number of partitions is crucial for performance
- Shuffle phase: updates must be re-arranged after the scatter phase
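A sketch of one scatter/shuffle/gather superstep over streaming partitions, continuing the BFS example. The partition-assignment function and in-memory layout below are assumptions for illustration only; in X-Stream itself, vertex state sits in fast storage while edge and update lists are streamed from slow storage.

```python
# Sketch of scatter -> shuffle -> gather over streaming partitions, assuming the
# vertex state of each partition fits in memory. Layout and names are illustrative.

INF = float("inf")

def make_partitions(num_vertices, num_partitions, edges):
    part_of = lambda v: v % num_partitions        # assumed vertex -> partition map
    parts = [{"vertices": set(), "edges": [], "updates": []}
             for _ in range(num_partitions)]
    for v in range(num_vertices):
        parts[part_of(v)]["vertices"].add(v)
    for src, dst in edges:                        # an edge lives with its *source* vertex
        parts[part_of(src)]["edges"].append((src, dst))
    return parts, part_of

def bfs_superstep(parts, part_of, dist):
    changed = False
    # Scatter: stream each partition's edge list sequentially; only the state of
    # that partition's own (source) vertices is read.
    out_updates = []
    for p in parts:
        for src, dst in p["edges"]:
            if dist[src] != INF:
                out_updates.append((dst, dist[src] + 1))
    # Shuffle: route every update to the partition owning its destination vertex.
    for p in parts:
        p["updates"].clear()
    for dst, d in out_updates:
        parts[part_of(dst)]["updates"].append((dst, d))
    # Gather: each partition applies its in-bound updates to its own vertex state.
    for p in parts:
        for dst, d in p["updates"]:
            if d < dist[dst]:
                dist[dst] = d
                changed = True
    return changed

if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
    parts, part_of = make_partitions(4, 2, edges)
    dist = [INF] * 4
    dist[0] = 0
    while bfs_superstep(parts, part_of, dist):
        pass
    print(dist)                                   # [0, 1, 1, 2]
```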
Scalability
- Increasing thread count
- Increasing number of I/O devices
- Across devices
- Traversal algorithms: BFS, WCC
- Multiplication algorithms: PageRank, SpMV
Comparison with Other Systems: Ligra
- Ligra
  - In-memory graph processing system
  - Requires pre-processing
Comparison with Other Systems: GraphChi
- GraphChi
  - Traditional vertex-centric approach
  - Out-of-core data structure (parallel sliding windows) to reduce the amount of random access to disk
  - Needs time to pre-sort the graph into shards
Criticism
- Assumes that the number of edges is larger than the number of vertices
- Performs well only on graphs with a low diameter
- Workload imbalance, as the partitions can have different numbers of edges assigned to them
  - Is work stealing sufficient?
Thank you!