One Trillion Edges: Graph Processing at Facebook-Scale
GraphHPC 2015, Moscow
Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambavi Muthukrishnan (Facebook)
Social Graph
Social Graph Example
Question: Are Jay and Sambavi friends?
Use Cases
• Ranking
• Features
• Recommendations
• Data Partitioning
[Figure: example social graph with edge scores 7.6, 9.3, 6.4, 8.2]
Benchmark Graphs vs. Social Graphs
[Chart: vertex and edge counts for common benchmark graphs — Clueweb 09, Twitter research, Friendster, Yahoo! web — versus the approximate 2015 Twitter and 2015 Facebook graphs]
The 2015 Facebook graph is roughly 70x larger than the benchmarks!
Requirements
• Efficient iterative computing model
• Easy to program and debug graph-based API
• Scale to real-world Facebook graph sizes (1B+ nodes and hundreds of billions of edges)
• Easily interoperable with existing data (Hive)
• Run multiple jobs in a multi-tenant environment
Apache Giraph: Maximum Vertex Example
[Figure: vertices with values 5, 1, and 2 on two processors exchange messages over successive supersteps until every vertex holds the maximum value, 5]
• Highly scalable graph processing engine loosely based on Pregel
• Combiners are used to aggregate message values
• Aggregators are global data generated on every superstep
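The maximum-vertex example above can be sketched as a tiny standalone simulation of the superstep model (plain Java, no Giraph dependency; the class and method names here are illustrative only, not Giraph's API):

```java
import java.util.*;

// Minimal "think like a vertex" sketch of the maximum-value example:
// every vertex repeatedly sends its value to its neighbors and adopts
// the largest value it has seen, until no value changes (all halt).
public class MaxValueExample {
  // graph: vertex id -> neighbor ids; values: vertex id -> current value
  public static Map<Integer, Integer> run(Map<Integer, List<Integer>> graph,
                                          Map<Integer, Integer> values) {
    boolean changed = true;
    while (changed) {                       // one iteration = one superstep
      changed = false;
      Map<Integer, Integer> inbox = new HashMap<>();
      for (Map.Entry<Integer, List<Integer>> e : graph.entrySet()) {
        int msg = values.get(e.getKey());   // send my value to all neighbors
        for (int nbr : e.getValue()) {
          inbox.merge(nbr, msg, Math::max); // "combiner": keep only the max
        }
      }
      for (Map.Entry<Integer, Integer> m : inbox.entrySet()) {
        if (m.getValue() > values.get(m.getKey())) {
          values.put(m.getKey(), m.getValue());
          changed = true;                   // a vertex woke up; iterate again
        }
      }
    }
    return values;
  }

  public static void main(String[] args) {
    Map<Integer, List<Integer>> graph = new HashMap<>();
    graph.put(5, List.of(1));               // edges mirror the slide figure
    graph.put(1, List.of(5, 2));
    graph.put(2, List.of(1));
    Map<Integer, Integer> values = new HashMap<>();
    for (int v : graph.keySet()) values.put(v, v); // value = own id
    System.out.println(run(graph, values)); // every vertex converges to 5
  }
}
```

The `inbox.merge` call plays the role of a combiner: only the maximum of all messages destined for a vertex is kept, which is exactly the aggregation the slide's bullet describes.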
Pipelines
[Stack diagram: applications (Data Pipelines Framework, Core, Analytics) run on the MapReduce execution framework (scheduler), backed by HDFS storage]
Architecture
• Loading the graph: an InputFormat produces input splits; each worker loads its splits and sends vertices/edges to the workers that own them, coordinated by the master
• Compute / Iterate: the in-memory graph is divided into partitions; each worker computes over its partitions and sends messages, then sends stats to the master, which decides whether to iterate
• Storing the graph: each worker writes its partitions out through an OutputFormat
Parallelization Model
• Worker parallelization: rely on the scheduling framework for parallelization (e.g. more mappers). Pros: simple
• Multithreading parallelization: multicore machines leverage up to (partitions / worker) threads. Pros: fewer connections, better memory usage (e.g. shared message buffering)
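The multithreading option can be sketched in plain Java (illustrative names only; Giraph manages this internally): one worker process holds several partitions and computes over them with a shared thread pool.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of per-worker multithreading: one worker holds several graph
// partitions and computes over them with a thread pool, so a multicore
// machine can use up to (partitions per worker) threads.
public class PartitionedWorker {
  // "Compute" over one partition: here, just sum its vertex values.
  static long computePartition(List<Long> partition) {
    long sum = 0;
    for (long v : partition) sum += v;
    return sum;
  }

  public static long superstep(List<List<Long>> partitions, int threads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> results = new ArrayList<>();
      for (List<Long> p : partitions) {        // one task per partition
        results.add(pool.submit(() -> computePartition(p)));
      }
      long total = 0;                          // aggregate partition results,
      for (Future<Long> f : results) {         // like stats sent to the master
        total += f.get();
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<List<Long>> partitions = List.of(
        List.of(1L, 2L), List.of(3L, 4L), List.of(5L, 6L), List.of(7L));
    System.out.println(superstep(partitions, 4)); // prints 28
  }
}
```

Because partitions are independent within a superstep, no locking is needed on the graph data itself, which is what makes this parallelization cheap.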
Efficient Java Object Support
• Edges >> vertices (> 2 orders of magnitude)
• Allow custom out-edge implementations
• Example: Java primitive arrays, FastUtil libraries
• Serialize messages into large byte arrays

/**
 * Interface for data structures that store out-edges for a vertex.
 *
 * @param <I> Vertex id
 * @param <E> Edge value
 */
public interface OutEdges<I extends WritableComparable, E extends Writable>
    extends Iterable<Edge<I, E>>, Writable {
  void initialize(Iterable<Edge<I, E>> edges);
  void initialize(int capacity);
  void initialize();
  void add(Edge<I, E> edge);
  void remove(I targetVertexId);
  int size();
}
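A minimal standalone sketch of the primitive-array idea behind such an out-edge implementation (this is not the real Giraph class; a plain long stands in for the Writable vertex-id type):

```java
import java.util.Arrays;

// Stores target vertex ids in a primitive long[] instead of boxed
// objects: no per-edge object headers or pointers, which matters when
// edges outnumber vertices by orders of magnitude.
public class LongArrayOutEdges {
  private long[] targets = new long[4];
  private int size = 0;

  public void add(long targetVertexId) {
    if (size == targets.length) {            // grow geometrically
      targets = Arrays.copyOf(targets, size * 2);
    }
    targets[size++] = targetVertexId;
  }

  public void remove(long targetVertexId) {  // remove all matching edges
    int w = 0;
    for (int r = 0; r < size; r++) {
      if (targets[r] != targetVertexId) {
        targets[w++] = targets[r];
      }
    }
    size = w;
  }

  public int size() {
    return size;
  }

  public long get(int i) {
    return targets[i];
  }
}
```

A boxed `List<Long>` costs roughly 20+ bytes per edge for object headers and pointers; the primitive array costs 8, which is the whole point of allowing custom out-edge classes.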
Page Rank
Map Reduce (Hadoop) vs. Giraph:

public class PageRankComputation extends BasicComputation<LongWritable,
    DoubleWritable, FloatWritable, DoubleWritable> {
  @Override
  public void compute(
      Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
      Iterable<DoubleWritable> messages) {
    // Calculate new page rank value
    if (getSuperstep() >= 1) {
      double sum = 0;
      for (DoubleWritable message : messages) {
        sum += message.get();
      }
      vertex.getValue().set(0.15d / getTotalNumVertices() + 0.85d * sum);
    }
    // Send page rank value to neighbors
    if (getSuperstep() < 30) {
      sendMessageToAllEdges(vertex,
          new DoubleWritable(vertex.getValue().get() / vertex.getNumEdges()));
    } else {
      voteToHalt();
    }
  }
}
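For reference, the same update rule can be run standalone on a tiny in-memory graph (a sketch mirroring the Giraph code above, not production code; the class name is illustrative):

```java
import java.util.*;

// Standalone PageRank with the same update rule as the Giraph example:
// rank = 0.15 / numVertices + 0.85 * (sum of incoming rank / out-degree).
public class PageRankSketch {
  public static double[] run(int[][] outEdges, int supersteps) {
    int n = outEdges.length;
    double[] rank = new double[n];
    Arrays.fill(rank, 1.0 / n);              // uniform initial value
    for (int step = 0; step < supersteps; step++) {
      double[] incoming = new double[n];
      for (int v = 0; v < n; v++) {          // "send to all edges"
        double msg = rank[v] / outEdges[v].length;
        for (int t : outEdges[v]) {
          incoming[t] += msg;                // message delivery
        }
      }
      for (int v = 0; v < n; v++) {          // compute() on each vertex
        rank[v] = 0.15 / n + 0.85 * incoming[v];
      }
    }
    return rank;
  }

  public static void main(String[] args) {
    // 3-vertex cycle plus a shortcut; every vertex has out-degree >= 1
    int[][] graph = { {1}, {2}, {0, 1} };
    System.out.println(Arrays.toString(run(graph, 30)));
  }
}
```

With no dangling vertices the ranks sum to 1.0 at every superstep, a quick sanity check on the arithmetic.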
Pregel Extensions
Pregel Model Limitations
• Difficult to construct “multi-stage” graph applications
• Hard to reuse code
Extensions
• Make Computation a first-class object
• Define computation on a master, a worker, and a vertex
• Master computation is executed centrally to set the computation and combiner for the workers
• All computations are now composable and reusable
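The master-driven composition can be sketched in plain Java (hedged: these are illustrative interfaces, not Giraph's actual MasterCompute API). The master picks which computation runs on each superstep, so stages become reusable building blocks:

```java
import java.util.*;

// Sketch of composable computations: the "master" chooses which
// computation executes on each superstep, so multi-stage algorithms
// are built by sequencing small, reusable stages.
public class MultiStageJob {
  interface Computation {          // one stage's per-value logic
    int compute(int value);
  }

  static final Computation DOUBLE = v -> v * 2;     // reusable stage A
  static final Computation INCREMENT = v -> v + 1;  // reusable stage B

  // Master computation: the schedule maps superstep number -> stage.
  public static int run(int initial, List<Computation> schedule) {
    int value = initial;
    for (Computation stage : schedule) {  // one entry per superstep
      value = stage.compute(value);
    }
    return value;
  }

  public static void main(String[] args) {
    // Stage order is data, not code: easy to rearrange and reuse.
    int result = run(3, List.of(DOUBLE, INCREMENT, DOUBLE));
    System.out.println(result); // (3*2 + 1) * 2 = 14
  }
}
```

Without this extension, a multi-stage algorithm has to encode its stage logic inside one monolithic compute method keyed on the superstep number, which is exactly the reuse problem the previous slide describes.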
First-Class Computation
Pregel (C++):

class Vertex {
 public:
  virtual void Compute(MessageIterator* msgs) = 0;
  ...
};

Giraph (Java):

public interface Computation<I, V, E, M1, M2> {
  void compute(Vertex<I, V, E> vertex, Iterable<M1> messages);
  ...
}