Prioritized Restreaming Algorithms for Balanced Graph Partitioning
Amel Awadelkarim (ameloa@stanford.edu)
Johan Ugander (jugander@stanford.edu)
Balanced graph partitioning
We want to partition a graph into node sets of approximately equal size while minimizing the number of edges cut. This problem arises in practice as an essential step for large-scale distributed graph computation.
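To make the two quantities in the problem statement concrete, the following minimal sketch computes the edge cut and a simple balance measure for a given assignment of nodes to shards. The function names `edge_cut` and `imbalance`, the toy graph, and the choice of "largest shard divided by ideal size" as the balance measure are illustrative assumptions, not definitions taken from the talk.

```python
from collections import Counter

def edge_cut(edges, assignment):
    """Count edges whose endpoints lie in different shards."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

def imbalance(assignment, k):
    """Largest shard size divided by the ideal size n / k (1.0 means perfectly balanced)."""
    sizes = Counter(assignment.values())
    ideal = len(assignment) / k
    return max(sizes.values()) / ideal

# Toy graph (assumed for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # one triangle per shard
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}   # alternating assignment

print(edge_cut(edges, good), imbalance(good, 2))
print(edge_cut(edges, bad), imbalance(bad, 2))
```

Both assignments are perfectly balanced, yet they differ in how many edges they cut, which is why balance alone does not determine partition quality.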
Existing algorithms
[Diagram: Global, Multilevel, Local]
The exact solution is infeasible to compute, so we focus on iterative local heuristics.
A new class of algorithms
[Diagram: Global, Multilevel, Local, Streaming, Prioritized]
Specifically, we explore the role of stream order in (re)streaming algorithms and introduce prioritized restreaming algorithms.
Contributions
1. A taxonomy of existing iterative techniques
2. Informative benchmarking that was absent from the literature
3. A paradigm shift in restreaming partitioning algorithms
Outline
Existing methods
Taxonomy
Prioritized restreaming
Results
  • Benchmark existing methods
  • Prioritized restreaming results
  • Correlation between stream orders
Existing methods
We present three algorithms from the literature: two based on label propagation and one restreaming algorithm.
Balanced label propagation
Ugander and Backstrom. WSDM. 2013.
[Figure: shards V1, V2, V3]
BLP begins from an initial partitioning and iteratively improves the edge-cut objective. At each iteration, BLP identifies which nodes want to move and to which shard, using the gain of node $u$,

\[ g_u = \max_{i \in [k]} N_{u,i} - N_{u,P(u)}, \]

where $N_{u,i}$ is the number of neighbors of $u$ in shard $i$ and $N_{u,P(u)}$ is the number of neighbors in the current shard assignment, $P(u)$. BLP places nodes in sorted move queues to their target shards in order of decreasing gain, then solves a linear program to determine how many of the top nodes in each queue to relocate.
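The gain computation and move queues of BLP can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `neighbor_counts` and `blp_move_queues`, the adjacency-dict representation, and the toy graph are hypothetical, ties are broken toward the lowest shard index, and the linear program that decides how many nodes to relocate from each queue is deliberately omitted.

```python
from collections import defaultdict

def neighbor_counts(adj, assignment, u, k):
    """N_{u,i}: number of neighbors of u currently in shard i, for every i."""
    counts = [0] * k
    for v in adj[u]:
        counts[assignment[v]] += 1
    return counts

def blp_move_queues(adj, assignment, k):
    """Build one queue per (source, target) shard pair, sorted by decreasing gain."""
    queues = defaultdict(list)
    for u in adj:
        counts = neighbor_counts(adj, assignment, u, k)
        current = assignment[u]
        target = max(range(k), key=lambda i: counts[i])
        gain = counts[target] - counts[current]  # g_u = max_i N_{u,i} - N_{u,P(u)}
        if gain > 0:  # only nodes that strictly prefer another shard are queued
            queues[(current, target)].append((gain, u))
    for queue in queues.values():
        queue.sort(key=lambda item: -item[0])
    return dict(queues)

# Toy graph (assumed): two triangles joined by the edge (2, 3), with node 2 misplaced.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
assignment = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
print(blp_move_queues(adj, assignment, 2))
```

In this toy example only node 2 has positive gain, so it is the only entry in the queue from shard 1 to shard 0. Whether it actually moves would depend on the balance constraints enforced by the omitted linear program.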
Social Hash partitioner
Kabiljo et al. VLDB. 2017. Shalita et al. NSDI. 2016.
[Figure: shards V1, V2, V3]
SHP also starts from an initial partitioning. At each iteration, every node is placed in the move queue of the shard that maximizes a modified form of gain, the maximum gain over shards other than the node's current shard assignment:

\[ g'_u = \max_{i \in [k] \setminus P(u)} N_{u,i} - N_{u,P(u)}. \]

Move queues are sorted by this quantity. Balance is maintained by swapping nodes between shard pairs, and a swap is made only when its net gain is positive.
The SHP algorithm boasts many bells and whistles. We denote this version KL-SHP and also study two restricted forms, SHP-I and SHP-II.
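The modified gain and the pairwise swapping step can be sketched as below. This is a rough illustration rather than the SHP implementation: the names `shp_gain` and `pair_swaps` are hypothetical, and the sketch assumes that the net gain of a swap is the sum of the two nodes' individual gains, which ignores the correction needed when the two swapped nodes are adjacent.

```python
def shp_gain(counts, current):
    """Modified gain: best gain over shards other than the current one (may be negative).

    counts[i] is the number of neighbors of the node in shard i.
    Returns (gain, target_shard).
    """
    target = max((i for i in range(len(counts)) if i != current),
                 key=lambda i: counts[i])
    return counts[target] - counts[current], target

def pair_swaps(queue_ij, queue_ji):
    """Pair the heads of two opposing queues while the summed gain stays positive.

    Each queue holds (gain, node) tuples sorted by decreasing gain.
    """
    swaps = []
    for (gain_u, u), (gain_v, v) in zip(queue_ij, queue_ji):
        if gain_u + gain_v <= 0:  # assumed net gain of the swap; stop once it is not positive
            break
        swaps.append((u, v))
    return swaps

# A node with 2 neighbors at home, 1 in shard 1, 0 in shard 2 still gets a target.
print(shp_gain([2, 1, 0], current=0))

# Hypothetical queues between shards i and j.
queue_ij = [(3, "a"), (1, "b"), (-1, "c")]
queue_ji = [(2, "x"), (-1, "y"), (-2, "z")]
print(pair_swaps(queue_ij, queue_ji))
```

Because every node is queued even when its gain is negative, a swap can pair a strongly motivated node with a mildly reluctant one, as long as the pair improves the objective overall. Swapping in pairs keeps shard sizes unchanged.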
Restreaming linear deterministic greedy
Nishimura and Ugander. KDD. 2013. Stanton and Kliot. KDD. 2012.
[Figure: shards V1, V2, V3]
ReLDG is a streaming algorithm and does not require an initial partitioning. It repeatedly streams over the nodes, assigning them one at a time to the shard selected by the given assignment mechanism.
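The slides do not spell out the assignment mechanism, so the sketch below assumes the linear deterministic greedy rule of Stanton and Kliot: a node goes to the shard that maximizes its neighbor count in that shard, multiplied by the shard's remaining capacity fraction. The function name `reldg`, the capacity of ceil(n / k), the tie-breaking toward the least-loaded shard, and the choice to count shard sizes only within the current pass are all assumptions made for illustration.

```python
import math

def reldg(adj, order, k, num_passes=3):
    """Restream the nodes num_passes times with an LDG-style rule (illustrative sketch)."""
    capacity = math.ceil(len(adj) / k)  # assumed per-shard capacity
    assignment = {}                     # most recent known shard of each node
    for _ in range(num_passes):
        sizes = [0] * k                 # nodes placed so far in this pass only
        for u in order:
            # Count neighbors by their most recent assignment: from this pass if
            # already streamed, otherwise from the previous pass (if any).
            counts = [0] * k
            for v in adj[u]:
                if v in assignment:
                    counts[assignment[v]] += 1
            # LDG score: neighbors in shard i, discounted by how full shard i is.
            # Ties are broken toward the least-loaded shard.
            best = max(range(k),
                       key=lambda i: (counts[i] * (1 - sizes[i] / capacity), -sizes[i]))
            assignment[u] = best
            sizes[best] += 1
    return assignment

# Toy graph (assumed): two triangles joined by the edge (2, 3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
order = [0, 1, 2, 3, 4, 5]
print(reldg(adj, order, k=2, num_passes=1))
```

The stream order is passed in explicitly as `order`, since the result of each pass depends on the sequence in which nodes are presented.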