Prioritized Restreaming Algorithms for Balanced Graph Partitioning - PowerPoint PPT Presentation

  1. Prioritized Restreaming Algorithms for Balanced Graph Partitioning. Amel Awadelkarim (ameloa@stanford.edu) and Johan Ugander (jugander@stanford.edu).

  2. Balanced graph partitioning: We want to partition a graph into node sets of approximately equal size while minimizing the number of edges cut.
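
The two quantities in this problem statement, edge cut and balance, can be made concrete with a small sketch. The function names `edge_cut` and `imbalance`, the toy graph, and the choice of "largest shard size over ideal size" as the balance measure are illustrative assumptions, not definitions taken from the slides.

```python
from collections import Counter


def edge_cut(edges, assignment):
    """Count edges whose endpoints lie in different shards."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def imbalance(assignment, k):
    """Largest shard size divided by the ideal size n / k (1.0 is perfectly balanced)."""
    sizes = Counter(assignment.values())
    return max(sizes.values()) / (len(assignment) / k)


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
alternating = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

print(edge_cut(edges, by_triangle), imbalance(by_triangle, 2))
print(edge_cut(edges, alternating), imbalance(alternating, 2))
```

Both assignments are perfectly balanced, but splitting along the triangles cuts far fewer edges, which is the trade-off a partitioner is trying to find.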

  3. Balanced graph partitioning: This problem arises in practice as an essential step for large-scale distributed graph computation.

  4. Existing algorithms (global, multilevel, local): The exact solution is infeasible to compute, so we focus on iterative local heuristics.

  6. A new class of algorithms (global, multilevel, local, streaming, prioritized): Specifically, we explore the role of stream order in (re)streaming algorithms and introduce prioritized restreaming algorithms.

  7. Contributions: (1) a taxonomy of existing iterative techniques; (2) informative benchmarking that was absent from the literature; (3) a paradigm shift in restreaming partitioning algorithms.

  8. Outline: existing methods; taxonomy; prioritized restreaming; results (benchmark of existing methods, prioritized restreaming results, correlation between stream orders).

  12. Existing methods: We present three algorithms from the literature, two based on label propagation and one restreaming algorithm.

  13. Balanced label propagation (Ugander and Backstrom, WSDM 2013). [Figure: shards V_1, V_2, V_3.] BLP begins from an initial partitioning and iteratively improves the edge-cut objective.

  14. Balanced label propagation (Ugander and Backstrom, WSDM 2013). At each iteration, BLP identifies which nodes desire to move and to where.

  15. Balanced label propagation (Ugander and Backstrom, WSDM 2013). Gain of node $u$: $g_u = \max_{i \in [k]} N_{u,i} - N_{u,P(u)}$, where $N_{u,i}$ is the number of neighbors of $u$ in shard $i$ and $N_{u,P(u)}$ is the number of neighbors in $u$'s current shard assignment, $P(u)$. BLP places nodes in sorted move queues to their target shards by order of decreasing gain.

  16. Balanced label propagation (Ugander and Backstrom, WSDM 2013). BLP then solves a linear program to determine how many of the top nodes in each queue to relocate.
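
The gain and move-queue steps of BLP described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `blp_move_queues`, the adjacency-dict input, and the tie-breaking rule (lowest shard index wins, so a node tied with its current shard may be left with zero gain) are hypothetical choices, and the linear program that decides how many queued nodes actually move is deliberately omitted.

```python
from collections import defaultdict


def blp_move_queues(adj, assignment, k):
    """Build BLP-style move queues.

    Returns {(source_shard, target_shard): [(gain, node), ...]} with each
    queue sorted by decreasing gain. Only nodes with positive gain
    g_u = max_i N_{u,i} - N_{u,P(u)} are queued.
    """
    queues = defaultdict(list)
    for u in adj:
        # N_{u,i}: number of neighbors of u currently in shard i.
        counts = [0] * k
        for v in adj[u]:
            counts[assignment[v]] += 1
        src = assignment[u]
        dst = max(range(k), key=lambda i: counts[i])  # assumption: ties go to the lowest index
        gain = counts[dst] - counts[src]
        if gain > 0:
            queues[(src, dst)].append((gain, u))
    for queue in queues.values():
        queue.sort(key=lambda item: -item[0])
    return dict(queues)


# Toy graph (hypothetical): two triangles joined by edge (2, 3); node 2 starts misplaced.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
assignment = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
print(blp_move_queues(adj, assignment, k=2))
```

On this toy input only node 2 has positive gain, so it is the single entry in the queue from shard 1 to shard 0. Whether it actually moves would depend on the balance constraints enforced by the linear program.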

  19. Social Hash partitioner (Kabiljo et al., VLDB 2017; Shalita et al., NSDI 2016). [Figure: shards V_1, V_2, V_3.] SHP also starts from an initial partitioning.

  20. Social Hash partitioner (Kabiljo et al., VLDB 2017; Shalita et al., NSDI 2016). At each iteration, we place all nodes in the move queue of the shard that maximizes a modified form of gain.

  21. Social Hash partitioner (Kabiljo et al., VLDB 2017; Shalita et al., NSDI 2016). Modified gain of node $u$, with the max taken over external shards: $g'_u = \max_{i \in [k] \setminus P(u)} N_{u,i} - N_{u,P(u)}$. This is the maximum gain outside of a node's current shard assignment, and move queues are sorted by this quantity.

  22. Social Hash partitioner (Kabiljo et al., VLDB 2017; Shalita et al., NSDI 2016). Balance is maintained by swapping nodes between shard pairs, only doing so when the net gain is positive.
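
A rough sketch of one SHP-style iteration, combining the modified gain with pairwise swaps, is given below. The function names `shp_gain` and `shp_swap_iteration` are hypothetical, and the swap rule is a simplification assumed for illustration: the "net gain" of a swap is approximated by the sum of the two queued gains, computed from the partition at the start of the iteration, without correcting for the case where the swapped nodes are adjacent. The sketch assumes at least two shards.

```python
from collections import defaultdict


def shp_gain(adj, assignment, u, k):
    """Return (g'_u, target shard): the best gain over shards other than P(u)."""
    counts = [0] * k
    for v in adj[u]:
        counts[assignment[v]] += 1
    src = assignment[u]
    dst = max((i for i in range(k) if i != src), key=lambda i: counts[i])
    return counts[dst] - counts[src], dst


def shp_swap_iteration(adj, assignment, k):
    """One iteration: queue every node, then swap queue heads between shard pairs."""
    queues = defaultdict(list)
    for u in adj:
        gain, dst = shp_gain(adj, assignment, u, k)
        queues[(assignment[u], dst)].append((gain, u))
    for queue in queues.values():
        queue.sort(key=lambda item: -item[0])

    new_assignment = dict(assignment)
    for i in range(k):
        for j in range(i + 1, k):
            # Pair the t-th best node moving i -> j with the t-th best moving j -> i.
            for (g_u, u), (g_v, v) in zip(queues[(i, j)], queues[(j, i)]):
                if g_u + g_v <= 0:  # assumed net-gain test; later pairs are no better
                    break
                new_assignment[u], new_assignment[v] = j, i
    return new_assignment


# Toy graph (hypothetical): two triangles joined by edge (2, 3); nodes 2 and 3 start swapped.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
start = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}
print(shp_swap_iteration(adj, start, k=2))
```

Because nodes only move in matched pairs, shard sizes are unchanged by construction, which is how this style of update keeps the partition balanced without a linear program.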

  25. Social Hash partitioner (Kabiljo et al., VLDB 2017; Shalita et al., NSDI 2016). The SHP algorithm boasts many bells and whistles. We denote this version KL-SHP and also study two restricted forms, SHP-I and SHP-II.

  26. Restreaming linear deterministic greedy (Nishimura and Ugander, KDD 2013; Stanton and Kliot, KDD 2012). [Figure: shards V_1, V_2, V_3.] ReLDG is a streaming algorithm and does not require an initial partitioning.

  27. Restreaming linear deterministic greedy (Nishimura and Ugander, KDD 2013; Stanton and Kliot, KDD 2012). It repeatedly streams nodes one at a time, assigning each node to the shard selected by the given assignment mechanism.
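
The slide does not spell out the assignment mechanism. As a sketch, the code below assumes the linear deterministic greedy rule of Stanton and Kliot, which sends a node to the shard maximizing (number of neighbors already in the shard) x (1 - shard size / capacity), and restreams by letting later passes see the previous pass's assignment for neighbors not yet revisited. The function name `reldg`, the integer capacity `ceil(n / k)`, and the least-loaded tie-break are illustrative assumptions.

```python
import math


def reldg(adj, order, k, passes=3):
    """Restream `order` several times with an LDG-style assignment rule."""
    capacity = math.ceil(len(adj) / k)  # assumed hard capacity per shard
    assignment = {}
    for _ in range(passes):
        previous, assignment = assignment, {}
        sizes = [0] * k  # shard sizes restart at zero on every pass
        for u in order:
            counts = [0] * k
            for v in adj[u]:
                # Prefer this pass's assignment; fall back to the previous pass.
                shard = assignment.get(v, previous.get(v))
                if shard is not None:
                    counts[shard] += 1
            scores = [counts[i] * (1 - sizes[i] / capacity) for i in range(k)]
            # Assumption: ties are broken toward the least-loaded shard.
            best = max(range(k), key=lambda i: (scores[i], -sizes[i]))
            assignment[u] = best
            sizes[best] += 1
    return assignment


# Toy graph (hypothetical): two triangles joined by edge (2, 3), streamed in node order.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(reldg(adj, order=[0, 1, 2, 3, 4, 5], k=2, passes=1))
```

The `order` argument is where stream order enters: the same rule applied to a different ordering of the nodes can produce a different partition, which is the effect the talk sets out to study.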
