SLIDE 1
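Single Source Shortest Paths (SSSP)
Directed graph G = (V, E)
Edge weights w(u, v)
Shortest path from s to t: a path P = (v_0, v_1, …, v_k) of minimum weight w(P), where v_0 = s, v_k = t and every (v_{i−1}, v_i) is an edge.
[Figure: example directed graph with edge weights.]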
SLIDE 2
SLIDE 3
Single Source Shortest Paths (SSSP)
Parent of a vertex
p(v) = vertex just before v on the shortest path from s
Shortest paths tree
Formed by the edges (p(v), v)
Example (figure): p(s) = -, p(t) = s, p(x) = t, p(y) = t, p(z) = y
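Since the shortest paths tree is stored only as parent pointers, a path is recovered by walking from a vertex back to the source. A minimal sketch, assuming the parents are kept in a dictionary; the function name `path_from_parents` is hypothetical, and the parent values are the ones from the example tree on this slide.

```python
def path_from_parents(parent, v):
    """Follow parent pointers from v back to the source; return the path source -> v."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    path.reverse()
    return path


# Parent pointers of the example tree; None marks the source s.
parent = {"s": None, "t": "s", "x": "t", "y": "t", "z": "y"}
print(path_from_parents(parent, "z"))
```

Each path costs time proportional to its length, so the tree needs only one pointer per vertex.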
SLIDE 4
Single Source Shortest Paths (SSSP)
Temporary distances
d(v) = upper bound for the weight of the shortest path from s to v
Initialize
p(v) ← null, d(v) ← ∞ for all v ≠ s
p(s) ← null, d(s) ← 0
Edge relaxation
relax(u, v)
if d(v) > d(u) + w(u, v) then { d(v) ← d(u) + w(u, v); p(v) ← u }
Example (figure): with w(u, v) = 2 and d(u) = 5, relaxing (u, v) lowers d(v) = 8 to d(v) = 7, while d(v) = 6 stays unchanged.
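The initialization and relaxation rules can be sketched in a few lines. This is an illustrative version, assuming distances and parents live in dictionaries; the names `initialize` and `relax` and the boolean return value are choices made here, not part of the slides.

```python
import math


def initialize(vertices, s):
    """Set d(v) = infinity and p(v) = null for every vertex, then d(s) = 0."""
    d = {v: math.inf for v in vertices}
    p = {v: None for v in vertices}
    d[s] = 0
    return d, p


def relax(u, v, w_uv, d, p):
    """Relax edge (u, v) of weight w_uv; return True if d(v) improved."""
    if d[v] > d[u] + w_uv:
        d[v] = d[u] + w_uv
        p[v] = u
        return True
    return False


# The two cases from the slide: w(u, v) = 2 and d(u) = 5.
d = {"u": 5, "v": 8}
p = {"u": None, "v": None}
improved_first = relax("u", "v", 2, d, p)   # 8 > 5 + 2, so d(v) is lowered
d_after_first = d["v"]

d["v"] = 6
improved_second = relax("u", "v", 2, d, p)  # 6 <= 5 + 2, nothing changes
```

Returning whether the distance changed is convenient for iterative algorithms that stop when no relaxation succeeds.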
SLIDE 5
Single Source Shortest Paths (SSSP)
Dijkstra's Algorithm
Used when edge weights are non-negative.
It maintains a set of vertices S ⊆ V for which a shortest path has been computed, i.e., the value of d(v) is the exact weight of the shortest path to v.
Each iteration selects a vertex u ∈ V \ S with minimum distance d(u). Then we set S ← S ∪ {u} and relax all edges (u, v).
To find u with min d(u): use a priority queue Q with keys d(v).
SLIDE 6
Single Source Shortest Paths (SSSP)
Dijkstra's Algorithm
Initialization
p(v) ← null, d(v) ← ∞ for all v ≠ s
p(s) ← null, d(s) ← 0
insert all vertices v into priority queue Q with key d(v)
set S ← ∅
Main Loop
while Q is not empty {
    u ← Q.delMin()
    S ← S ∪ {u}
    for all edges (u, v) { relax(u, v) }
}
SLIDE 7
Single Source Shortest Paths (SSSP)
Dijkstra's Algorithm
Running time, depending on the priority queue Q (n vertices, m edges):
priority queue Q    running time
array               O(n²)
binary heap         O(m log n)
Fibonacci heap      O(m + n log n)
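A binary-heap version can be sketched with Python's `heapq`. One assumption to flag: `heapq` has no decrease-key operation, so this sketch pushes a fresh entry whenever a distance improves and skips stale entries on removal, instead of inserting every vertex up front as in the pseudocode. The bound stays O(m log n). The graph and its weights are illustrative.

```python
import heapq
import math


def dijkstra(graph, s):
    """graph maps each vertex to a list of (neighbor, weight); weights are non-negative."""
    d = {v: math.inf for v in graph}
    p = {v: None for v in graph}
    d[s] = 0
    done = set()                 # the set S of finished vertices
    heap = [(0, s)]              # priority queue Q of (key, vertex) entries
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue             # stale entry left over from an earlier, larger key
        done.add(u)
        for v, w in graph[u]:
            if d[v] > du + w:    # relax(u, v)
                d[v] = du + w
                p[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, p


# Illustrative graph; every vertex appears as a key, even with no outgoing edges.
graph = {
    "s": [("t", 3), ("x", 7)],
    "t": [("x", 1), ("y", 4)],
    "x": [("z", 6)],
    "y": [("z", 1)],
    "z": [],
}
dist, par = dijkstra(graph, "s")
```

The heap may hold up to m entries, but log m = O(log n), so the asymptotic cost matches the binary-heap row of the table.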
SLIDE 8
Single Source Shortest Paths (SSSP) in Map-Reduce
• Not easy to parallelize Dijkstra's algorithm
• Use an iterative approach instead
- The distance d(v) from s to v is updated by the distances of all u with (u, v) ∈ E.
- Need to communicate both distances and adjacency lists.
d(v) ← min{ d(u) + w(u, v) | (u, v) ∈ E }
[Figure: vertex v with incoming edges from x, y, z of weights w(x, v), w(y, v), w(z, v).]
SLIDE 9
Single Source Shortest Paths (SSSP) in Map-Reduce
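Mapper: emits distances and graph structure. For each edge leaving v, it sends d(v) plus the weight of that edge to the edge's endpoint.
Reducer: updates distances and emits graph structure.
d(v) ← min{ d(u) + w(u, v) | (u, v) ∈ E }
[Figures: the mapper at v sending candidate distances along its outgoing edges; the reducer at v taking the minimum over incoming edges from x, y, z.]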
SLIDE 10
Single Source Shortest Paths (SSSP) in Map-Reduce
• Use an iterative approach instead
- Repeat rounds until all distances are fixed.
- Number of rounds = n − 1 in the worst case.
- If all weights are equal then we compute the Breadth-First Search (BFS) tree. Number of rounds = graph diameter in the worst case.
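The rounds can be simulated in a single process to see how distances and adjacency lists travel together. This is a sketch under stated assumptions, not a Hadoop job: `mapper`, `reducer`, `run_round` and `sssp` are hypothetical names, the shuffle is a plain dictionary, parent pointers are omitted, and the loop stops after the first round that changes nothing.

```python
import math
from collections import defaultdict


def mapper(v, record):
    """Emit the graph structure, v's own distance, and one candidate per outgoing edge."""
    d_v, adj = record
    yield v, ("graph", adj)
    yield v, ("dist", d_v)
    if d_v < math.inf:
        for x, w in adj:
            yield x, ("dist", d_v + w)


def reducer(v, values):
    """Keep the adjacency list and the minimum candidate distance for v."""
    adj, best = [], math.inf
    for kind, payload in values:
        if kind == "graph":
            adj = payload
        else:
            best = min(best, payload)
    return v, (best, adj)


def run_round(state):
    groups = defaultdict(list)              # stands in for the shuffle phase
    for v, record in state.items():
        for key, value in mapper(v, record):
            groups[key].append(value)
    return dict(reducer(v, vals) for v, vals in groups.items())


def sssp(graph, s):
    state = {v: (0 if v == s else math.inf, adj) for v, adj in graph.items()}
    rounds = 0
    while True:
        new_state = run_round(state)
        rounds += 1
        if new_state == state:              # all distances are fixed
            return {v: d for v, (d, _) in new_state.items()}, rounds
        state = new_state


# Illustrative graph: vertex -> list of (neighbor, weight).
graph = {
    "s": [("t", 3), ("x", 7)],
    "t": [("x", 1), ("y", 4)],
    "x": [("z", 6)],
    "y": [("z", 1)],
    "z": [],
}
dist, rounds = sssp(graph, "s")
```

Every vertex re-sends candidates along all of its edges in every round, even when its distance has not changed, which is the redundant work that makes this approach brute-force.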
SLIDE 11
BFS in Map-Reduce
SLIDE 12
Single Source Shortest Paths (SSSP) in Map-Reduce
Remarks on Map-Reduce SSSP algorithm
- Essentially a brute-force algorithm.
- Performs many unnecessary computations.
- No global data structure.
SLIDE 13
PageRank in Map-Reduce
Recall the formula for the PageRank R(u) of a webpage u:
R(u) = d · Σ_{v ∈ B_u} R(v) / N_v + (1 − d) · E_u
B_u = set of pages that point to u
F_u = set of pages that u points to
|F_u| = N_u = number of links from u
E_u = probabilities over web pages
E_u and d are user-designed parameters
SLIDE 14
PageRank in Map-Reduce
Iterative computation
start with seed values R_0(v) for each page v
each page v receives credit from the pages in B_v and computes R_{i+1}(v)
each page v distributes credit to the pages in F_v
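One round of this credit passing can be sketched as a map step followed by a reduce step. The sketch assumes every page has at least one outgoing link (no dangling pages), a damping parameter of 0.85 and a uniform probability vector; the function name `pagerank_round`, the three-page link graph and the round count are illustrative only.

```python
from collections import defaultdict


def pagerank_round(links, rank, damping, e):
    """One round: each page sends rank / out-degree to the pages it links to."""
    credits = defaultdict(list)
    for u, out in links.items():            # map: distribute credit
        credits[u].append(0.0)              # make sure u shows up even with no in-links
        for v in out:
            credits[v].append(rank[u] / len(out))
    # reduce: combine received credit with the (1 - damping) term
    return {u: damping * sum(c) + (1 - damping) * e[u] for u, c in credits.items()}


# Illustrative link graph: page -> pages it points to.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
e = {page: 1 / len(links) for page in links}   # uniform probabilities over pages
rank = dict(e)                                  # seed values
for _ in range(50):
    rank = pagerank_round(links, rank, 0.85, e)
```

With no dangling pages the ranks keep summing to 1 from round to round, which gives a quick sanity check on an implementation.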
SLIDE 15
PageRank in Map-Reduce
SLIDE 16
Algorithms and Complexity in MapReduce (and related models)
Sorting, Searching, and Simulation in the MapReduce Framework
- M. T. Goodrich, N. Sitchinava, and Q. Zhang, ISAAC 2011
Fast Greedy Algorithms in MapReduce and Streaming
- R. Kumar, B. Moseley, S. Vassilvitskii, and A. Vattani, SPAA 2013
On the Computational Complexity of MapReduce
- B. Fish, J. Kun, A. D. Lelkes, L. Reyzin, and G. Turan, DISC 2015
SLIDE 17
- L. G. Valiant, A Bridging Model for Parallel Computation, Communications of the ACM, 1990
Computational model of parallel computation
BSP is a parallel programming model based on Synchronizer Automata.
The model consists of:
- Set of processor-memory pairs.
- Communications network that delivers messages in a point-to-point
manner.
- Mechanism for the efficient barrier synchronization for all or a subset of
the processes.
- No special combining, replicating, or broadcasting facilities.
BSP model
SLIDE 18
- Vertical Structure
  Supersteps:
  - Local computation
  - Process communication
  - Barrier synchronization
- Horizontal Structure
  - Concurrency among a fixed number of virtual processors.
  - Processes do not have a particular order.
  - Locality plays no role in the placement of processes on processors.
[Figure: virtual processors passing through local computation, global communication, and barrier synchronization.]
Implementation: BSPlib
BSP model
SLIDE 19
Simulation on MapReduce:
1. Create a tuple for each memory cell and processor.
2. Map each message to the destination processor label.
3. Reduce by performing one step of a processor, outputting the messages for next round.
Theorem [Goodrich et al.]: Given a BSP algorithm A that runs in R supersteps with a total memory size N using P ≤ N processors, we can simulate A using O(R) rounds and message complexity O(RN) in the memory-bound MapReduce framework with reducer memory size bounded by N/P.
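The three simulation steps can be sketched in a single process. This is a simplified illustration, not the construction from the paper: each processor's memory is collapsed into one state value rather than one tuple per memory cell, and `simulate_bsp`, `sum_to_zero` and the `step` signature are hypothetical names chosen here.

```python
from collections import defaultdict


def simulate_bsp(step, states, supersteps):
    """Run `supersteps` rounds; each round is one map (routing) and one reduce (compute)."""
    messages = []                                   # (destination label, payload) pairs
    for r in range(supersteps):
        groups = defaultdict(lambda: {"state": None, "inbox": []})
        # Map: key every state and every message by its processor label.
        for pid, state in states.items():
            groups[pid]["state"] = state
        for dest, payload in messages:
            groups[dest]["inbox"].append(payload)
        # Reduce: perform one step of each processor and collect its outgoing messages.
        states, messages = {}, []
        for pid, group in groups.items():
            new_state, outgoing = step(pid, group["state"], group["inbox"], r)
            states[pid] = new_state
            messages.extend(outgoing)
    return states


def sum_to_zero(pid, state, inbox, r):
    """Toy BSP program: superstep 0 sends every value to processor 0, superstep 1 adds them."""
    if r == 0:
        return state, [(0, state)]
    if pid == 0:
        return sum(inbox), []
    return state, []


result = simulate_bsp(sum_to_zero, {0: 4, 1: 7, 2: 1, 3: 5}, supersteps=2)
```

The barrier between supersteps corresponds to the boundary between rounds: messages produced in one reduce are only visible to the next round's map.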
MapReduce simulation of a BSP program
SLIDE 20
A corollary of the above: Given the optimal BSP algorithm of [Goodrich, 99], we can sort N values in the MapReduce framework in O(k) rounds and O(kN) message complexity.
MapReduce simulation of a BSP program
SLIDE 21