Parallel Traveling Salesman PhD Student: Viet Anh Trinh Advisor: Professor Feng Gu www.gc.cuny.edu
Agenda
1. Traveling salesman introduction
2. Genetic Algorithm for TSP
3. Tree Search for TSP
Travelling Salesman
- Set of N cities
- Find the shortest closed, non-looping path that covers all the cities
- No city is visited more than once
Travelling Salesman First Parallel Approach: Genetic Algorithm
Travelling Salesman - Sequential Genetic Algorithm
- Initialization: 0123, 0231, 1320, 0321
- Fitness Evaluation:
  (0123) = 1/(1 + 2 + 10 + 7) = 0.050
  (0231) = 1/(3 + 10 + 4 + 5) = 0.045
  (1320) = 1/(6 + 12 + 1 + 1) = 0.050
  (0321) = 1/(8 + 12 + 18 + 5) = 0.023
- Selection
- Cross-over
- Mutation
- Termination
Sequential GA Travelling Salesman
- Individuals: closed non-looping paths across all cities
- Initial population: set of randomly generated paths
- Evaluation: assess the fitness of each individual. Fitness is 1 / (total distance of the given path)
- Selection: select the fittest individuals (biggest fitness, smallest distance)
- Offspring production: cross-over + mutation
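The fitness definition can be checked against the four example paths with a minimal sketch. The directed distance matrix `DIST` is not given on the slides; it is an assumption, inferred from the edge costs that appear in the fitness sums. The names `tour_length` and `fitness` are illustrative.

```python
# Directed edge costs DIST[a][b] for travelling a -> b.
# Assumption: inferred from the fitness sums on the slides, not stated there.
DIST = [
    [0, 1, 3, 8],
    [5, 0, 2, 6],
    [1, 18, 0, 10],
    [7, 4, 12, 0],
]

def tour_length(path, dist=DIST):
    """Cost of the closed tour path[0] -> ... -> path[-1] -> path[0]."""
    n = len(path)
    return sum(dist[path[i]][path[(i + 1) % n]] for i in range(n))

def fitness(path, dist=DIST):
    """Fitness is the reciprocal of the total tour distance."""
    return 1.0 / tour_length(path, dist)

population = [[0, 1, 2, 3], [0, 2, 3, 1], [1, 3, 2, 0], [0, 3, 2, 1]]
lengths = [tour_length(p) for p in population]
```

With this matrix the tour lengths come out as 20, 22, 20 and 43, matching the denominators in the fitness evaluation.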
Selection
Roulette Wheel Selection over the 1st, 2nd, 3rd and 4th paths:
P(choice – (0123)) = 0.05/(0.05 + 0.045 + 0.05 + 0.023) ≈ 0.30
P(choice – (0231)) ≈ 0.27
P(choice – (1320)) ≈ 0.30
P(choice – (0321)) ≈ 0.13
(values rounded so that the probabilities sum to 1)
If a random number s in [0, 1) falls in:
0 ≤ s < 0.3: choose 0123
0.3 ≤ s < 0.57: choose 0231
0.57 ≤ s < 0.87: choose 1320
0.87 ≤ s < 1: choose 0321
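The interval lookup above can be sketched as a cumulative-sum search. The function names `pick` and `roulette_select` are illustrative, and the clamp on the index is an added guard against floating-point rounding in the last cumulative value.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pick(population, probabilities, s):
    """Return the individual whose cumulative interval contains s in [0, 1)."""
    cumulative = list(accumulate(probabilities))
    # bisect_right sends s == boundary to the next interval (lower bound inclusive).
    index = min(bisect_right(cumulative, s), len(population) - 1)
    return population[index]

def roulette_select(population, fitnesses, rng=random):
    """Select one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    probabilities = [f / total for f in fitnesses]
    return pick(population, probabilities, rng.random())

paths = ["0123", "0231", "1320", "0321"]
probs = [0.30, 0.27, 0.30, 0.13]
chosen = [pick(paths, probs, s) for s in (0.10, 0.50, 0.60, 0.90)]
```

Here `chosen` contains one path from each of the four intervals, in order.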
Specialized Crossover Operator
- Normal crossover: invalid paths can appear (a city may be repeated or missing)
- Order Crossover (OX): no invalid paths
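A minimal sketch of one common formulation of order crossover follows; the slides do not say which variant was used, so the details are an assumption. The child keeps the slice `p1[a:b]` in place and fills the remaining positions, starting after the second cut and wrapping around, with the cities of `p2` in the order they occur from that same cut.

```python
def order_crossover(p1, p2, a, b):
    """Order crossover (OX) with cut points a < b; returns a valid permutation."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]                  # inherit the middle segment from p1
    taken = set(p1[a:b])
    # Cities of p2 read cyclically from the second cut, skipping inherited ones.
    donors = [p2[(b + i) % n] for i in range(n) if p2[(b + i) % n] not in taken]
    for k, city in enumerate(donors):
        child[(b + k) % n] = city         # fill cyclically from the second cut
    return child

parent1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
parent2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
child = order_crossover(parent1, parent2, 3, 7)
```

Because every city is copied exactly once, the child is always a permutation, which is why no repair step is needed.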
Mutation
- Select 2 random points and swap them
- Ensures the path stays valid
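Swap mutation can be sketched in a few lines. The name `swap_mutation` and the optional `rng` parameter are illustrative, not from the slides.

```python
import random

def swap_mutation(path, rng=random):
    """Return a copy of path with two distinct random positions swapped."""
    i, j = rng.sample(range(len(path)), 2)   # two different indices
    mutated = list(path)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

original = [0, 1, 2, 3, 4]
mutated = swap_mutation(original, random.Random(1))
```

A swap only exchanges two cities, so the result is still a permutation of the same cities.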
Sequential Genetic Algorithm
Parallel Genetic Algorithm
Master: Initialization 0123, 0231, 1320, 0321
Slave 1 and Slave 2 each run: Fitness Evaluation, Selection, Cross-over, Mutation, Termination
Fitness Evaluation:
(0123) = 1/(1 + 2 + 10 + 7) = 0.050
(0231) = 1/(3 + 10 + 4 + 5) = 0.045
(1320) = 1/(6 + 12 + 1 + 1) = 0.050
(0321) = 1/(8 + 12 + 18 + 5) = 0.023
Parallel Travelling Salesman - Master
• Master
- Initializes the population
- Sends paths to the slaves
- Examines the best paths returned by the slaves
• Slave
- Signals the master that it is ready for work
- Waits for paths to be sent by the master until a termination message is received
- Evaluates the fitness of the paths
- Selection
- Crossover
- Mutation
- Sends the best c paths to nearby neighbors after every k generations
- When finished, sends the best paths and their lengths to the master
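The exchange of the best c paths between slaves can be sketched without any message passing by treating each slave's subpopulation as a list. Two details are assumptions, since the slides only say "nearby neighbors": a ring topology, and replacement of the receiver's worst c paths. `DIST` is the distance matrix inferred from the earlier fitness sums.

```python
# Assumed directed distance matrix (inferred from the fitness sums, not given).
DIST = [[0, 1, 3, 8], [5, 0, 2, 6], [1, 18, 0, 10], [7, 4, 12, 0]]

def tour_length(path, dist=DIST):
    n = len(path)
    return sum(dist[path[i]][path[(i + 1) % n]] for i in range(n))

def migrate_ring(islands, c, cost=tour_length):
    """One migration step: island i receives the best c paths of island i-1."""
    # Choose emigrants first so every island sends from its pre-migration state.
    emigrants = [sorted(pop, key=cost)[:c] for pop in islands]
    result = []
    for i, pop in enumerate(islands):
        incoming = emigrants[(i - 1) % len(islands)]
        survivors = sorted(pop, key=cost)[:len(pop) - c]   # drop the worst c
        result.append(survivors + [list(p) for p in incoming])
    return result

islands = [[[0, 1, 2, 3], [0, 3, 2, 1]], [[0, 2, 3, 1], [0, 2, 1, 3]]]
after = migrate_ring(islands, c=1)
```

In a real master/slave program each slave would call this logic every k generations, with the list copies replaced by sends and receives.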
Time Complexity: Sequential
n: population size
l: length of a path (number of cities)
g: number of generations
Sequential Genetic Algorithm:
- Initialization: O(n)
- Evaluation: O(nl)
- Selection: O(nl)
- Crossover: c1 × O(nl)
- Mutation: c2 × O(nl)
Time: O(nl) + g × O(nl) = O(gnl)
Time Complexity: Parallel
• Isolated subpopulations
• Stepping-stone model: only the best individuals are sent to neighbor processors
• Communication time
- Master sends data to slaves using scatter: t_comm1 = O(nl/p)
- Slaves send their best c paths to a neighbor processor after every k generations: t_comm2 = (g/k) × O(cl) = (g/k) × O(l) for constant c
- Slaves send their c best paths and their lengths to the master: t_comm3 = O(cl)
• Computation time
- Master initialization: t_comp1 = O(n)
- Slave evaluation, selection, crossover, mutation: t_comp2 = O(gnl/p)
- Master final evaluation: t_comp3 = O(pc)
• Parallel time: t_p = O(gnl/p)
• Speedup = t_s / t_p = p
• Efficiency = t_s / (p × t_p) = 1
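The step from the individual terms to the parallel time can be written out as follows. This is a sketch under one stated assumption: the slave computation term O(gnl/p) dominates every other term, which requires c to be a constant and p to be small relative to the total work.

```latex
\begin{align*}
t_p &= t_{comp1} + t_{comm1} + t_{comp2} + t_{comm2} + t_{comm3} + t_{comp3} \\
    &= O(n) + O\!\left(\frac{nl}{p}\right) + O\!\left(\frac{gnl}{p}\right)
       + \frac{g}{k}\,O(cl) + O(cl) + O(pc) \\
    &= O\!\left(\frac{gnl}{p}\right)
       \quad \text{(assuming the slave computation term dominates)} \\
S   &= \frac{t_s}{t_p} = \frac{O(gnl)}{O(gnl/p)} = p,
\qquad
E = \frac{S}{p} = 1
\end{align*}
```

The ideal speedup and efficiency are therefore asymptotic statements; when p grows, the O(pc) and migration terms stop being negligible.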
Travelling Salesman Second Parallel Approach: Tree Search
Travelling Salesman
Travelling Salesman – Tree Search
Travelling Salesman Sequential Algorithm
Travelling Salesman Sequential Algorithm
- City count: checks whether the partial tour already contains n cities.
- Best tour: checks whether the complete tour has a lower cost than the current "best tour".
- Update best tour: replaces the current best tour with this tour.
- Feasible: checks whether the city (vertex) has already been visited.
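The four operations above fit together in a stack-based depth-first search. This is a minimal sketch, not the presenter's code: the distance matrix is the one inferred from the earlier fitness sums (an assumption), and no cost-based pruning is done because the slide describes `feasible` only as a visited check.

```python
# Assumed directed distance matrix (inferred from the fitness sums, not given).
DIST = [[0, 1, 3, 8], [5, 0, 2, 6], [1, 18, 0, 10], [7, 4, 12, 0]]

def feasible(tour, city):
    """A city may be added only if it has not been visited yet."""
    return city not in tour

def tsp_tree_search(dist, start=0):
    """Exhaustive depth-first search over partial tours kept on a stack."""
    n = len(dist)
    best_tour, best_cost = None, float("inf")
    stack = [[start]]
    while stack:
        tour = stack.pop()
        if len(tour) == n:                                   # city count
            cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
            cost += dist[tour[-1]][start]                    # close the tour
            if cost < best_cost:                             # best tour
                best_tour, best_cost = tour + [start], cost  # update best tour
        else:
            # Push in reverse so lower-numbered cities are explored first.
            for city in range(n - 1, -1, -1):
                if feasible(tour, city):
                    stack.append(tour + [city])
    return best_tour, best_cost

best_tour, best_cost = tsp_tree_search(DIST)
```

The explicit stack matters for the parallel version, because a stack of partial tours is exactly what a donor process can split and send.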
Travelling Salesman Sequential
Travelling Salesman Parallel
Static load balancing (picture) → load imbalance
Solution → dynamic load balancing
Travelling Salesman Parallel
Travelling Salesman Parallel
Terminology
• Donor process: the process that sends work
• Recipient process: the process that requests/receives work
• Half-split: ideally, the stack is split into two equal pieces such that the search space of each stack is the same
• Cutoff depth: to avoid sending very small amounts of work, nodes beyond a specified stack depth are not given away
Travelling Salesman Parallel
Some possible strategies
1. Send nodes near the bottom of the stack
• Works well with a uniform search space; has low splitting cost
2. Send nodes near the cutoff depth
• Performs better with a strong heuristic (tries to distribute the parts of the search space likely to contain a solution)
3. Send half the nodes between the bottom and the cutoff depth
• Works well with uniform and irregular search spaces
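Strategy 3 can be sketched as a function that splits a stack of partial tours. The details are assumptions rather than anything stated on the slides: depth is measured as the number of cities in a partial tour, and "half" is approximated by donating every second eligible node so the donor never gives away more than it keeps.

```python
def split_stack(stack, cutoff_depth):
    """Split a stack of partial tours into (kept, donated).

    Nodes deeper than cutoff_depth are never given away. Among the eligible
    nodes, every second one (counting from the bottom) is donated.
    """
    kept, donated = [], []
    eligible_seen = 0
    for tour in stack:                      # stack[0] is the bottom
        if len(tour) <= cutoff_depth:
            if eligible_seen % 2 == 1:
                donated.append(tour)
            else:
                kept.append(tour)
            eligible_seen += 1
        else:
            kept.append(tour)               # too small to be worth sending
    return kept, donated

stack = [[0, 1], [0, 2], [0, 3], [0, 3, 1], [0, 3, 2]]
kept, donated = split_stack(stack, cutoff_depth=2)
```

Alternating over the eligible nodes spreads both shallow and deeper subtrees across donor and recipient, which is the reason this strategy copes with irregular search spaces.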
Travelling Salesman Parallel
Travelling Salesman Parallel
• The entire search space is assigned to the master.
• When a slave runs out of work, it gets more work from another slave using work requests and responses.
• Unexplored states can be conveniently stored as local stacks at the processors.
• Slaves terminate when the final state is reached.
Travelling Salesman Parallel
• Load balancing scheme: Random polling (RP)
• When a processor becomes idle, it randomly selects a donor.
• Each processor is selected as a donor with equal probability, ensuring that work requests are evenly distributed.
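Donor selection under random polling can be sketched as a uniform draw over the other processors. Excluding the idle processor itself is an assumption made here for practicality; the function name and the `rng` parameter are illustrative.

```python
import random

def choose_donor(my_rank, num_procs, rng=random):
    """Pick a donor uniformly at random among all processors except my_rank."""
    if num_procs < 2:
        raise ValueError("random polling needs at least two processors")
    donor = rng.randrange(num_procs - 1)        # uniform over num_procs - 1 slots
    return donor if donor < my_rank else donor + 1   # skip over my_rank

rng = random.Random(0)
picks = [choose_donor(2, 4, rng) for _ in range(200)]
```

Each of the other processors is chosen with probability 1/(num_procs - 1), so no single donor is systematically overloaded with requests.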
Travelling Salesman Parallel
• Let W be the serial work and pW_p be the parallel work.
• The search overhead factor s is defined as pW_p / W.
• Quantify the total overhead T_o in terms of W to compute scalability:
  T_o = pW_p − W
• The upper bound on speedup is p × 1/s.
Travelling Salesman Parallel
Assumptions:
• The search overhead factor is one.
• Work at any processor can be partitioned into independent pieces as long as its size exceeds a threshold ε.
• A reasonable work-splitting mechanism is available:
  - If work w at a processor is split into two parts ψw and (1−ψ)w, there exists an arbitrarily small constant α (0 < α ≤ 0.5) such that ψw > αw and (1−ψ)w > αw.
  - The constant α sets a lower bound on the load imbalance from work splitting.
Travelling Salesman Parallel
• If processor P_i initially had work w_i, then after a single request by processor P_j and a split, neither P_i nor P_j has more than (1−α)w_i work.
• For each load balancing strategy, we define V(p) as the total number of work requests after which each processor has received at least one work request (note that V(p) ≥ p).
• Assume that the largest piece of work at any point is W.
• After V(p) requests, the maximum work remaining at any processor is less than (1−α)W; after 2V(p) requests, it is less than (1−α)^2 W; …
• After (log_{1/(1−α)} (W/ε)) × V(p) requests, the maximum work remaining at any processor is below the threshold value ε.
• The total number of work requests is O(V(p) log W).
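The bound on the number of requests can be illustrated numerically. The sketch below counts how many rounds of V(p) requests are needed until the largest piece, shrinking by a factor (1−α) per round, is at most ε. The function names and the example values are illustrative only.

```python
def rounds_until_threshold(W, eps, alpha):
    """Number of rounds r after which (1 - alpha)**r * W is at most eps."""
    rounds, work = 0, float(W)
    while work > eps:
        work *= (1.0 - alpha)   # worst case: largest piece shrinks by (1 - alpha)
        rounds += 1
    return rounds

def request_bound(W, eps, alpha, V):
    """Upper bound on total work requests: rounds times V(p)."""
    return rounds_until_threshold(W, eps, alpha) * V

even_split = rounds_until_threshold(1024, 1, 0.5)    # perfect halving
uneven_split = rounds_until_threshold(1000, 1, 0.25)
```

With perfect halving the count is log_2(W/ε); a smaller α only changes the base of the logarithm, which is why the total stays O(V(p) log W).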
Travelling Salesman Parallel
• If t_comm is the time required to communicate a piece of work, then the communication overhead T_o is
  T_o = t_comm × V(p) × log W
• The corresponding efficiency E is given by:
  E = 1 / (1 + T_o/W) = 1 / (1 + (t_comm × V(p) × log W) / W)
Travelling Salesman Parallel
• Random Polling
  - In the worst case, V(p) is unbounded.
  - We do an average-case analysis.
• Let F(i,p) represent a state in which i of the processors have been requested, and p−i have not.
• Let f(i,p) denote the average number of trials needed to change from state F(i,p) to F(p,p) (V(p) = f(0,p)).
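Travelling Salesman Parallel
• We have
  f(p,p) = 0
  f(i,p) = (i/p)(1 + f(i,p)) + ((p−i)/p)(1 + f(i+1,p)), so f(i,p) = p/(p−i) + f(i+1,p)
  V(p) = f(0,p) = p × Σ_{i=1}^{p} 1/i = p × H_p, where H_p is the p-th harmonic number
• As p becomes large, H_p ≈ ln p + 0.577. Thus, V(p) = O(p log p).
• T_o = O(p log p log W)
• Therefore W = O(p log^2 p).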
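The average-case value V(p) = p × H_p can be compared with a small simulation. This is a sketch under the same model as the analysis: each request lands on one of the p processors uniformly at random. The function names, trial count and seed are illustrative choices.

```python
import random
from fractions import Fraction

def expected_requests(p):
    """V(p) = p * H_p, the expected requests until every processor is polled."""
    return float(p * sum(Fraction(1, i) for i in range(1, p + 1)))

def simulate_requests(p, trials, seed=0):
    """Average number of uniform random requests until all p have been hit."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, requests = set(), 0
        while len(seen) < p:
            seen.add(rng.randrange(p))
            requests += 1
        total += requests
    return total / trials

exact = expected_requests(8)
observed = simulate_requests(8, trials=5000)
```

The simulated average should sit close to `exact`, and both grow like p log p as p increases.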
END OF PRESENTATION THANK YOU !