Figure 11.1 An 8-puzzle problem instance: (a) initial configuration; (b) final configuration; and (c) a sequence of moves leading from the initial to the final configuration. [Figure: tile grids; legend marks the last tile moved and the blank tile; moves in (c) are labeled up, down, and left.]
Figure 11.2 The graph corresponding to the 0/1 integer-linear-programming problem. [Figure: binary tree branching on x1 = 0/1 through x4 = 0/1; legend: non-terminal node, terminal node (non-goal), terminal node (goal); the leaves shown are labeled f(x) = 0 and f(x) = 2.]
Figure 11.3 Two examples of unfolding a graph into a tree. [Figure: panels (a) and (b), each showing a graph and its unfolded tree.]
Figure 11.4 States resulting from the first three steps of depth-first search applied to an instance of the 8-puzzle. [Figure: states A through H generated in Steps 1 to 3; moves are labeled up, down, and right; legend marks the blank tile and the last tile moved.]
Figure 11.5 Representing a DFS tree: (a) the DFS tree; successor nodes shown with dashed lines have already been explored; (b) the stack storing untried alternatives only; and (c) the stack storing untried alternatives along with their parent. The shaded blocks represent the parent state and the block to the right represents successor states that have not been explored. [Figure: nodes numbered 1 to 24; labels: bottom of the stack, top of the stack, current state.]
Figure 11.6 Applying best-first search to the 8-puzzle: (a) initial configuration; (b) final configuration; and (c) states resulting from the first four steps of best-first search. Each state is labeled with its h-value (that is, the Manhattan distance from the state to the final state). [Figure: tile grids for Steps 1 to 4; legend marks the blank tile and the last tile moved.]
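The h-value named in the caption can be computed as in the following minimal sketch. The function name `manhattan`, the tuple-of-rows state encoding (with 0 standing for the blank), and the example state `EXAMPLE` are assumptions made for illustration; they are not taken from the figure itself.

```python
def manhattan(state, goal):
    """Sum over all tiles of |row - goal_row| + |col - goal_col|.

    The blank (encoded as 0, an assumption of this sketch) is ignored.
    """
    n = len(state)
    # Map each tile value to its position in the goal configuration.
    goal_pos = {goal[r][c]: (r, c) for r in range(n) for c in range(n)}
    total = 0
    for r in range(n):
        for c in range(n):
            tile = state[r][c]
            if tile != 0:
                gr, gc = goal_pos[tile]
                total += abs(r - gr) + abs(c - gc)
    return total


# Goal configuration: tiles in row-major order, blank in the last cell.
GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))

# Hypothetical example state, chosen for illustration only.
EXAMPLE = ((7, 2, 3), (4, 6, 5), (1, 8, 0))

if __name__ == "__main__":
    print(manhattan(EXAMPLE, GOAL))
```

Best-first search would expand the open state with the smallest such value first; the heuristic is zero exactly when the state equals the goal.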
Figure 11.7 The unstructured nature of tree search and the imbalance resulting from static partitioning. [Figure: panels (a) and (b); subtrees labeled A through F.]
Figure 11.8 A generic scheme for dynamic load balancing. [Figure: state diagram. Processor active: do a fixed amount of work; service any pending messages. Processor idle: select a processor and request work from it; service any pending messages. Transitions: finished available work; got work; issued a request; got a reject.]
Figure 11.9 Splitting the DFS tree in Figure 11.5. The two subtrees along with their stack representations are shown in (a) and (b). [Figure: labels: cutoff depth, current state.]
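Splitting a DFS stack between a donor and a requesting processor can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme drawn in the figure: the stack is modeled as a list of levels, each level holding the untried alternatives at that depth; the donor gives away the later half (rounded down) of each level above a cutoff depth and keeps everything at or below it. The name `split_stack` and the rounding rule are hypothetical.

```python
def split_stack(stack, cutoff_depth):
    """Split a DFS stack of untried alternatives into donor and recipient parts.

    stack[d] is the list of untried nodes at depth d (assumed encoding).
    Levels at depth >= cutoff_depth are never given away, so that very
    small pieces of work are not shipped to another processor.
    """
    donor, recipient = [], []
    for depth, untried in enumerate(stack):
        # Give away half of the alternatives (rounded down) above the cutoff.
        give = len(untried) // 2 if depth < cutoff_depth else 0
        keep = len(untried) - give
        donor.append(untried[:keep])
        recipient.append(untried[keep:])
    return donor, recipient


if __name__ == "__main__":
    # Hypothetical stack with four levels of untried alternatives.
    example = [["a", "b"], ["c"], ["d", "e", "f"], ["g", "h"]]
    kept, given = split_stack(example, cutoff_depth=3)
    print(kept)
    print(given)
```

Both resulting stacks keep one entry per depth, so each processor can continue depth-first search on its own part without renumbering levels.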
Figure 11.10 Tree-based termination detection. Steps 1–6 illustrate the weights at various processors after each work transfer. [Figure: Step 1: w0 = 0.5, w1 = 0.5. Step 2: w0 = 0.5, w1 = 0.25, w2 = 0.25. Step 3: w0 = 0.25, w1 = 0.25, w2 = 0.25, w3 = 0.25. Step 4: w0 = 0.25, w1 = 0.5, w3 = 0.25. Step 5: w0 = 0.5, w1 = 0.5. Step 6: w0 = 1.0.]
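The weight bookkeeping behind tree-based termination detection can be simulated with the minimal sketch below. It assumes that all weight starts at processor 0, that a donor hands half of its weight to the processor receiving work, and that a processor returns its entire weight to the processor it got work from when it finishes. The class name `WeightTracker`, its methods, and the transfer sequence are hypothetical and chosen for illustration.

```python
from fractions import Fraction


class WeightTracker:
    """Tracks per-processor weights; exact fractions avoid rounding error."""

    def __init__(self, num_procs):
        self.w = [Fraction(0)] * num_procs
        self.w[0] = Fraction(1)  # all weight starts at processor 0

    def send_work(self, donor, recipient):
        # The donor splits its weight in half and passes one half along.
        half = self.w[donor] / 2
        self.w[donor] -= half
        self.w[recipient] += half

    def finish(self, proc, parent):
        # A processor that runs out of work returns its weight to its parent.
        self.w[parent] += self.w[proc]
        self.w[proc] = Fraction(0)

    def all_weight_returned(self):
        # Processor 0 may signal termination once its weight is back to 1
        # and it has finished its own work (the latter is not modeled here).
        return self.w[0] == 1


if __name__ == "__main__":
    t = WeightTracker(4)
    t.send_work(0, 1)
    t.send_work(1, 2)
    t.send_work(0, 3)
    t.finish(2, 1)
    t.finish(3, 0)
    t.finish(1, 0)
    print(t.w, t.all_weight_returned())
```

The weights always sum to 1, so processor 0 holding weight 1 implies that no other processor holds any work. In a real implementation repeated halving would underflow floating-point weights, which is one reason this sketch uses exact fractions.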
Figure 11.11 Speedups of parallel DFS using ARR, GRR and RP load-balancing schemes. [Plot: speedup (0 to 700) versus p (0 to 1200); curves: ARR, GRR, RP.]
Figure 11.12 Number of work requests generated for RP and GRR and their expected values (O(p log^2 p) and O(p log p) respectively). [Plot: number of work requests (0 to 900000) versus p (0 to 1200); curves: GRR, Expected (GRR), RP, Expected (RP).]
Figure 11.13 Experimental isoefficiency curves for RP for different efficiencies. [Plot: W (0 to 2.5e+07) versus p log^2 p (0 to 120000); curves: E = 0.64, E = 0.74, E = 0.85, E = 0.90.]
Figure 11.14 Three different triggering mechanisms: (a) a high triggering frequency leads to high load-balancing cost, (b) the optimal frequency yields good performance, and (c) a low frequency leads to high idle times. [Plots: number of busy processors versus time in each of panels (a), (b), and (c).]
Figure 11.15 Mapping idle and busy processors with the use of a global pointer. [Figure: a row of idle processors and a row of busy processors, each numbered, with the global pointer position marked.]
Figure 11.16 A general schematic for parallel best-first search using a centralized strategy. The locking operation is used here to serialize queue access by various processors. [Figure: a global list maintained at a designated processor, with arrows "put expanded nodes" and "get current best node". Each of the processors P0, P1, ..., Pp−1 repeats: lock the list; place generated nodes in the list; pick the best node from the list; unlock the list; expand the node to generate successors.]
Figure 11.17 A message-passing implementation of parallel best-first search using the ring communication strategy. [Figure: processors P0, P1, ..., Pp−1, each with a local list, exchange best nodes with their ring neighbors.]
Figure 11.18 An implementation of parallel best-first search using the blackboard communication strategy. [Figure: processors P0, P1, ..., Pp−1, each with a local list, exchange best nodes through a shared blackboard.]
Figure 11.19 The difference in number of nodes searched by sequential and parallel formulations of DFS. For this example, parallel DFS reaches a goal node after searching fewer nodes than sequential DFS. [Figure: (a) total number of nodes generated by sequential formulation = 13; (b) total number of nodes generated by two-processor formulation of DFS = 9. Each panel marks start node S and goal node G.]
Figure 11.20 A parallel DFS formulation that searches more nodes than its sequential counterpart. [Figure: (a) total number of nodes generated by sequential DFS = 7; (b) total number of nodes generated by two-processor formulation of DFS = 12. Each panel marks start node S and goal node G.]
Figure 11.21 Message combining and a sample implementation on an eight-processor hypercube. [Figure: processors labeled 000 through 111; initially, target = x; after increment, target = x+5.]
Figure 11.22 The single-level work-distribution scheme for tree search. [Figure: a subtask generator (manager) produces subtasks from the search tree and hands them to processors in response to work requests.]