Optimal Control and Dynamic Programming 4SC000 Q2 2017-2018 Duarte Antunes
Outline
• Shortest paths in graphs
• Dynamic programming
• Dijkstra’s and A* algorithms
• Certainty equivalent control
Graph
Weighted graph
• Nodes: $V := \{1, \dots, n\}$
• Edges: $E := \{(i_1, j_1), \dots, (i_r, j_r) \mid i_1, \dots, i_r, j_1, \dots, j_r \in V\}$
• Weights: $w_{ij} \geq 0$ for $(i, j) \in E$
• Undirected if $w_{ij} = w_{ji}$
Example: $E = \{(3, 6), (2, 3), \dots\}$, $w_{36} = 7$, $w_{23} = 5$, ...
[Figure: the example graph on nodes 1 to 6, drawn once as undirected ($w_{ij} = w_{ji}$) and once as directed]
Applications
Graphs model networks (road, social, transportation, etc.) and can be found in numerous applications.
Shortest path problem
Find a path from an initial node to a destination node in a weighted graph, with minimum length (sum of the weights of its edges).
[Figure: example graph with initial node 1 and final node 6; minimum length 11]
Can we use the DP algorithm to find the shortest path?
Discussion
• Computing an optimal path in a transition diagram can be seen as computing the shortest path from the nodes at stage $0$ to the node at stage $h + 1$ of the following weighted graph:
[Figure: transition diagram with stages $0, 1, \dots, h - 1, h$ and the transition costs on its edges, extended by an artificial stage $h + 1$ that contains a single artificial node]
• For graphs with this structure we already know how to use DP to compute shortest paths.
• Adjustments are needed for general graphs (e.g. cycles may occur), but DP can still be used to provide the shortest path, as we show next.
Dynamic programming formulation
Given a weighted graph, construct a transition diagram:
• $h = n - 1$ stages, $n$ states at decision stages and only the destination at the terminal stage.
• Make $c^k_{ij} = w_{ij}$, with $w_{ij} = \infty$ if there is no link from $i$ to $j$, and $w_{ii} = 0$.
[Figure: a four-node example graph with its initial and destination nodes, and the corresponding transition diagram]
Dynamic programming solution
Apply the DP algorithm to this transition diagram.
• Costs-to-go at stage $k$ are the costs of the shortest paths with $n - 1 - k$ hops. In particular, costs-to-go at the initial stage are the optimal costs for each initial condition.
• To find an optimal path, follow the policy for a given initial state.
• The cost-to-go at stage $0$ of a given state is infinite if there is no path from that initial state to the destination.
[Figure: costs-to-go for each state $x_k$ at stages $k = 0, 1, 2, 3$ of the example]
The implementation can be made more efficient and one does not need to first construct the transition diagram. Moreover, one can stop when the costs-to-go remain unchanged.
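The recursion described above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the names `dp_shortest_paths` and `follow_policy` and the dictionary-based graph representation are assumptions made for this sketch. It works directly on the edge weights (no transition diagram is built) and stops as soon as the costs-to-go no longer change.

```python
import math


def dp_shortest_paths(n, weights, dest):
    """Costs-to-go and a policy for reaching `dest` on nodes 1..n.

    `weights` maps a directed edge (i, j) to w_ij >= 0. Missing edges cost
    infinity and staying put (i == j) costs 0, as in the DP formulation.
    """
    nodes = range(1, n + 1)

    def w(i, j):
        return 0.0 if i == j else weights.get((i, j), math.inf)

    # Last decision stage: the single remaining hop must reach dest.
    J = {i: w(i, dest) for i in nodes}
    policy = {i: dest for i in nodes if i != dest and J[i] < math.inf}

    for _ in range(n - 2):  # remaining decision stages, backwards in time
        J_new = {i: min(w(i, j) + J[j] for j in nodes) for i in nodes}
        for i in nodes:
            if J_new[i] < J[i]:  # strictly better: remember the minimizing hop
                policy[i] = min(nodes, key=lambda j: w(i, j) + J[j])
        if J_new == J:  # costs-to-go unchanged: stop early
            break
        J = J_new
    return J, policy


def follow_policy(policy, start, dest):
    """Path obtained by following the policy; None if dest is unreachable."""
    path = [start]
    while path[-1] != dest:
        if path[-1] not in policy:
            return None
        path.append(policy[path[-1]])
    return path
```

As a usage example, take the six-node undirected example graph with edges (1,2,1), (1,3,8), (1,4,6), (2,3,5), (2,4,3), (4,5,3), (3,6,7), (5,6,4). This edge list is inferred from the figures and is therefore an assumption. With both directions stored in `weights`, `dp_shortest_paths(6, weights, 6)` gives a cost-to-go of 11 for node 1, and `follow_policy(policy, 1, 6)` returns `[1, 2, 4, 5, 6]`.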
Example
Another example for an undirected graph
[Figure: six-node undirected example graph and the costs-to-go for each state $x_k$ at stages $k = 0, \dots, 5$]
Shortest paths in road networks
What is the shortest distance from Bucharest to Lugoj?
[Figure: road map of Romania showing the cities Oradea, Neamt, Zerind, Arad, Iasi, Sibiu, Fagaras, Rimnicu Vilcea, Vaslui, Timisoara, Lugoj, Pitesti, Mehadia, Hirsova, Urziceni, Dobreta, Bucharest, Craiova, Eforie and Giurgiu, with road distances in km]
Answer: 504 km (route: Bucharest, Pitesti, Craiova, Dobreta, Mehadia, Lugoj)
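The quoted route length can be checked by adding up the leg distances. The per-leg values below are taken from the commonly used version of this Romania road map; the flattened figure does not show which distance belongs to which road, so treat the assignment as an assumption of this sketch.

```python
# Leg distances in km along the quoted route (assumed edge assignment).
legs_km = [
    ("Bucharest", "Pitesti", 101),
    ("Pitesti", "Craiova", 138),
    ("Craiova", "Dobreta", 120),
    ("Dobreta", "Mehadia", 75),
    ("Mehadia", "Lugoj", 70),
]

# Path length = sum of the weights of its edges.
total_km = sum(distance for _, _, distance in legs_km)
print(total_km)
```

The sum is 101 + 138 + 120 + 75 + 70 = 504 km.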
Robot path planning
What is the shortest path for a robot to go from point A to B?
[Figure: grid map with obstacles and the points A and B]
Assumptions
• It takes $1$ distance unit to move horizontally or vertically between adjacent nodes and $\sqrt{2}$ units to move diagonally.
• Distances to obstacle nodes are infinite.
• The distance between two diagonally adjacent nodes that are both adjacent to the same obstacle node is infinite.
[Figure: grid cells annotated with the edge weights $1$, $\sqrt{2}$ and $\infty$ around an obstacle]
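The three assumptions above translate into a small edge-weight function. The sketch below is illustrative only: the function name `move_cost` and the representation of the map as a list of rows of booleans (`True` marks an obstacle) are assumptions, and both cells are assumed to lie inside the grid.

```python
import math


def move_cost(grid, a, b):
    """Edge weight between two adjacent cells a = (row, col) and b = (row, col).

    grid[row][col] is True for an obstacle cell (assumed representation).
    """
    (r1, c1), (r2, c2) = a, b
    dr, dc = abs(r1 - r2), abs(c1 - c2)
    if max(dr, dc) != 1:
        raise ValueError("cells are not adjacent")
    # Distances to (and from) obstacle nodes are infinite.
    if grid[r1][c1] or grid[r2][c2]:
        return math.inf
    # Horizontal or vertical move.
    if dr + dc == 1:
        return 1.0
    # Diagonal move: the two cells adjacent to both a and b must be free,
    # otherwise the move would cut the corner of an obstacle.
    if grid[r1][c2] or grid[r2][c1]:
        return math.inf
    return math.sqrt(2)
```

With this function the grid never has to be stored as an explicit edge list; a shortest path routine can query the weight of a move when it needs it.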
Robot path planning
What is the shortest path for a robot to go from point A to B?
[Figure: the points A and B on the obstacle map]
Robot path planning
Simpler example to show the costs-to-go
[Figure: two panels of a smaller grid map, with the cost-to-go printed in each free cell (0.00 at the destination)]
Side remark: the cost-to-go can be viewed as a Lyapunov function, and the policy can be obtained by following the direction of maximum decrease of this function.
Time-varying graphs
How to design a shortest path from A to B when the obstacles are moving?
[Figure: obstacle maps at $t = 0$, $t = 1$, $t = 2$, ..., $t = T$, with the initial and final positions marked]
1. Consider the set of static graphs for each time step $t = 0, 1, 2, \dots, T$.
2. Build a time-invariant graph in 3D.
[Figure: example of edges between the layers $t = 0$, $t = 1$ and $t = 2$, with weights $1$, $\sqrt{2}$ and $\infty$]
3. Compute the shortest path for the 3D graph.
• Initial node: initial node at time $t = 0$
• Final node: final node at time $t = T$
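One way to build the time-invariant graph of step 2 is sketched below. Several choices here are assumptions and are not stated on the slides: the function name `time_expand`, the convention that `graphs[t]` holds the weights of moves made between time `t` and time `t + 1`, and the rule that waiting in place is allowed only when `graphs[t]` lists an explicit `(u, u)` entry.

```python
def time_expand(graphs):
    """Stack static graphs into one graph whose nodes are (time, node) pairs.

    graphs[t] maps an edge (u, v) to its weight for a move made between
    time t and time t + 1 (assumed convention).
    """
    expanded = {}
    for t, graph in enumerate(graphs):
        for (u, v), weight in graph.items():
            # Every move advances time by exactly one step.
            expanded[((t, u), (t + 1, v))] = weight
    return expanded
```

Because every edge goes from layer $t$ to layer $t + 1$, the expanded graph has no cycles, and any shortest path method can then be run from the initial node at time $0$ to the final node at time $T$ (here `T = len(graphs)`).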
Discussion
DP can be quite inefficient when computing an optimal path is enough.
[Figure: graph with nodes $1, 2, 3, 4, 5, \dots, n - 1, n$; node 1 is the initial node and node 3 the destination; the edges from 1 to 2 and from 2 to 3 have weight 1, and all other edges have weight $> 2$]
• Figure example: DP searches the full space, which is not necessary to compute the optimal path.
• For shortest path problems in graphs, there are many alternative algorithms. We describe next Dijkstra’s and the A* algorithms.
Dijkstra’s algorithm
Main ideas
• Iteratively generate shorter paths from the origin to every node.
• Update a list of nodes (wavefront) that can be explored next.
• New nodes are added to the wavefront based on the cost: neighbors of the node with the smallest distance to the origin.
[Figure: illustration of the expanding wavefront; source: Wikipedia]
Dijkstra’s algorithm
Initialization
• $\text{OPEN} = \{p\}$, $d_p = 0$, and $d_i = \infty$ for $i \in V - \{p\}$, where $p$ is the initial node and $t$ is the final node.
Steps
1. Remove a node $i$ from OPEN with the minimum estimate $d_i$. If $i = t$ stop, otherwise execute step 2 for every node $j$ for which there is an edge (arrow) from $i$ to $j$.
2. If $d_i + w_{ij} < d_j$: set $d_j = d_i + w_{ij}$, set $\beta(j) = i$, and place $j$ in OPEN if it is not there already. Otherwise do not update $\beta(j)$, $d_j$.
3. After executing step 2 for all the nodes $j$ corresponding to out-neighbors of $i$, go to step 1.
Optimal path
• To keep track of the shortest paths it suffices to save, for every node $i$, the next node $\beta(i)$ along the optimal path (discovered so far) leading to the initial node.
• The optimal path is then given by $(i_0, i_1, \dots, i_L)$ for $i_L = t$, $i_{L-1} = \beta(t)$, ..., $i_0 = \beta(i_1)$, or equivalently $i_{\ell - 1} = \beta(i_\ell)$, $\ell \in \{1, 2, \dots, L\}$, where $L$ is such that $i_0 = p$.
• If OPEN is empty at a given step of the algorithm, then there is no path to the destination.
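The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `dijkstra` and the adjacency-list input format are choices made for this sketch, and OPEN is kept as a plain set that is scanned for its minimum, which mirrors the description but is slower than a priority queue (for example one built with `heapq`) on large graphs.

```python
import math


def dijkstra(out_edges, p, t):
    """Shortest path from p to t.

    out_edges maps a node i to a list of (j, w_ij) pairs with w_ij >= 0
    (assumed input format). Returns (distance, path), or (inf, None) if
    there is no path.
    """
    d = {p: 0.0}       # distance estimates; absent nodes count as infinity
    beta = {}          # beta[j]: node preceding j on the best path found so far
    open_set = {p}     # OPEN

    while open_set:
        # Step 1: remove the node with the minimum estimate.
        i = min(open_set, key=lambda v: d[v])
        open_set.remove(i)
        if i == t:
            # Walk the beta pointers back from t to p, then reverse.
            path = [t]
            while path[-1] != p:
                path.append(beta[path[-1]])
            path.reverse()
            return d[t], path
        # Step 2, for every out-neighbor j of i.
        for j, w_ij in out_edges.get(i, []):
            if d[i] + w_ij < d.get(j, math.inf):
                d[j] = d[i] + w_ij
                beta[j] = i
                open_set.add(j)  # no effect if j is already in OPEN

    # OPEN became empty before t was removed: no path to the destination.
    return math.inf, None
```

As a usage example, on the six-node undirected graph with edges (1,2,1), (1,3,8), (1,4,6), (2,3,5), (2,4,3), (4,5,3), (3,6,7), (5,6,4) (an edge list inferred from the figures, so an assumption), `dijkstra(out_edges, 1, 6)` returns a distance of 11 and the path `[1, 2, 4, 5, 6]`.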
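Example I
Dijkstra’s algorithm requires only three iterations for this example.
[Figure: graph with nodes $1, 2, 3, 4, 5, \dots, n - 1, n$; node 1 is the initial node and node 3 the destination; the edges from 1 to 2 and from 2 to 3 have weight 1, and all other edges have weight $> 2$]
Pairs $(i, d_i)$, $i \in \text{OPEN}$, at each iteration:
• Iteration 0: $(1, 0)$
• Iteration 1: $(2, 1)$ + other pairs pertaining to other neighbors of node 1; $\beta(2) = 1$
• Iteration 2: $(3, 2)$ + other pairs pertaining to other neighbors of nodes 1 and 2; $\beta(3) = 2$
Destination/final node 3 is removed from OPEN: terminate.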
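Example II
[Figure: six-node example graph with initial node 1 and final node 6]
Pairs $(i, d_i)$, $i \in \text{OPEN}$, at each iteration:
• Iteration 0: $(1, 0)$
• Iteration 1: $(2, 1)$, $(3, 8)$, $(4, 6)$; $\beta(2) = 1$, $\beta(3) = 1$, $\beta(4) = 1$
• Iteration 2: $(3, 6)$, $(4, 4)$; $\beta(4) = 2$, $\beta(3) = 2$
• Iteration 3: $(3, 6)$, $(5, 7)$; $\beta(5) = 4$
• Iteration 4: $(5, 7)$, $(6, 13)$; $\beta(6) = 3$
• Iteration 5: $(6, 11)$; $\beta(6) = 5$
Optimal path (from end to start): $(6, \beta(6), \beta(\beta(6)), \dots, 1) = (6, 5, 4, 2, 1)$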
Shortest paths in road networks
What is the shortest distance from Bucharest to Lugoj?
[Figure: road map of Romania with cities and road distances in km]