  1. Nonlinear Optimization for Optimal Control
     Pieter Abbeel, UC Berkeley EECS
     Many slides and figures adapted from Stephen Boyd.
     [optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9-11
     [optional] Betts, Practical Methods for Optimal Control Using Nonlinear Programming

  2. Bellman's Curse of Dimensionality
     - For an n-dimensional state space, the number of states grows exponentially in n (assuming some fixed number of discretization levels per coordinate).
     - In practice, discretization is considered computationally feasible only up to 5- or 6-dimensional state spaces, even when using variable resolution discretization and highly optimized implementations.

  3. This Lecture: Nonlinear Optimization for Optimal Control
     - Goal: find a sequence of control inputs (and corresponding sequence of states) that solves the optimal control problem.
     - Generally hard to do. We will cover methods that allow us to find a local minimum of this optimization problem.
     - Note: iteratively applying LQR is one way to solve this problem if there were no constraints on the control inputs and state.
     - In principle (though not in our examples), u could be the parameters of a control policy rather than the raw control inputs.

  4. Outline
     - Unconstrained minimization
       - Gradient descent
       - Newton's method
     - Equality constrained minimization
     - Inequality and equality constrained minimization

  5. Unconstrained Minimization
     - Problem: minimize f(x) over x.                          (1)
     - If x* satisfies:  grad f(x*) = 0                        (2)
                         Hessian of f at x* positive definite  (3)
       then x* is a local minimum of f.
     - In simple cases we can directly solve the system of equations given by (2) to find candidate local minima, and then verify (3) for these candidates.
     - In general, however, solving (2) is a difficult problem. Going forward we will consider this more general setting and cover numerical solution methods for (1).

  6. Steepest Descent
     - Idea: start somewhere; then repeat: take a small step in the steepest descent direction.
     (Figure source: Mathworks)

  7. Steepest Descent
     - Another example, visualized with contours.
     (Figure source: yihui.name)

  8. Steepest Descent Algorithm
     1. Initialize x.
     2. Repeat:
        1. Determine the steepest descent direction Δx.
        2. Line search: choose a step size t > 0.
        3. Update: x := x + t Δx.
     3. Until stopping criterion is satisfied.
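The loop above can be sketched in a few lines of NumPy; this is a minimal illustration with a fixed step size (the slides use a line search, covered next), with the stopping criterion taken to be a small gradient norm:

```python
import numpy as np

def gradient_descent(f, grad, x0, t=0.1, tol=1e-8, max_iters=10000):
    """Minimize f by repeatedly stepping in the steepest descent
    direction -grad(x), with a fixed step size t."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stopping criterion: small gradient
            break
        x = x - t * g                 # step along the negative gradient
    return x

# Minimize f(x, y) = (x - 1)^2 + 2 y^2; the minimum is at (1, 0).
f = lambda x: (x[0] - 1)**2 + 2 * x[1]**2
grad = lambda x: np.array([2 * (x[0] - 1), 4 * x[1]])
x_star = gradient_descent(f, grad, [5.0, 3.0])
```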

  9. What Is the Steepest Descent Direction?
     - In the Euclidean norm, the steepest descent direction is the negative gradient: Δx = -grad f(x).
     → Steepest descent = gradient descent.

  10. Stepsize Selection: Exact Line Search
     - Used when the cost of solving the one-variable minimization problem (over the step size t) is low compared to the cost of computing the search direction itself.

  11. Stepsize Selection: Backtracking Line Search
     - Inexact: the step length is chosen to approximately minimize f along the ray {x + t Δx | t ≥ 0}.
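A minimal sketch of backtracking line search, using the standard sufficient-decrease (Armijo) condition with parameters alpha and beta as in Boyd and Vandenberghe:

```python
import numpy as np

def backtracking_line_search(f, grad, x, dx, alpha=0.3, beta=0.8):
    """Shrink t until the sufficient-decrease condition
    f(x + t*dx) <= f(x) + alpha * t * grad(x)^T dx holds.
    Requires dx to be a descent direction (grad(x)^T dx < 0)."""
    t = 1.0
    fx, g = f(x), grad(x)
    while f(x + t * dx) > fx + alpha * t * (g @ dx):
        t *= beta
    return t

f = lambda x: x[0]**2 + 10 * x[1]**2
grad = lambda x: np.array([2 * x[0], 20 * x[1]])
x = np.array([1.0, 1.0])
dx = -grad(x)                       # steepest descent direction
t = backtracking_line_search(f, grad, x, dx)
```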

  12. Stepsize Selection: Backtracking Line Search Figure source: Boyd and Vandenberghe

  13. Steepest Descent = Gradient Descent Figure source: Boyd and Vandenberghe

  14. Gradient Descent: Example 1 Figure source: Boyd and Vandenberghe

  15. Gradient Descent: Example 2 Figure source: Boyd and Vandenberghe

  16. Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

  17. Gradient Descent Convergence
     - For a quadratic function, convergence speed depends on the ratio of the highest second derivative to the lowest second derivative (the "condition number").
     - In high dimensions, you are almost guaranteed to have a high (= bad) condition number.
     - Rescaling coordinates (as could happen by simply expressing quantities in different measurement units) results in a different condition number.
     (Figures: condition number = 10 vs. condition number = 1.)
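The dependence on the condition number can be checked numerically; a small sketch on the quadratic f(x) = (x1^2 + κ x2^2)/2, whose condition number is κ, counting iterations to reach a small gradient:

```python
import numpy as np

def gd_iters(kappa, tol=1e-6, max_iters=100000):
    """Iterations of gradient descent on f(x) = 0.5*(x1^2 + kappa*x2^2),
    using the best fixed step size 2/(1 + kappa) for this quadratic.
    The per-step contraction factor is (kappa-1)/(kappa+1), so larger
    condition numbers mean slower convergence."""
    x = np.array([1.0, 1.0])
    t = 2.0 / (1.0 + kappa)
    for i in range(max_iters):
        g = np.array([x[0], kappa * x[1]])   # gradient of f
        if np.linalg.norm(g) < tol:
            return i
        x = x - t * g
    return max_iters

iters = [gd_iters(k) for k in (1.0, 10.0, 100.0)]
```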

  18. Outline
     - Unconstrained minimization
       - Gradient descent
       - Newton's method
     - Equality constrained minimization
     - Inequality and equality constrained minimization

  19. Newton's Method (assume f convex for now)
     - Use the 2nd-order Taylor approximation rather than the 1st-order one.
     - Assuming the Hessian is positive definite, the minimum of the 2nd-order approximation is achieved at the Newton step Δx = -(Hessian)^{-1} grad f(x).
     (Figure source: Boyd and Vandenberghe)
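Newton's method can be sketched as follows; each iteration jumps to the minimizer of the current 2nd-order approximation (for a convex quadratic this lands on the exact minimum in one step):

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iters=50):
    """Newton's method for convex f: repeatedly solve
    hess(x) dx = -grad(x) and step to x + dx (unit step;
    in practice one adds a line search)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        dx = np.linalg.solve(hess(x), -g)   # Newton step
        x = x + dx
    return x

# Convex quadratic f(x) = 0.5 x^T Q x - b^T x; minimizer solves Q x = b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
grad = lambda x: Q @ x - b
hess = lambda x: Q
x_star = newton_minimize(grad, hess, np.array([10.0, -10.0]))
```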

  20. Newton’s Method Figure source: Boyd and Vandenberghe

  21. Affine Invariance
     - Consider the coordinate transformation y = A^{-1} x (i.e., x = A y).
     - If running Newton's method on f(x) starting from x(0) results in x(0), x(1), x(2), ...
     - then running Newton's method on g(y) = f(Ay) starting from y(0) = A^{-1} x(0) will result in the sequence y(0) = A^{-1} x(0), y(1) = A^{-1} x(1), y(2) = A^{-1} x(2), ...
     - Exercise: try to prove this!

  22. Affine Invariance --- Proof

  23. Newton's Method When f Is Not Convex (i.e., the Hessian is not positive definite)
     - Example 1: the 2nd-order approximation ends up at a maximum rather than a minimum!
     - Example 2: the 2nd-order approximation ends up at an inflection point rather than a minimum!

  24. Newton's Method When f Is Not Convex (i.e., the Hessian is not positive definite)
     - Issue: the Newton step Δx_nt no longer leads to the local minimum of the quadratic approximation --- it simply leads to the point where the gradient of the quadratic approximation is zero, which could be a maximum or a saddle point.
     - Possible fixes. Let H = V Λ V^T be the eigenvalue decomposition of the Hessian.
       - Fix 1
       - Fix 2
       - Fix 3 ("proximal method")
       - Fix 4
     - In my experience Fix 3 works best.
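The eigenvalue-based repairs can be sketched as follows. The exact formulas for Fixes 1-4 are not reproduced on this slide, so the code shows one common variant: eigendecompose the Hessian and clamp its eigenvalues to a small positive floor, which guarantees the modified Newton step is a descent direction:

```python
import numpy as np

def make_positive_definite(H, eps=1e-6):
    """One common fix (an assumption, not necessarily the slide's
    Fix 1-4): H = V diag(lam) V^T, with eigenvalues below eps
    raised to eps, so the result is positive definite."""
    lam, V = np.linalg.eigh(H)           # symmetric eigendecomposition
    lam = np.maximum(lam, eps)           # clamp negative/zero eigenvalues
    return V @ np.diag(lam) @ V.T

H = np.array([[2.0, 0.0], [0.0, -1.0]])  # indefinite Hessian (saddle)
H_fixed = make_positive_definite(H)
```

The modified step -H_fixed^{-1} g then always points downhill, at the cost of one eigendecomposition per iteration.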

  25. Example 1: gradient descent with backtracking line search vs. Newton's method
     (Figure source: Boyd and Vandenberghe)

  26. Example 2: gradient descent vs. Newton's method
     (Figure source: Boyd and Vandenberghe)

  27. Larger Version of Example 2

  28. Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

  29. Example 3
     - Gradient descent
     - Newton's method (converges in one step if f is a convex quadratic)

  30. Quasi-Newton Methods
     - Quasi-Newton methods use an approximation of the Hessian.
     - Example 1: compute only the diagonal entries of the Hessian and set the others to zero. Note this also simplifies computations done with the Hessian.
     - Example 2: natural gradient --- see next slide.
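Example 1 above can be sketched as follows; with only the diagonal kept, the "Newton" step decouples into per-coordinate scaling. In this separable example the diagonal happens to be the full Hessian, so the step matches exact Newton; in general it is only an approximation:

```python
import numpy as np

def diagonal_newton_step(grad, hess_diag, x):
    """Quasi-Newton step using only the diagonal of the Hessian:
    each coordinate is scaled by 1 / (its diagonal entry)."""
    return -grad(x) / hess_diag(x)

# Separable quadratic: the diagonal IS the Hessian here, so one
# step lands on the minimum at the origin.
grad = lambda x: np.array([2 * x[0], 100 * x[1]])
hess_diag = lambda x: np.array([2.0, 100.0])

x = np.array([3.0, -2.0])
for _ in range(5):
    x = x + diagonal_newton_step(grad, hess_diag, x)
```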

  31. Natural Gradient
     - Consider a standard maximum likelihood problem: max over θ of the sum over data points of log p(x; θ).
     - Gradient: sum of grad_θ log p(x; θ) over the data.
     - Hessian: each term grad²_θ log p(x; θ) decomposes into a term involving grad²_θ p and a second term, minus the outer product (grad_θ log p)(grad_θ log p)^T.
     - Natural gradient: only keeps the 2nd term in the Hessian.
     - Benefits: (1) faster to compute (only gradients needed); (2) guaranteed to be negative definite; (3) found to be superior in some experiments; (4) invariant to re-parametrization.
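A minimal numerical sketch of the idea, under an assumed unit-variance Gaussian mean model (not an example from the slides): the Hessian is replaced by the empirical outer-product (Fisher information) term, which needs only gradients of log p:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)   # true mean = 2

def score(theta, x):
    """grad_theta log p(x; theta) for a unit-variance Gaussian mean model."""
    return np.atleast_1d(x - theta)

def natural_gradient(theta, data):
    """Solve F d = g, where g is the average score and
    F = E[score score^T] is the empirical Fisher information
    (the outer-product term kept from the Hessian)."""
    g = np.mean([score(theta, x) for x in data], axis=0)
    F = np.mean([np.outer(score(theta, x), score(theta, x)) for x in data],
                axis=0)
    return np.linalg.solve(F, g)

theta = 0.0
step = natural_gradient(theta, data)   # ascent direction toward the MLE
```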

  32. Natural Gradient
     - Property: the natural gradient is invariant to the parameterization of the family of probability distributions p(x; θ). Hence the name.
     - Note this property is stronger than the corresponding property of Newton's method, which is invariant to affine re-parameterizations only.
     - Exercise: try to prove this property!

  33. Natural Gradient Invariant to Reparametrization --- Proof
     - Write the natural gradient for the parametrization with θ; then let φ = f(θ) and compare with the natural gradient for the φ parametrization.
     → The natural gradient direction is the same independent of the (invertible, but otherwise unconstrained) reparametrization f.

  34. Outline
     - Unconstrained minimization
       - Gradient descent
       - Newton's method
     - Equality constrained minimization
     - Inequality and equality constrained minimization

  35. Equality Constrained Minimization
     - Problem to be solved: minimize f(x) subject to A x = b.
     - We will cover three solution methods:
       - Elimination
       - Newton's method
       - Infeasible start Newton method

  36. Method 1: Elimination
     - From linear algebra we know that there exists a matrix F (in fact infinitely many) such that any solution of A x = b can be written as x = x̂ + F z, where x̂ is any particular solution of A x = b and the columns of F span the nullspace of A.
     - A way to find an F: compute the SVD of A, A = U Σ V^T. For A having k nonzero singular values, set F = V(:, k+1:end) (the right singular vectors for the zero singular values span the nullspace).
     - So we can solve the equality constrained minimization problem by solving an unconstrained minimization problem over the new variable z.
     - Potential cons: (i) need to first find a solution to A x = b; (ii) need to find F; (iii) elimination might destroy sparsity in the original problem structure.
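The SVD recipe above can be sketched directly in NumPy; any feasible x is then x̂ + F z for some z, so minimizing over z is unconstrained:

```python
import numpy as np

def nullspace_parametrization(A, b):
    """Parametrize {x : A x = b} as x = x_hat + F z, with x_hat a
    particular solution and the columns of F spanning null(A)."""
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)  # one solution of A x = b
    U, s, Vt = np.linalg.svd(A)
    k = int(np.sum(s > 1e-10))                     # numerical rank
    F = Vt[k:].T                                   # nullspace basis: V(:, k+1:end)
    return x_hat, F

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
x_hat, F = nullspace_parametrization(A, b)
```

For any z, A (x_hat + F z) = b, so the equality constraint is satisfied by construction.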

  37. Methods 2 and 3 Require Us to First Understand the Optimality Condition
     - Recall the problem to be solved: minimize f(x) subject to A x = b.

  38. Method 2: Newton's Method
     - Problem to be solved: minimize f(x) subject to A x = b.
     - Assume x is feasible, i.e., satisfies A x = b. Now use the 2nd-order approximation of f around x.
     → Optimality condition for the 2nd-order approximation: minimize the quadratic model over steps Δx satisfying A Δx = 0.

  39. Method 2: Newton's Method
     - The Newton step is obtained by solving a linear system of equations (the KKT system of the quadratic approximation).
     - This gives a feasible descent method: every iterate satisfies A x = b.
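The linear system for the feasible Newton step can be sketched as the standard KKT system: stacking the Hessian with the constraint matrix forces A Δx = 0, so x + Δx stays feasible:

```python
import numpy as np

def equality_newton_step(grad, hess, A, x):
    """Feasible Newton step for min f(x) s.t. A x = b: solve
        [H  A^T] [dx]   [-g]
        [A   0 ] [w ] = [ 0]
    so that A dx = 0 and x + dx remains feasible."""
    g, H = grad(x), hess(x)
    n, p = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-g, np.zeros(p)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]

# Minimize x1^2 + x2^2 subject to x1 + x2 = 1, from a feasible start.
grad = lambda x: 2 * x
hess = lambda x: 2 * np.eye(2)
A = np.array([[1.0, 1.0]])
x = np.array([1.0, 0.0])                  # feasible: x1 + x2 = 1
x_new = x + equality_newton_step(grad, hess, A, x)
```

Since the objective is quadratic, a single step lands on the constrained minimizer (0.5, 0.5).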

  40. Method 3: Infeasible Start Newton Method
     - Problem to be solved: minimize f(x) subject to A x = b.
     - Use the 1st-order approximation of the optimality conditions at the current x, which need not satisfy A x = b.
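Linearizing the optimality conditions grad f(x) + A^T ν = 0, A x = b at the current (x, ν) gives the same KKT matrix as before but with residuals on the right-hand side, so the current point may be infeasible; a sketch:

```python
import numpy as np

def infeasible_newton_step(grad, hess, A, b, x, nu):
    """Infeasible start Newton step: solve
        [H  A^T] [dx ]   [-(g + A^T nu)]
        [A   0 ] [dnu] = [-(A x - b)   ]
    i.e., Newton's method on the residuals of the optimality conditions."""
    g, H = grad(x), hess(x)
    n, p = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = -np.concatenate([g + A.T @ nu, A @ x - b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# Same quadratic problem as before, but starting infeasible at the origin.
grad = lambda x: 2 * x
hess = lambda x: 2 * np.eye(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, nu = np.zeros(2), np.zeros(1)
dx, dnu = infeasible_newton_step(grad, hess, A, b, x, nu)
x_new = x + dx
```

For a quadratic objective and linear constraints, one step already reaches the feasible optimum, even from an infeasible start.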

  41. Methods 2 and 3 Require Us to First Understand the Optimality Condition n Recall the problem to be solved:

  42. Optimal Control
     - We can now solve the optimal control problem posed as an equality constrained minimization.
     - Often one can solve it efficiently by iterating over (i) linearizing the constraints and (ii) solving the resulting problem.

  43. Optimal Control: A Complete Algorithm
     - Given: the optimal control problem data (dynamics model, cost, current state).
     - For k = 0, 1, 2, ..., T:
       - Solve the optimal control problem from the current state.
       - Execute u_k.
       - Observe the resulting state.
     → This is an instantiation of Model Predictive Control.
     → Initialization with the solution from iteration k-1 can make the solver very fast (and would be done most conveniently with the infeasible start Newton method).
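The receding-horizon loop can be sketched as follows. As a stand-in for the constrained Newton solvers above, the inner solve here is a finite-horizon LQR computed by a backward Riccati recursion, on a hypothetical discrete-time double-integrator model (all problem data below are illustrative assumptions, not from the slides):

```python
import numpy as np

def mpc_step(A, B, Q, R, x, T=20):
    """One MPC step: solve the T-step LQR problem from the current
    state x via a backward Riccati recursion, then return only the
    first input u_0 of the plan."""
    P = Q.copy()
    for _ in range(T):                    # backward pass
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return -K @ x                         # first input of the plan

# Hypothetical double integrator, dt = 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

x = np.array([5.0, 0.0])
for _ in range(200):                      # closed loop: re-plan every step
    u = mpc_step(A, B, Q, R, x)
    x = A @ x + B @ u                     # observe the resulting state
```

Only the first planned input is executed each step; re-solving from the observed state is what makes this Model Predictive Control rather than open-loop trajectory optimization.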

  44. Outline
     - Unconstrained minimization
     - Equality constrained minimization
     - Inequality and equality constrained minimization

  45. Equality and Inequality Constrained Minimization
     - Recall the problem to be solved: minimize f(x) subject to inequality constraints and the equality constraints A x = b.
