Overview of Multivariate Optimization Topics

• Problem definition
• Algorithms
  – Cyclic coordinate method
  – Steepest descent
  – Conjugate gradient algorithms
  – PARTAN
  – Newton’s method
  – Levenberg-Marquardt
• Concise, subjective summary

J. McNames, Portland State University, ECE 4/557, Multivariate Optimization, Ver. 1.14

Multivariate Optimization Overview

• The “unconstrained optimization” problem is a generalization of the line search problem
• Find a vector a such that

      a∗ = argmin_a f(a)

• Note that there are no constraints on a
• Example: find the vector of coefficients (w ∈ R^(p×1)) that minimizes the average absolute error of a linear model
• Akin to a blind person trying to find their way to the bottom of a valley in a multidimensional landscape
• We want to reach the bottom with the minimum number of “cane taps”
• Also vaguely similar to taking core samples for oil prospecting

Example 2: Optimization Problem

[Figure: two contour plots of the example function f(a_1, a_2) over the region −5 ≤ a_1, a_2 ≤ 5]
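The linear-model example above can be made concrete with a short sketch. This is a Python illustration (the course code is MATLAB), with made-up data, a hypothetical true coefficient vector w = (1.5, −2.0), and a brute-force grid search that mirrors the [−5, 5]² grid used for the contour plots; it is not the lecture's own code.

```python
# Hypothetical toy data: y is exactly linear in the two inputs, so the
# average absolute error has its minimum (zero) at w = (1.5, -2.0).
X = [(1.0, 2.0), (3.0, -1.0), (-2.0, 0.5), (0.5, 4.0)]
y = [x1 * 1.5 + x2 * (-2.0) for (x1, x2) in X]

def f(w):
    """Average absolute error of the linear model y ~ X w (the objective)."""
    return sum(abs(x1 * w[0] + x2 * w[1] - yi)
               for (x1, x2), yi in zip(X, y)) / len(y)

# Brute-force argmin over the same 0.05-spaced grid on [-5, 5]^2 that the
# contour plots use; real algorithms replace this with far fewer evaluations.
grid = [i * 0.05 - 5.0 for i in range(201)]
w_star = min(((a1, a2) for a1 in grid for a2 in grid), key=f)
```

With 201 × 201 grid points this already costs 40,401 function evaluations for p = 2, which is the motivation for the descent algorithms that follow: they aim for the minimum with as few "cane taps" as possible.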
Example 1: Optimization Problem

[Figure: a gradient quiver plot and three 3-D surface renderings of the example function]

Example 1: MATLAB Code

function [] = OptimizationProblem();
%==============================================================================
% User-Specified Parameters
%==============================================================================
x = -5:0.05:5;
y = -5:0.05:5;

%==============================================================================
% Evaluate the Function
%==============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');

%==============================================================================
% Contour Map
%==============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');

%==============================================================================
% Quiver Map
%==============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr = length(xCoarse); % GCoarse stacks the two gradient components (2*nr rows)
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr+(1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');

%==============================================================================
% 3D Maps
%==============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1');
ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit');
material dull;
for c1 = 1:3
    switch c1
        case 1, view(45,10);
        case 2, view(-55,22);
        case 3, view(-131,10);
        otherwise, error('Not implemented.');
    end
    fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
    print(fileName,'-depsc');
    fprintf(fileIdentifier,'%%==============================================================================\n');
    fprintf(fileIdentifier,'\\newslide\n');
    fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
    fprintf(fileIdentifier,'%%==============================================================================\n');
    fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
    fprintf(fileIdentifier,'\n');
end

%==============================================================================
% List the MATLAB Code
%==============================================================================
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);
fclose(fileIdentifier);

Global Optimization?

• In general, all optimization algorithms find a local minimum in as few steps as possible
• There are also “global” optimization algorithms based on ideas such as
  – Evolutionary computing
  – Genetic algorithms
  – Simulated annealing
• None of these guarantee convergence in a finite number of iterations
• All require a lot of computation
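To make the simulated annealing idea concrete, here is a minimal Python sketch (not from the lecture): a random-walk search that occasionally accepts uphill moves, with probability exp(−Δf/t), so it can escape shallow local minima; the cooling schedule, step size, and test function are illustrative assumptions.

```python
import math
import random

def simulated_annealing(f, a0, step=0.5, t0=1.0, cooling=0.995, iters=5000, seed=0):
    """Minimal simulated annealing sketch for minimizing f over R^p."""
    rng = random.Random(seed)
    a, fa = list(a0), f(a0)
    best, fbest = list(a), fa
    t = t0
    for _ in range(iters):
        # Propose a Gaussian perturbation of the current point.
        cand = [x + rng.gauss(0.0, step) for x in a]
        fc = f(cand)
        # Always accept improvements; accept uphill moves with prob exp(-df/t).
        if fc < fa or rng.random() < math.exp(-(fc - fa) / max(t, 1e-12)):
            a, fa = cand, fc
            if fa < fbest:
                best, fbest = list(a), fa
        t *= cooling  # geometric cooling: the search becomes greedy over time
    return best, fbest

# Illustrative multimodal objective: the sin^2 term adds shallow local minima
# along a_1; the global minimum is 0 at the origin.
f = lambda a: a[0]**2 + a[1]**2 + 2.0 * math.sin(3 * a[0])**2
a, fa = simulated_annealing(f, [4.0, -4.0])
```

Note the trade-off stated above: even with thousands of evaluations there is no guarantee of reaching the global minimum in finite time, only a temperature-dependent chance of hopping between basins.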
Optimization Comments

• Ideally, when we construct models we should favor those which can be optimized with few shallow local minima and reasonable computation
• Graphically, you can think of the function to be minimized as the elevation in a complicated high-dimensional landscape
• The problem is to find the lowest point
• The most common approach is to go downhill
• The gradient points in the most “uphill” direction
• The steepest downhill direction is the opposite of the gradient
• Most optimization algorithms use a line search algorithm
• The methods mostly differ only in the way that the “direction of descent” is generated
• Note that the functions should (must) have continuous gradients (almost) everywhere

Optimization Algorithm Outline

• The basic steps of these algorithms are as follows
  1. Pick a starting vector a
  2. Find the direction of descent, d
  3. Move in that direction until a minimum is found:

         α∗ := argmin_α f(a + α d)
         a := a + α∗ d

  4. Loop to 2 until convergence
• Most of the theory of these algorithms is based on quadratic surfaces
• Near local minima, this is a good approximation

Cyclic Coordinate Method

1. For i = 1 to p,

       a_i := argmin_α f([a_1, a_2, ..., a_(i−1), α, a_(i+1), ..., a_p])

2. Loop to 1 until convergence

+ Simple to implement
+ Each line search can be performed semi-globally to avoid shallow local minima
+ Can be used with nominal variables
+ f(a) can be discontinuous
+ No gradient required
− Very slow compared to gradient-based optimization algorithms
− Usually only practical when the number of parameters, p, is small
• There are modified versions with faster convergence

Example 2: Cyclic Coordinate Method

[Figure: search path of the cyclic coordinate method on the example function, plotted over −5 ≤ X, Y ≤ 5]
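The two-step recipe above can be sketched directly. This is an illustrative Python version (not the course's MATLAB): each coordinate is minimized in turn by a golden-section line search over [−5, 5], matching the plot range of the examples; the quadratic test function and bounds are assumptions for the demo.

```python
def line_search(f, lo=-5.0, hi=5.0, tol=1e-6):
    """Golden-section line search on [lo, hi]; assumes the 1-D slice is unimodal."""
    g = (5 ** 0.5 - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:
            lo, c = c, d
            d = lo + g * (hi - lo)
    return (lo + hi) / 2

def cyclic_coordinate(f, a, sweeps=50):
    """Step 1: for i = 1..p, minimize over coordinate i alone.
    Step 2: repeat the sweep until (here, for a fixed number of) iterations."""
    a = list(a)
    for _ in range(sweeps):
        for i in range(len(a)):
            a[i] = line_search(lambda x, i=i: f(a[:i] + [x] + a[i + 1:]))
    return a

# Coupled quadratic example; the exact minimizer is (72/31, -40/31).
f = lambda a: (a[0] - 2) ** 2 + 2 * (a[1] + 1) ** 2 + 0.5 * a[0] * a[1]
a = cyclic_coordinate(f, [0.0, 0.0])
```

The cross term a_1 a_2 is what makes the path zig-zag, as in the Example 2 figure: each axis-aligned step shifts the optimum of the other coordinate, which is why the method converges slowly compared with gradient-based directions.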