ECS231 Least-squares problems (Introduction to Randomized Algorithms) May 21, 2019
Outline
1. Linear least squares – review
2. Solving LS by sampling
3. Solving LS by randomized preconditioning
4. Gradient-based optimization – review
5. Solving LS by gradient descent
6. Solving LS by stochastic gradient descent
Review: Linear least squares
◮ Linear least squares problem: min_x ‖Ax − b‖_2
◮ Normal equation: A^T A x = A^T b
◮ Optimal solution: x = A^+ b (see the small check below)
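As a small MATLAB check of these three characterizations (a minimal sketch on an assumed random test problem with m > n):

>> m = 100; n = 10;                        % assumed tall, full-rank test problem
>> A = randn(m,n); b = randn(m,1);
>> x_bs = A\b;                             % least-squares solve via backslash
>> x_ne = (A'*A)\(A'*b);                   % solve the normal equation A'*A*x = A'*b
>> x_pi = pinv(A)*b;                       % pseudoinverse solution x = A^+ b
>> [norm(x_bs - x_ne), norm(x_bs - x_pi)]  % all three agree up to rounding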
Solving LS by sampling
◮ MATLAB demo code: lsbysampling.m
>> ...
>> A = rand(m,n); b = rand(m,1);
>> sampled_rows = find( rand(m,1) < 10*n*log(n)/m );  % keep each row w.p. 10*n*log(n)/m
>> A1 = A(sampled_rows,:);
>> b1 = b(sampled_rows);
>> x1 = A1\b1;                                        % solve the sampled LS problem
>> ...
◮ Further reading: Avron et al., SIAM J. Sci. Comput., 32:1217-1236, 2010
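To get a feel for the accuracy of the sampled solution, one can compare x1 with the full-data solve; the sizes below are assumptions chosen so that the sampling probability 10*n*log(n)/m is well below 1:

>> m = 10000; n = 20;                              % assumed sizes
>> A = rand(m,n); b = rand(m,1);
>> x  = A\b;                                       % full least-squares solution
>> sampled_rows = find( rand(m,1) < 10*n*log(n)/m );
>> x1 = A(sampled_rows,:)\b(sampled_rows);         % sampled least-squares solution
>> norm(x1 - x)/norm(x)                            % rough measure of the sampling error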
Solving LS by randomized preconditioning
◮ Linear least squares problem: min_x ‖A^T x − b‖_2
◮ Normal equation: (A A^T) x = A b
◮ If we can find a P such that P^{-1} A is well-conditioned, then it yields
  x = (A A^T)^{-1} A b = P^{-T} · ( P^{-1}A · (P^{-1}A)^T )^{-1} · P^{-1}A · b
Solving LS by randomized preconditioning
◮ MATLAB demo code: lsbyrandprecond.m
>> ...
>> ell = m+4;
>> G = randn(n,ell);
>> S = A*G;                 % sketching of A
>> [Q,R,E] = qr(S');        % QR w. col. pivoting: S'*E = Q*R
>> P = E*R(1:m,1:m)';       % preconditioner P
>> B = P\A;
>> PAcondnum = cond(B)      % the condition number
>> ...
◮ Further reading: Coakley et al., SIAM J. Sci. Comput., 33:849-868, 2011
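With the well-conditioned B = P\A in hand, the formula on the previous slide gives the LS solution directly; a minimal sketch of this final step, assuming A, b, P, and B from the demo above are already in the workspace:

>> % x = P^{-T} * (B*B')^{-1} * B*b, with B = P\A well-conditioned
>> y = (B*B')\(B*b);        % solve the preconditioned normal equations
>> x = P'\y;                % undo the preconditioning
>> norm(A'*x - b)           % residual of min_x ||A'*x - b||_2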
Review: Gradient-based optimization
◮ Optimization problem: x* = argmin_x f(x)
◮ Gradient: ∇_x f(x)
  First-order approximation: f(x + ∆x) = f(x) + ∆x^T ∇_x f(x) + O(‖∆x‖_2^2)
  Directional derivative (the derivative of f(x + αu) with respect to α at α = 0): ∂/∂α f(x + αu) = u^T ∇_x f(x)
◮ To minimize f(x), we would like to find the direction u in which f decreases the fastest. Using the directional derivative,
  f(x + αu) = f(x) + α u^T ∇_x f(x) + O(α^2).
  Note that
  min_{u, u^T u = 1} u^T ∇_x f(x) = min_{u, u^T u = 1} ‖u‖_2 ‖∇_x f(x)‖_2 cos θ = −‖∇_x f(x)‖_2,
  where θ is the angle between u and ∇_x f(x); the minimum is attained when u points opposite to ∇_x f(x). Therefore, the steepest descent direction is u = −∇_x f(x) (up to normalization).
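As a quick numerical sanity check of the directional-derivative formula, one can compare a finite-difference slope with u^T ∇_x f(x) on the least-squares objective f(x) = ½‖Ax − b‖_2^2; the problem sizes below are assumptions for illustration:

>> m = 50; n = 5; A = randn(m,n); b = randn(m,1);  % assumed test problem
>> f = @(x) 0.5*norm(A*x - b)^2;                   % objective f(x)
>> g = @(x) A'*(A*x - b);                          % its gradient
>> x = randn(n,1); u = randn(n,1); u = u/norm(u);  % point x and unit direction u
>> alpha = 1e-6;
>> [(f(x + alpha*u) - f(x))/alpha, u'*g(x)]        % the two values agree to several digits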
Review: Gradient-based optimization, cont'd
◮ The method of steepest descent:
  x' = x − ǫ · ∇_x f(x),
  where the "learning rate" ǫ can be chosen as follows:
  1. ǫ = small constant,
  2. ǫ = argmin_ǫ f(x − ǫ · ∇_x f(x)) (exact line search),
  3. evaluate f(x − ǫ ∇_x f(x)) for several different values of ǫ and choose the one that results in the smallest objective function value (see the sketch below).
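A minimal sketch of choice 3 for a generic objective; the function handles fhandle (objective) and ghandle (gradient) and the candidate list eps_list are hypothetical placeholders, not part of the original demo codes:

>> eps_list = [1e-3 1e-2 1e-1 1];                      % assumed candidate learning rates
>> g = ghandle(x);                                     % gradient at the current iterate (ghandle is hypothetical)
>> fvals = arrayfun(@(e) fhandle(x - e*g), eps_list);  % objective value for each trial step
>> [~, imin] = min(fvals);                             % pick the trial step with the smallest objective
>> x = x - eps_list(imin)*g;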
Solving LS by gradient descent
◮ Minimization problem: min_x f(x) = min_x ½‖Ax − b‖_2^2
◮ Gradient: ∇_x f(x) = A^T A x − A^T b
◮ The method of gradient descent:
  ◮ set the stepsize ǫ and tolerance δ to small positive numbers
  ◮ while ‖A^T A x − A^T b‖_2 > δ do
      x ← x − ǫ · (A^T A x − A^T b)
  ◮ end while
Solving LS by gradient descent
◮ MATLAB demo code: lsbygd.m
>> ...
>> r = A'*(A*x - b);           % gradient of f at the current iterate x
>> xp = x - tau*r;             % gradient-descent step with stepsize tau
>> res(k) = norm(r);           % record the gradient norm
>> if res(k) <= tol, ... end   % stop when the gradient is small enough
>> ...
>> x = xp;
>> ...
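For reference, a minimal runnable version of the loop that the demo sketches, under assumed problem sizes and with the conservative fixed stepsize tau = 1/‖A‖_2^2 (an exact line search would instead use tau = (r'*r)/norm(A*r)^2 at each iteration):

>> m = 200; n = 10; A = randn(m,n); b = randn(m,1);   % assumed test problem
>> x = zeros(n,1); tau = 1/norm(A)^2; tol = 1e-8; maxit = 5000;
>> for k = 1:maxit
>>     r = A'*(A*x - b);          % gradient
>>     res(k) = norm(r);
>>     if res(k) <= tol, break; end
>>     x = x - tau*r;             % fixed-stepsize gradient-descent update
>> end
>> norm(x - A\b)                  % compare with the direct solution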
Solving LS by stochastic gradient descent
◮ Minimization problem:
  x* = argmin_x ½‖Ax − b‖_2^2 = argmin_x (1/n) Σ_{i=1}^{n} f_i(x) = argmin_x E[f_i(x)],
  where f_i(x) = (n/2)(⟨a_i, x⟩ − b_i)^2 and a_1, a_2, ..., a_n are the rows of A.
◮ Gradient: ∇_x f_i(x) = n(⟨a_i, x⟩ − b_i) a_i.
◮ The stochastic gradient descent (SGD) method solves the LS problem by iteratively moving along the negative gradient of a randomly selected f_{i_k}:
  x_{k+1} ← x_k − γ · ∇f_{i_k}(x_k),
  where the index i_k is selected in the k-th iteration either
  ◮ uniformly at random, or
  ◮ by weighted sampling¹.
¹ D. Needell et al., Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, Math. Program. Ser. A (2016) 155:549-573.
Solving LS by stochastic gradient descent
◮ MATLAB demo code: lsbysgd.m
>> ...
>> s = rand;
>> i = sum(s >= cumsum([0, prob]));      % index i picked with probability prob(i)
>> dx = n*(A(i,:)*x0 - b(i))*A(i,:);     % stochastic gradient of f_i at x0
>> x = x0 - (gamma/(n*prob(i)))*dx';     % weighted SGD update
>> ...
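The demo leaves the sampling weights prob, the stepsize gamma, and the surrounding loop unspecified; below is a minimal runnable sketch assuming row-norm-weighted sampling prob(i) ∝ ‖a_i‖_2^2 and the stepsize γ = 1/‖A‖_F^2, which turns the weighted SGD update into a randomized Kaczmarz step (the problem sizes and the nearly consistent test system are assumptions for illustration):

>> n = 200; d = 10;                              % assumed sizes: n rows (as on this slide), d unknowns
>> A = randn(n,d); xtrue = randn(d,1);
>> b = A*xtrue + 1e-3*randn(n,1);                % nearly consistent test system
>> prob = sum(A.^2,2)'/norm(A,'fro')^2;          % row-norm weighted sampling: prob(i) ~ ||a_i||^2
>> gamma = 1/norm(A,'fro')^2;                    % this gamma makes the update a randomized Kaczmarz step
>> x0 = zeros(d,1);
>> for k = 1:20000
>>     s = rand;
>>     i = sum(s >= cumsum([0, prob]));          % index i drawn with probability prob(i)
>>     dx = n*(A(i,:)*x0 - b(i))*A(i,:);         % stochastic gradient of f_i at x0
>>     x0 = x0 - (gamma/(n*prob(i)))*dx';        % weighted SGD update
>> end
>> norm(x0 - A\b)     % with a fixed stepsize, SGD reaches a small neighborhood of the LS solution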