

  1. Hilbert’s 13th Problem: Great Theorem; Shame about the Algorithm (Bill Moran)

  2. Structure of Talk: Solving Polynomial Equations; Hilbert’s 13th Problem; Kolmogorov-Arnold Theorem; Neural Networks

  3. Quadratic Equations
     ax^2 + bx + c = 0,   x = (-b ± √(b^2 - 4ac)) / (2a)
     How do we do it? Eliminate the linear term by replacing x with y = x + b/(2a):
     ay^2 + (c - b^2/(4a)) = 0
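The two routes on this slide can be checked numerically. A minimal Python sketch (function names are my own) that solves a quadratic both by the standard formula and by first depressing it with y = x + b/(2a); it assumes real roots:

```python
import math

def solve_quadratic(a, b, c):
    """Roots of ax^2 + bx + c = 0 via the standard formula."""
    disc = b * b - 4 * a * c
    r = math.sqrt(disc)  # assume real roots for this sketch
    return ((-b + r) / (2 * a), (-b - r) / (2 * a))

def solve_by_depressing(a, b, c):
    """Same roots via y = x + b/(2a): the equation becomes
    a*y^2 + (c - b^2/(4a)) = 0, which has no linear term."""
    c_prime = c - b * b / (4 * a)
    y = math.sqrt(-c_prime / a)
    return (y - b / (2 * a), -y - b / (2 * a))

print(solve_quadratic(1, -5, 6))      # (3.0, 2.0)
print(solve_by_depressing(1, -5, 6))  # (3.0, 2.0)
```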

  4. What about Cubics?
     ax^3 + bx^2 + cx + d = 0   (1)
     Eliminate the x^2 term: replace x with y = x + b/(3a), giving
     y^3 + c'y + d' = 0
     Write y = u + v:
     u^3 + v^3 + (3uv + c')(u + v) + d' = 0
     Set 3uv + c' = 0; then v = -c'/(3u) and
     u^3 - (c'/(3u))^3 + d' = 0
     This is a quadratic in u^3: solve that quadratic and take cube roots.
     This gives u, then v, then y, and finally x. (del Ferro, Tartaglia, Cardano, 1530)
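The del Ferro/Cardano steps above translate directly into code. A minimal Python sketch (function name my own) that returns one root by depressing the cubic, writing y = u + v, and solving the quadratic in u^3:

```python
import cmath

def cardano_root(a, b, c, d):
    """One root of ax^3 + bx^2 + cx + d = 0 by the del Ferro/Cardano steps:
    depress with y = x + b/(3a), write y = u + v, force 3uv + c' = 0."""
    # Depressed cubic y^3 + p*y + q = 0
    p = c / a - b * b / (3 * a * a)
    q = 2 * b**3 / (27 * a**3) - b * c / (3 * a * a) + d / a
    # t = u^3 solves the quadratic t^2 + q*t - (p/3)^3 = 0
    t = (-q + cmath.sqrt(q * q + 4 * (p / 3) ** 3)) / 2
    u = t ** (1 / 3) if t != 0 else 0
    v = -p / (3 * u) if u != 0 else (-q) ** (1 / 3)
    y = u + v
    return y - b / (3 * a)

x = cardano_root(1, -6, 11, -6)            # cubic with roots 1, 2, 3
print(x, abs(x**3 - 6 * x**2 + 11 * x - 6))  # residual should be ~0
```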

  5. Let’s be a little more adventurous
     ax^4 + bx^3 + cx^2 + dx + e = 0
     A similar trick to the cubic case removes the cubic term:
     y^4 + py^2 + qy + r = 0
     Complete the square:
     (y^2 + p/2)^2 = p^2/4 - qy - r
     Introduce a new variable z and expand (y^2 + p/2 + z)^2; this is
     (y^2 + p/2)^2 + 2zy^2 + pz + z^2. Then
     (y^2 + p/2 + z)^2 = 2zy^2 - qy + (z^2 + zp + p^2/4 - r)

  6. Quartic Continued
     Choose z to make the RHS a perfect square in y, i.e. set its discriminant to 0:
     q^2 = 8z(z^2 + zp + p^2/4 - r)
     Solve this cubic for z; then we have A^2 = B^2, where
     A = y^2 + p/2 + z   and   B^2 = 2zy^2 - qy + (z^2 + zp + p^2/4 - r)
     A = ±B gives two quadratics in y. (Lodovico Ferrari, Cardano)
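Ferrari's procedure can be sketched in Python (function names my own; it reuses the Cardano steps from the cubic slide to solve the resolvent, and assumes the chosen resolvent root z is nonzero, the generic case):

```python
import cmath

def _resolvent_root(B, C, D):
    """One root of z^3 + B*z^2 + C*z + D = 0 by Cardano's method."""
    P = C - B * B / 3
    Q = 2 * B**3 / 27 - B * C / 3 + D
    t = (-Q + cmath.sqrt(Q * Q + 4 * (P / 3) ** 3)) / 2
    u = t ** (1 / 3) if abs(t) > 1e-30 else 0
    v = -P / (3 * u) if u != 0 else (-Q) ** (1 / 3)
    return u + v - B / 3

def ferrari_roots(a, b, c, d, e):
    """All four roots of ax^4 + bx^3 + cx^2 + dx + e = 0 by Ferrari's
    method; assumes the resolvent root z is nonzero (generic case)."""
    b, c, d, e = b / a, c / a, d / a, e / a
    # Depress with x = y - b/4:  y^4 + p*y^2 + q*y + r = 0
    p = c - 3 * b * b / 8
    q = d - b * c / 2 + b**3 / 8
    r = e - b * d / 4 + b * b * c / 16 - 3 * b**4 / 256
    # Discriminant-zero condition q^2 = 8z(z^2 + zp + p^2/4 - r),
    # i.e. z^3 + p*z^2 + (p^2/4 - r)*z - q^2/8 = 0
    z = _resolvent_root(p, p * p / 4 - r, -q * q / 8)
    s = cmath.sqrt(2 * z)   # then B = s*y - q/(2s), and A^2 = B^2
    roots = []
    for sign in (1, -1):    # A = ±B gives two quadratics in y
        disc = cmath.sqrt(s * s - 4 * (p / 2 + z + sign * q / (2 * s)))
        roots += [(sign * s + disc) / 2, (sign * s - disc) / 2]
    return [y - b / 4 for y in roots]

for x in ferrari_roots(1, -10, 35, -50, 24):  # roots are 1, 2, 3, 4
    print(x)
```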

  7. Quintic
     ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0   (2)
     Tschirnhaus transformations: y = g(x)/h(x), with g and h polynomials and h non-vanishing at the roots of the quintic.
     Tschirnhaus transformations reduce (2) to the Bring-Jerrard form:
     x^5 - x + q = 0   (3)
     where q is some rational function of the coefficients in (2).
     Solutions of (2) can be obtained as rational functions of the roots of (3) (Hermite). Elliptic modular functions involving q are used to solve (3).

  8. Lest you think this is useless nonsense!

  9. Sextic
     ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + fx + g = 0   (4)
     Tschirnhaus transformations reduce this to
     x^6 + px^2 + qx + 1 = 0   (5)
     Its solution is φ(p, q). The solution uses derivatives of generalized hypergeometric functions with respect to their parameters, called Kampé de Fériet functions.

  10. Septic
     ax^7 + bx^6 + cx^5 + dx^4 + ex^3 + fx^2 + gx + h = 0   (6)
     Tschirnhaus transformations reduce this to
     x^7 + px^3 + qx^2 + rx + 1 = 0   (7)
     Its solution is φ(p, q, r).
     Hilbert: can we express φ(p, q, r) in terms of functions of 2 variables? This is a measure of the complexity of the problem.

  11. What this means
     A function f(x_1, x_2, ..., x_n) of n variables is a superposition of functions g_k(y_{k,1}, y_{k,2}, ..., y_{k,r_k}) (k = 1, 2, ..., m) if each y_{k,i} is one of the variables x_j and there is a function h so that
     f(x_1, x_2, ..., x_n) = h(g_1(y_{1,1}, ..., y_{1,r_1}), g_2(y_{2,1}, ..., y_{2,r_2}), ..., g_m(y_{m,1}, ..., y_{m,r_m}))
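A concrete (hypothetical) instance in Python: the 3-variable function f(x1, x2, x3) = x1·x2 + x2·x3 is a superposition with m = 2, where g1 and g2 are functions of two variables and h(u, v) = u + v:

```python
# f(x1, x2, x3) = x1*x2 + x2*x3 written as a superposition
# of 2-variable functions: f = h(g1(x1, x2), g2(x2, x3))
g1 = lambda x1, x2: x1 * x2
g2 = lambda x2, x3: x2 * x3
h = lambda u, v: u + v

def f(x1, x2, x3):
    return h(g1(x1, x2), g2(x2, x3))

print(f(2, 3, 4))  # 2*3 + 3*4 = 18
```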

  12. Solutions of Polynomial Equations and Superposition
     Every solution of a polynomial equation of degree < 7 can be written as a superposition of functions of ≤ 2 variables.
     Every solution of a polynomial equation of degree n can be written as a superposition of functions of ≤ n − 4 variables.
     What about degree 7?

  13. Hilbert’s 13th Problem
     “A solution of the general equation of degree 7 cannot be represented as a superposition of continuous functions of two variables.”
     What he meant to say was “algebraic” or “analytic” instead of “continuous”, as we shall see!

  14. Why this might be a useful idea
     Most functions we want to compute are composed of functions of at most two variables:
     (x, y) → x + y,  (x, y) → x·y,  (x, y) → x/y,  x → 1/x,  x → √x,  x → e^x,  x → log x,  x → sin x,  etc.
     To compute gradients of such functions one can use the chain rule. This approach computes the partial derivatives of functions of n variables efficiently.
     Kim, Nesterov, and Cherkasskii (1984): given such a computable function of n variables, one can compute the function and its gradient in only about 4 times as many operations as computing the function alone, for large n.
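The chain-rule bookkeeping behind results of this kind is what reverse-mode automatic differentiation does: build f from one- and two-variable primitives, then sweep the graph backwards once. A minimal Python sketch (the `Var` class is my own illustration, not the 1984 algorithm itself):

```python
import math

class Var:
    """Minimal reverse-mode autodiff: each node records its parents and the
    local derivative of its primitive; backward() accumulates chain-rule
    products into .grad for every input."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])
    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])
    def sin(self):
        return Var(math.sin(self.value), [(self, math.cos(self.value))])
    def backward(self, seed=1.0):
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(1.0), Var(2.0)
f = (x * y).sin() + x * x   # f = sin(x*y) + x^2
f.backward()
print(x.grad)  # df/dx = y*cos(x*y) + 2x
print(y.grad)  # df/dy = x*cos(x*y)
```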

  15. Enter Kolmogorov
     Every continuous function of n variables on the unit cube is a superposition of continuous functions of 3 variables.

  16. Enter Kolmogorov
     Every continuous function of n variables on the unit cube is a superposition of continuous functions of 3 variables.
     And Arnold: every continuous function of n variables on the unit cube is a superposition of continuous functions of 2 variables. (This resolves Hilbert’s 13th Problem.)

  17. Sprecher’s Version
     Sprecher: for each N ≥ 2 there is a Lipschitz function ψ in Lip(log 2 / log(2N + 2))(I) with the following property: for each δ > 0 there is a rational ε in the interval (0, δ) such that for all integers n (2 ≤ n ≤ N) and every continuous function f(x_1, x_2, ..., x_n) on I^n,
     f(x_1, x_2, ..., x_n) = Σ_{q=0}^{2n} g( Σ_{p=1}^{n} λ^p ψ(x_p + εq) + q )   (8)
     where g is continuous and λ > 0 is independent of f.

  18. Idea of Proof: first use discontinuous functions
     Let τ_k(x) be the k-th decimal place of x, so x = Σ_{k=1}^∞ τ_k(x)/10^k (assume no expansion ends 00000..., except 0 itself).
     Write ψ_r(x) = Σ_{k=1}^∞ τ_k(x)/10^{kn+r} for r = 0, 1, ..., n − 1.
     Now
     (x_1, x_2, ..., x_n) → Σ_{r=0}^{n−1} ψ_r(x_{r+1}) = κ(x_1, x_2, ..., x_n)
     is 1-1 and onto [0, 1], but not continuous! It interlaces the decimals.
     Define g(y) = f(κ^{−1}(y)). Then
     f(x_1, x_2, ..., x_n) = g( Σ_{r=0}^{n−1} ψ_r(x_{r+1}) )   (9)
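The interlacing-of-decimals map κ can be demonstrated directly. A small Python sketch (function names my own; it works with fixed-precision decimal strings rather than true infinite expansions):

```python
def interleave(xs, digits=6):
    """Interleave the decimal digits of the numbers xs in [0, 1) into a
    single number in [0, 1): a 1-1 map, but clearly not continuous."""
    n = len(xs)
    strs = [f"{x:.{digits}f}"[2:] for x in xs]  # digit strings after "0."
    mixed = "".join(strs[r][k] for k in range(digits) for r in range(n))
    return float("0." + mixed)

def deinterleave(y, n, digits=6):
    """Recover the n inputs by taking every n-th digit of y."""
    s = f"{y:.{n * digits}f}"[2:]
    return [float("0." + s[r::n]) for r in range(n)]

y = interleave([0.125, 0.5])
print(y)                   # 0.15205 (digits 1,2,5,... and 5,0,0,... interlaced)
print(deinterleave(y, 2))  # [0.125, 0.5]
```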

  19. How does it work?
     Two ideas:
     The map (x_1, x_2, ..., x_n) → Σ_{r=0}^{n−1} ψ_r(x_{r+1}) is 1-1. Ontoness is not needed, but we will need the ψ_r to be continuous.
     Then use g to “approximate” the values of f on the inverse of that map.
     Key issue: a continuous version of 1-1-ness. One cannot map I^n into one dimension in a 1-1 continuous way.

  20. Continuous Version
     Divide I = [0, 1] into 10 equal intervals and then shrink them slightly from their centres; call these E_1(j) (j = 0, 1, ..., 9).
     Repeat this construction 2n + 1 times (n is the number of variables in the function); call the families E_k(j).
     Shift the new E_k(j) (k > 1) along so that every x in I appears in all but at most one family E_k.
     (Figure: the shifted families E_0, E_1, E_2, E_3, E_4.)

  21. Done in Two Dimensions
     Take two copies E_k^(1)(j) and E_k^(2)(j) and consider E_k^(1)(j_1) × E_k^(2)(j_2).
     For each fixed k we can find increasing continuous functions ψ_{k,1} and ψ_{k,2} on I such that the sets
     ψ_{k,1}(E_k^(1)(j_1)) + ψ_{k,2}(E_k^(2)(j_2))
     are all disjoint for each fixed k, and lie in one dimension.
     Note: it is enough to do this for one k and then shift to cover all of the square; the square is covered in 2n + 1 shifts.

  22. Refine this
     Now divide I into 100 equal pieces and shrink slightly (less this time) from the centres to form the E_2(j).
     We can adjust the old ψ_{k,1} and ψ_{k,2} so that in the refined version the sets ψ_{k,1}(E_k^(1)(j_1)) + ψ_{k,2}(E_k^(2)(j_2)) are still all disjoint. Moreover, the adjustment need only be small, because the variation over the E_k(j)’s is small!
     Keep going... We end up with a sequence of compact sets E_k on each axis and functions ψ_{k,i} such that (x_1, x_2) → ψ_{k,1}(x_1) + ψ_{k,2}(x_2) is 1-1 on each member of the sequence, and each E_k is most of the interval.
     The union of 5 shifts of the E_k’s covers I^2.

  23. Approximate
     Fix a continuous function f on I^2.
     Approximate it by a function of the form g(ψ_{k,1}(x_1) + ψ_{k,2}(x_2)) over most of I^2. Using shifted forms of the ψ’s we can cover all of the square I^2.
     Given f continuous on I^2, there exists g_1 continuous on R with ‖g_1‖_∞ ≤ ‖f‖_∞ such that
     | f(x_1, x_2) − Σ_{k=1}^{5} g_1(ψ_{k,1}(x_1) + ψ_{k,2}(x_2)) | < (1 − ε)‖f‖_∞
     Induct: set f_1 = f and
     f_{r+1}(x_1, x_2) = f_r(x_1, x_2) − Σ_{k=1}^{5} g_r(ψ_{k,1}(x_1) + ψ_{k,2}(x_2))
     The partial sums of the g_r converge uniformly to some g, and f_r → 0 uniformly, so
     f(x_1, x_2) = Σ_{k=1}^{5} g(ψ_{k,1}(x_1) + ψ_{k,2}(x_2))

  24. But what about differentiable?
     f(x_1, x_2, ..., x_n) = Σ_{0 ≤ q ≤ 2n} g( Σ_{p=1}^{n} ψ_{p,q}(x_p) )   (∗)
     (Hilbert) There is an analytic function of three variables that cannot be expressed as a superposition of analytic functions of 2 variables.
     (Konrad, 1954) There is a continuously differentiable function of 3 variables that cannot be expressed as a superposition of continuously differentiable functions of 2 variables.
     (Fridman, 1967) One can replace the ψ’s by Lipschitz functions of exponent 1.
     (Vitushkin, 1964) There exist analytic functions not expressible by (∗) when the ψ’s are chosen continuously differentiable.

  25. Neural Networks
     A neuron is a node that takes as input a vector (y_1, y_2, ..., y_M) and outputs a value h( Σ_{m=1}^{M} w_m y_m − w_0 ), where the w_m are called weights.
     (Hecht-Nielsen, 1987) Kolmogorov-Arnold can be seen as a 3-layer neural network.
     (Figure: a network with an input layer of four inputs, a hidden layer, and an output layer with one output.)
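The neuron on this slide, and the 3-layer picture, can be sketched in a few lines of Python (all weights and the choice of sigmoid h are hypothetical):

```python
import math

def neuron(weights, w0, ys):
    """Output h(sum_m w_m*y_m - w0), with h a fixed sigmoid here."""
    s = sum(w * y for w, y in zip(weights, ys)) - w0
    return 1.0 / (1.0 + math.exp(-s))

def network(x, hidden, output):
    """A 3-layer feedforward net in the Hecht-Nielsen picture:
    input layer -> hidden layer of neurons -> one output neuron."""
    h = [neuron(w, w0, x) for (w, w0) in hidden]
    w, w0 = output
    return neuron(w, w0, h)

# Hypothetical weights for a 4-input, 3-hidden-unit, 1-output net
hidden = [([0.5, -0.2, 0.1, 0.3], 0.1),
          ([0.4, 0.4, -0.5, 0.2], -0.2),
          ([-0.3, 0.1, 0.2, 0.6], 0.0)]
output = ([1.0, -1.0, 0.5], 0.2)
print(network([1.0, 2.0, 3.0, 4.0], hidden, output))
```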

  26. Algorithmic Issues
     The functions involved are highly non-smooth and cannot be made smooth.
     We only get equality in (∗) by letting the iteration go to ∞.

  27. Making it Computationally Feasible
     We can live with ε-accuracy rather than equality, provided we know how many iterations are needed for a given level of accuracy.
     We can use Lipschitz functions! (Kurkova, “Kolmogorov’s Theorem is Relevant”, 1991-92)
     The number of iterations can be specified in terms of ε.
