Learning Selection Strategies in Buchberger’s Algorithm (Dylan Peifer)


  1. Learning Selection Strategies in Buchberger’s Algorithm. Dylan Peifer, Cornell University, 31 October 2019

  2. Outline

     The efficiency of Buchberger’s algorithm depends strongly on the choice of selection strategy. By phrasing Buchberger’s algorithm as a reinforcement learning problem and applying standard reinforcement learning techniques, we can learn new selection strategies that can match or beat the existing state of the art.

     1. Gröbner Bases and Buchberger’s Algorithm
     2. Reinforcement Learning and Policy Gradient
     3. Results

  3. 1. Gröbner Bases and Buchberger’s Algorithm

  4-6. $R = K[x_1, \ldots, x_n]$ is a polynomial ring over some field $K$, and $I = \langle f_1, \ldots, f_k \rangle \subseteq R$ is the ideal generated by $f_1, \ldots, f_k \in R$.

     Example: $R = \mathbb{Q}[x, y]$ = {polynomials in $x$ and $y$ with rational coefficients} and

     $$I = \langle x^2 - y^3,\ xy^2 + x \rangle = \{\, a(x^2 - y^3) + b(xy^2 + x) : a, b \in R \,\}.$$

     Question: In the above example, is $x^5 + x$ an element of $I$?

  7-10. Question: Consider the ideal $I = \langle x^2 + x - 2 \rangle$ in the ring $\mathbb{Q}[x]$. Is $x^3 + 3x^2 + 5x + 4$ an element of $I$?

     Long division by $x^2 + x - 2$ gives the quotient $x + 2$:

     $$\begin{array}{r}
     x^3 + 3x^2 + 5x + 4 \\
     -\,(x^3 + x^2 - 2x) \\ \hline
     2x^2 + 7x + 4 \\
     -\,(2x^2 + 2x - 4) \\ \hline
     5x + 8
     \end{array}$$

     Hence

     $$x^3 + 3x^2 + 5x + 4 = (x + 2)(x^2 + x - 2) + (5x + 8) \implies x^3 + 3x^2 + 5x + 4 \notin \langle x^2 + x - 2 \rangle.$$
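The long division above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `poly_divmod` and the coefficient-list representation (highest degree first) are assumptions made here, not something taken from the slides, and the divisor is assumed to have a nonzero leading coefficient.

```python
from fractions import Fraction

def poly_divmod(dividend, divisor):
    """Divide univariate polynomials given as coefficient lists
    (highest degree first). Returns (quotient, remainder)."""
    rem = [Fraction(c) for c in dividend]
    div = [Fraction(c) for c in divisor]
    quot = []
    while len(rem) >= len(div):
        factor = rem[0] / div[0]      # cancel the current leading term
        quot.append(factor)
        for i, c in enumerate(div):
            rem[i] -= factor * c
        rem.pop(0)                    # leading coefficient is now zero
    return quot, rem

# x^3 + 3x^2 + 5x + 4 divided by x^2 + x - 2
quotient, remainder = poly_divmod([1, 3, 5, 4], [1, 1, -2])
in_ideal = all(c == 0 for c in remainder)
```

In one variable every ideal of $\mathbb{Q}[x]$ is generated by a single polynomial, so a zero remainder is exactly the membership test; the rest of the section deals with what goes wrong in several variables.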

  11-12. Definition: Let $x^\alpha$ denote an arbitrary monomial, where $\alpha$ is the vector of exponents. A monomial order on $R = K[x_1, \ldots, x_n]$ is a relation $>$ on the monomials of $R$ such that

     1. $>$ is a total ordering,
     2. $>$ is a well-ordering,
     3. if $x^\alpha > x^\beta$ then $x^\gamma x^\alpha > x^\gamma x^\beta$ for any $x^\gamma$ (i.e., $>$ respects multiplication).

     Example: Lexicographic order (lex) is defined by $x^\alpha > x^\beta$ if the leftmost nonzero component of $\alpha - \beta$ is positive. For example, $x > y > z$, $xy > y^4$, and $xz > y^2$.
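A minimal sketch of the lex comparison, assuming monomials are stored as exponent tuples in the variable order $(x, y, z)$. The function name `lex_greater` is hypothetical and chosen only for this illustration.

```python
def lex_greater(alpha, beta):
    """True if x^alpha > x^beta in lex order, i.e. the leftmost
    nonzero component of alpha - beta is positive."""
    for a, b in zip(alpha, beta):
        if a != b:
            return a > b
    return False  # equal monomials

# Exponent vectors in the variables (x, y, z)
x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)
checks = [
    lex_greater(x, y) and lex_greater(y, z),   # x > y > z
    lex_greater((1, 1, 0), (0, 4, 0)),         # xy > y^4
    lex_greater((1, 0, 1), (0, 2, 0)),         # xz > y^2
]
```

For exponent tuples of equal length this agrees with Python's built-in tuple comparison, so `max` over a collection of exponent tuples picks out the lex leading monomial.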

  13-15. Divide $x^5 + x$ by the generators $x^2 - y^3$ and $xy^2 + x$. The quotients are $q_1 = x^3 - xy$ (for $x^2 - y^3$) and $q_2 = x^2y - y^2 + 1$ (for $xy^2 + x$):

     $$\begin{array}{r}
     x^5 + x \\
     -\,(x^5 - x^3y^3) \\ \hline
     x^3y^3 + x \\
     -\,(x^3y^3 + x^3y) \\ \hline
     -x^3y + x \\
     -\,(-x^3y + xy^4) \\ \hline
     -xy^4 + x \\
     -\,(-xy^4 - xy^2) \\ \hline
     xy^2 + x \\
     -\,(xy^2 + x) \\ \hline
     0
     \end{array}$$

     Hence

     $$x^5 + x = (x^3 - xy)(x^2 - y^3) + (x^2y - y^2 + 1)(xy^2 + x) + 0 \implies x^5 + x \in \langle x^2 - y^3,\ xy^2 + x \rangle.$$
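The division above can be sketched as follows. Several things here are assumptions made for illustration rather than details from the slides: polynomials are dictionaries mapping exponent tuples to coefficients, the monomial order is lex with $x > y$, and at each step the first divisor in the list whose leading term divides is used. The quotients depend on that last choice, so they need not match $q_1$ and $q_2$ above, although the remainder for this input is again zero.

```python
from fractions import Fraction

def divide(h, F):
    """Multivariate division of h by the list F under lex order.
    Polynomials are dicts {exponent tuple: coefficient}.
    Returns (quotients, remainder)."""
    h = {m: Fraction(c) for m, c in h.items() if c != 0}
    quotients = [dict() for _ in F]
    remainder = {}
    while h:
        lm = max(h)                  # lex leading monomial (tuple comparison)
        lc = h[lm]
        for q, f in zip(quotients, F):
            lm_f = max(f)
            if all(a >= b for a, b in zip(lm, lm_f)):   # LT(f) divides LT(h)
                shift = tuple(a - b for a, b in zip(lm, lm_f))
                coeff = lc / f[lm_f]
                q[shift] = q.get(shift, 0) + coeff
                for m, c in f.items():                  # h -= coeff * x^shift * f
                    key = tuple(a + b for a, b in zip(m, shift))
                    h[key] = h.get(key, 0) - coeff * c
                    if h[key] == 0:
                        del h[key]
                break
        else:
            # No leading term divides: move LT(h) to the remainder
            remainder[lm] = lc
            del h[lm]
    return quotients, remainder

# Exponent tuples are (power of x, power of y)
f1 = {(2, 0): 1, (0, 3): -1}   # x^2 - y^3
f2 = {(1, 2): 1, (1, 0): 1}    # x*y^2 + x
quotients, remainder = divide({(5, 0): 1, (1, 0): 1}, [f1, f2])  # x^5 + x
```

An empty `remainder` dictionary corresponds to the remainder $0$ on the slide.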

  16-18. Definition: When $F$ is a set of polynomials and dividing $h$ by the $f_i \in F$ using the division algorithm leads to the remainder $r$, we write $h \xrightarrow{F} r$ or say that $h$ reduces to $r$.

     Lemma: If $h \xrightarrow{F} 0$ then $h$ is in the ideal generated by $F$.

     Unfortunately, the converse is false.

     Example: Using the same ideal $I = \langle x^2 - y^3,\ xy^2 + x \rangle$, note that

     $$y^2(x^2 - y^3) - x(xy^2 + x) = -x^2 - y^5 \in I.$$

     However, multivariate division produces the nonzero remainder $-y^5 - y^3$.
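To see where the nonzero remainder comes from, the division can be written out in one line. This sketch assumes lex order with $x > y$, which is consistent with the remainder stated above. The leading term $-x^2$ is cancelled by $-1$ times the first generator, and neither $x^2$ nor $xy^2$ divides $y^5$ or $y^3$, so both remaining terms go to the remainder:

$$-x^2 - y^5 = (-1)(x^2 - y^3) + 0 \cdot (xy^2 + x) + (-y^5 - y^3).$$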

  19-20. Definition: Given a monomial order, a Gröbner basis $G$ of a nonzero ideal $I$ is a set of generators $\{g_1, g_2, \ldots, g_s\}$ of $I$ such that any of the following equivalent conditions hold:

     (i) $f \xrightarrow{G} 0 \iff f \in I$
     (ii) $f^G$ is unique for all $f \in R$
     (iii) $\langle \mathrm{LT}(g_1), \mathrm{LT}(g_2), \ldots, \mathrm{LT}(g_s) \rangle = \langle \mathrm{LT}(I) \rangle$

     where $\mathrm{LT}(f)$ is the leading term of $f$ and $\langle \mathrm{LT}(I) \rangle = \langle \mathrm{LT}(f) \mid f \in I \rangle$ is the ideal generated by all leading terms of $I$.

     Example: Using the same ideal $I = \langle x^2 - y^3,\ xy^2 + x \rangle$, the set $\{x^2 - y^3,\ xy^2 + x\}$ is not a Gröbner basis of $I$.
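One way to check this example against condition (iii), assuming lex order with $x > y$: the polynomial $-y^5 - y^3$ lies in $I$, because

$$-y^5 - y^3 = (y^2 + 1)(x^2 - y^3) - x(xy^2 + x),$$

but its leading term $-y^5$ is divisible by neither $x^2$ nor $xy^2$, so

$$\mathrm{LT}(-y^5 - y^3) = -y^5 \notin \langle x^2,\ xy^2 \rangle = \langle \mathrm{LT}(x^2 - y^3),\ \mathrm{LT}(xy^2 + x) \rangle.$$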

  21-23. Definition: Let

     $$S(f, g) = \frac{x^\gamma}{\mathrm{LT}(f)} f - \frac{x^\gamma}{\mathrm{LT}(g)} g,$$

     where $x^\gamma$ is the least common multiple of the leading monomials of $f$ and $g$. This is the s-polynomial of $f$ and $g$, where s stands for subtraction or syzygy.

     Example:

     $$\begin{aligned}
     S(x^2 - y^3,\ xy^2 + x) &= \frac{x^2y^2}{x^2}(x^2 - y^3) - \frac{x^2y^2}{xy^2}(xy^2 + x) \\
     &= y^2(x^2 - y^3) - x(xy^2 + x) \\
     &= -x^2 - y^5
     \end{aligned}$$

     Theorem (Buchberger’s Criterion): Let $G = \{g_1, g_2, \ldots, g_s\}$ generate the ideal $I$. If $S(g_i, g_j) \xrightarrow{G} 0$ for all pairs $g_i, g_j$, then $G$ is a Gröbner basis of $I$.
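The s-polynomial example above can be reproduced with a short sketch. As an assumption for illustration only, polynomials are dictionaries mapping exponent tuples (power of $x$, power of $y$) to coefficients and the order is lex with $x > y$; the function name `s_polynomial` is hypothetical.

```python
from fractions import Fraction

def s_polynomial(f, g):
    """S(f, g) = (x^gamma / LT(f)) * f - (x^gamma / LT(g)) * g, where
    x^gamma is the lcm of the lex leading monomials of f and g."""
    lm_f, lm_g = max(f), max(g)            # lex leading monomials
    lcm = tuple(max(a, b) for a, b in zip(lm_f, lm_g))
    result = {}
    for poly, lm, sign in ((f, lm_f, 1), (g, lm_g, -1)):
        shift = tuple(a - b for a, b in zip(lcm, lm))
        scale = Fraction(sign) / poly[lm]  # divide by the leading coefficient
        for m, c in poly.items():
            key = tuple(a + b for a, b in zip(m, shift))
            result[key] = result.get(key, 0) + scale * c
    # The two leading terms cancel; drop zero coefficients
    return {m: c for m, c in result.items() if c != 0}

f1 = {(2, 0): 1, (0, 3): -1}   # x^2 - y^3
f2 = {(1, 2): 1, (1, 0): 1}    # x*y^2 + x
s = s_polynomial(f1, f2)       # represents -x^2 - y^5
```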

  24. Algorithm: Buchberger’s Algorithm

      input: a set of polynomials {f_1, ..., f_k}
      output: a Gröbner basis G of I = ⟨f_1, ..., f_k⟩

      procedure Buchberger({f_1, ..., f_k})
          G ← {f_1, ..., f_k}                        ▷ the current basis
          P ← {(f_i, f_j) | 1 ≤ i < j ≤ k}           ▷ the remaining pairs
          while |P| > 0 do
              (f_i, f_j) ← select(P)
              P ← P \ {(f_i, f_j)}
              r ← S(f_i, f_j)^G                      ▷ remainder on division by G
              if r ≠ 0 then
                  P ← P ∪ {(f, r) : f ∈ G}
                  G ← G ∪ {r}
              end if
          end while
          return G
      end procedure
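The pseudocode can be turned into a small runnable sketch. This is not the implementation used in the talk. It assumes lex order, stores polynomials as dictionaries from exponent tuples to coefficients, tracks pairs as index pairs into `G`, and takes the selection strategy as a hypothetical `select(P, G)` callable that defaults to picking the first remaining pair.

```python
from fractions import Fraction
from itertools import combinations

def add_scaled(h, f, coeff, shift):
    """In place: h += coeff * x^shift * f."""
    for m, c in f.items():
        key = tuple(a + b for a, b in zip(m, shift))
        h[key] = h.get(key, 0) + coeff * c
        if h[key] == 0:
            del h[key]

def remainder(h, G):
    """Remainder of h on division by the list G (lex order)."""
    h, r = dict(h), {}
    while h:
        lm = max(h)                                   # lex leading monomial
        for g in G:
            lg = max(g)
            if all(a >= b for a, b in zip(lm, lg)):   # LT(g) divides LT(h)
                shift = tuple(a - b for a, b in zip(lm, lg))
                add_scaled(h, g, -Fraction(h[lm]) / g[lg], shift)
                break
        else:
            r[lm] = h.pop(lm)                         # nothing divides LT(h)
    return r

def s_polynomial(f, g):
    lf, lg = max(f), max(g)
    lcm = tuple(max(a, b) for a, b in zip(lf, lg))
    s = {}
    add_scaled(s, f, Fraction(1) / f[lf], tuple(a - b for a, b in zip(lcm, lf)))
    add_scaled(s, g, Fraction(-1) / g[lg], tuple(a - b for a, b in zip(lcm, lg)))
    return s

def buchberger(F, select=lambda P, G: P[0]):
    G = [dict(f) for f in F]                          # the current basis
    P = list(combinations(range(len(G)), 2))          # the remaining pairs
    while P:
        pair = select(P, G)                           # selection strategy goes here
        P.remove(pair)
        i, j = pair
        r = remainder(s_polynomial(G[i], G[j]), G)
        if r:                                         # r != 0
            n = len(G)
            P.extend((k, n) for k in range(n))
            G.append(r)
    return G

f1 = {(2, 0): 1, (0, 3): -1}   # x^2 - y^3
f2 = {(1, 2): 1, (1, 0): 1}    # x*y^2 + x
basis = buchberger([f1, f2])
```

Passing a different `select` changes only the order in which pairs are processed, which is the degree of freedom the talk proposes to learn. The sketch applies none of the standard pair-elimination criteria, so it is meant for small examples only.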

  25-28. Example: $I = \langle x^2 - y^3,\ xy^2 + x \rangle$

     - initialize $G$ to $\{x^2 - y^3,\ xy^2 + x\}$
     - initialize $P$ to $\{(x^2 - y^3,\ xy^2 + x)\}$
     - select $(x^2 - y^3,\ xy^2 + x)$ and compute $S(x^2 - y^3,\ xy^2 + x) \xrightarrow{G} -y^5 - y^3$
     - update $G$ to $\{x^2 - y^3,\ xy^2 + x,\ -y^5 - y^3\}$
     - update $P$ to $\{(x^2 - y^3,\ -y^5 - y^3),\ (xy^2 + x,\ -y^5 - y^3)\}$
     - select $(x^2 - y^3,\ -y^5 - y^3)$ and compute $S(x^2 - y^3,\ -y^5 - y^3) \xrightarrow{G} 0$
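The last step shown can be verified by hand. As a worked check, assuming lex order with $x > y$, the least common multiple of the leading monomials $x^2$ and $y^5$ is $x^2y^5$, so

$$S(x^2 - y^3,\ -y^5 - y^3) = y^5(x^2 - y^3) + x^2(-y^5 - y^3) = -x^2y^3 - y^8,$$

and this reduces to zero using two elements of the current $G$:

$$-x^2y^3 - y^8 = (-y^3)(x^2 - y^3) + y^3(-y^5 - y^3) + 0.$$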
