Pushing data into CP models using Graphical Model Learning & Solving


  1. Pushing data into CP models using Graphical Model Learning & Solving. CP 2020, CP and ML track, September 2020. Céline Brouard¹, S. de Givry² & T. Schiex². ¹Université Fédérale de Toulouse, INRAE MIAT, UR 875, Toulouse, France. ²Université Fédérale de Toulouse, ANITI, INRAE MIAT, UR 875, Toulouse, France.

  2. Learning a Cost Function Network from high-quality solutions

  3. Please, stay with us and… You’ll learn:
     - how we use graphical models to connect CP with probabilistic Machine Learning
     - how the NP-hard regularization loop can be made practical
     - how we learn to play Sudoku from images (without rules)
     - how it compares with DL architectures that “learn to reason”
     - how we can combine learned user preferences with (car) configuration constraints

  4. Graphical Models. What is it? A description of a multivariate function as the combination of small functions.
     Cost Function Network (Weighted Constraint Satisfaction Problem). A CFN M has:
     - a set V of n variables; each variable $X \in V$ has a domain $D_X$ of maximum size d
     - a set C of cost functions; each $c_S \in C : \prod_{X \in S} D_X \to \overline{\mathbb{Z}}$ (unbounded integer costs, including $\infty$)
     Joint cost function: $C_M(v) = \sum_{c_S \in C} c_S(v[S])$
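To make the joint cost function concrete, here is a minimal Python sketch (illustrative variable names and cost values, not code from the talk) that stores a tiny pairwise CFN as explicit cost tables and evaluates $C_M(v)$ as the sum of the local costs:

    import math

    # A tiny pairwise CFN: two variables with 2-value domains and explicit cost tables.
    # Domains, unary and binary tables are made up for illustration.
    domains = {"X": [0, 1], "Y": [0, 1]}
    unary = {
        "X": {0: 0, 1: 2},
        "Y": {0: 1, 1: 0},
    }
    binary = {
        ("X", "Y"): {(0, 0): 0, (0, 1): 3, (1, 0): math.inf, (1, 1): 1},
    }

    def joint_cost(assignment):
        """C_M(v): sum of every cost function evaluated on its projection of v."""
        cost = 0.0
        for var, table in unary.items():
            cost += table[assignment[var]]
        for (x, y), table in binary.items():
            cost += table[(assignment[x], assignment[y])]
        return cost

    print(joint_cost({"X": 0, "Y": 1}))  # 0 + 0 + 3 = 3
    print(joint_cost({"X": 1, "Y": 0}))  # the infinite cost encodes a hard constraint: inf

An infinite table entry plays the role of a hard constraint, which is how the CFN view subsumes classical CP models.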

  5. What do we want to learn? Definition (Learning a pairwise CFN from high-quality solutions). Given a set of variables V and a set of assignments E sampled i.i.d. from an unknown distribution of high-quality solutions, find a pairwise CFN M that can be solved to produce high-quality solutions. A pairwise CFN is described by cost tables: n(n−1)/2 tables of d² binary costs plus n tables of d unary costs (a constant table can be ignored).
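As a sanity check on the size of this hypothesis space, the short Python sketch below (my own helper, not from the talk) counts the costs to learn for n variables with domains of size at most d:

    def pairwise_cfn_parameters(n, d):
        """Costs to learn in a pairwise CFN: n(n-1)/2 binary tables plus n unary tables."""
        binary_tables = n * (n - 1) // 2   # one d*d cost table per unordered pair of variables
        unary_tables = n                   # one table of d costs per variable
        return binary_tables * d * d + unary_tables * d

    # Example: a 9x9 Sudoku grid seen as 81 variables with 9 possible values each.
    print(pairwise_cfn_parameters(81, 9))  # 3240 * 81 + 81 * 9 = 263169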

  6. Stochastic Graphical Models. A Markov Random Field M has:
     - a set V of domain variables
     - a set Φ of potential functions; each $\varphi_S \in \Phi : \prod_{X \in S} D_X \to \mathbb{R}^+$
     Joint function and probability distribution: $\Phi_M(v) = \prod_{\varphi_S \in \Phi} \varphi_S(v[S])$ and $P_M(v) \propto \Phi_M(v)$.
     From products to sums and back (up to some precision): applying $-\log(x)$ to the potentials of an MRF M gives a CFN $M_\ell$, and applying $\exp(-x)$ to the cost functions of $M_\ell$ gives back the MRF M.
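This product-to-sum correspondence is a one-line transformation on the tables; a minimal Python sketch (illustrative table representation) is:

    import math

    def mrf_to_cfn(potential_table):
        """Apply -log entrywise: multiplicative potentials in R+ become additive costs."""
        return {t: (math.inf if p == 0.0 else -math.log(p)) for t, p in potential_table.items()}

    def cfn_to_mrf(cost_table):
        """Apply exp(-x) entrywise: additive costs become multiplicative potentials."""
        return {t: math.exp(-c) for t, c in cost_table.items()}

    phi = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.0, (1, 1): 2.0}
    costs = mrf_to_cfn(phi)   # a zero potential becomes an infinite (hard) cost
    back = cfn_to_mrf(costs)  # recovers phi up to floating-point precision

The "up to some precision" caveat on the slide reflects this round trip: with finite-precision or integer costs the conversion is only approximate.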

  7. Maximum log-likelihood for CFN learning. Maximum likelihood estimation from the i.i.d. sample E. Likelihood of M: the probability of E under M. Maximum likelihood M: an MRF M that gives maximum probability to E. Maximum log-likelihood, written on $M_\ell$:

     $$L(M, E) = \log \prod_{v \in E} P_M(v) = \sum_{v \in E} \log P_M(v) = \sum_{v \in E} \big[ \log \Phi_M(v) - \log Z_M \big] = \sum_{v \in E} \Big[ \underbrace{-\,C_{M_\ell}(v)}_{-\text{costs of } E \text{ samples}} \;-\; \underbrace{\log \!\!\sum_{t \in \prod_{X \in V} D_X}\!\! \exp(-C_{M_\ell}(t))}_{\text{Soft-Min of all assignment costs}} \Big]$$
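For very small models this quantity can be checked by brute force; the sketch below (exponential enumeration of all assignments, illustration only) evaluates $L(M,E)$ from a cost function and the domains, using a log-sum-exp for the Soft-Min term:

    import itertools
    import math

    def log_likelihood(cost, domains, samples):
        """L(M,E) = sum_v [-C(v)] - |E| * log( sum_t exp(-C(t)) ), by full enumeration."""
        variables = list(domains)
        neg_costs = []
        for values in itertools.product(*(domains[x] for x in variables)):
            neg_costs.append(-cost(dict(zip(variables, values))))
        m = max(neg_costs)
        log_z = m + math.log(sum(math.exp(c - m) for c in neg_costs))  # log partition function
        data_term = sum(-cost(v) for v in samples)
        return data_term - len(samples) * log_z

    # Toy usage: one Boolean variable with unary costs 0 and 2 (made-up numbers).
    toy_domains = {"X": [0, 1]}
    toy_cost = lambda v: {0: 0.0, 1: 2.0}[v["X"]]
    print(log_likelihood(toy_cost, toy_domains, [{"X": 0}, {"X": 0}]))

The exponential sum over $\prod_{X \in V} D_X$ is exactly the $Z_M$ term that makes exact maximum-likelihood estimation intractable and motivates the approximation on the next slide.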

  8. Regularized approximate max-log-likelihood estimation. Regularized log-likelihood estimation avoids over-fitting by pushing non-essential costs to zero, which means it learns the scopes: the log-likelihood is penalized proportionally to the L1 norm of the learned costs, with weight λ. PE-MRF¹ is an ADMM-optimized convex approximation of the regularized log-likelihood; it avoids #P-completeness by using a concave approximation of $Z_M$, is statistically sparsistent, and provides a CFN as output.
     ¹ Youngsuk Park et al. “Learning the network structure of heterogeneous data via pairwise exponential Markov random fields”. In: Proceedings of Machine Learning Research 54 (2017), p. 1302.
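Spelled out in the notation of the previous slides (my formulation of the penalty described above, not a formula shown in the deck), the estimation problem is roughly

    $$\max_{M_\ell}\;\; \underbrace{\sum_{v \in E} \Big( -C_{M_\ell}(v) - \log Z_M \Big)}_{\text{log-likelihood } L(M,E)} \;-\; \lambda \sum_{c_S \in C} \lVert c_S \rVert_1$$

where the intractable $\log Z_M$ term is the part PE-MRF replaces by its concave approximation, and a larger λ drives more cost tables to zero (fewer learned scopes).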

  9. Using empirical risk minimization. Selecting a suitable value of λ: for each sample v in the validation set, assign a fraction of v and solve with a WCSP solver; prefer the λ that gives solutions close to v. The NP-hard optimization effort is controlled through PyToulbar2: bounded optimization effort (backtracks, time, or gap; here 50,000 backtracks) and a controllable fraction of v assigned. Empirical hardening: set positive costs that are never violated in the training/validation sets to ∞.
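A rough Python sketch of this validation loop (the learn_cfn and solve_wcsp helpers are hypothetical stand-ins for the PE-MRF learner and a bounded-effort WCSP solver such as Toulbar2; the candidate λ values and the 50% assigned fraction are placeholders):

    import random

    def select_lambda(candidate_lambdas, train_set, validation_set, learn_cfn, solve_wcsp,
                      assigned_fraction=0.5, backtrack_limit=50_000):
        """Pick the lambda whose learned CFN best completes partially assigned validation samples."""
        best_lambda, best_score = None, -1.0
        for lam in candidate_lambdas:
            cfn = learn_cfn(train_set, lam)  # regularized estimation (stand-in for PE-MRF)
            agreements = []
            for v in validation_set:
                variables = list(v)
                fixed = set(random.sample(variables, int(assigned_fraction * len(variables))))
                # Solve with part of v assigned, under a bounded optimization effort.
                solution = solve_wcsp(cfn, evidence={x: v[x] for x in fixed},
                                      backtracks=backtrack_limit)
                free = [x for x in variables if x not in fixed]
                agreements.append(sum(solution[x] == v[x] for x in free) / max(len(free), 1))
            score = sum(agreements) / len(agreements)
            if score > best_score:
                best_lambda, best_score = lam, score
        return best_lambda

The hardening step would then scan the learned tables once more and replace every strictly positive cost whose tuple never appears in the training or validation samples with an infinite cost, turning learned preferences that are never contradicted by the data into hard constraints.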
