learning to branch

Learning to Branch Ellen Vitercik Joint work with Nina Balcan, - PowerPoint PPT Presentation


  1. Learning to Branch. Ellen Vitercik. Joint work with Nina Balcan, Travis Dick, and Tuomas Sandholm. Published in ICML 2018.

  2. Integer Programs (IPs): maximize $\mathbf{d} \cdot \mathbf{y}$ subject to $B\mathbf{y} \le \mathbf{c}$, $\mathbf{y} \in \{0,1\}^o$, where $o$ is the number of variables.

  3. Facility location problems can be formulated as IPs.

  4. Clustering problems can be formulated as IPs.

  5. Binary classification problems can be formulated as IPs.

  6. Integer Programs (IPs): maximize $\mathbf{d} \cdot \mathbf{y}$ subject to $B\mathbf{y} \le \mathbf{c}$, $\mathbf{y} \in \{0,1\}^o$. Solving IPs is NP-hard.

  7. Branch and Bound (B&B)
     • Most widely used algorithm for IP solving (CPLEX, Gurobi)
     • Recursively partitions the search space to find an optimal solution
     • Organizes the partition as a tree
     • Many parameters: CPLEX has a 221-page manual describing 135 parameters. "You may need to experiment."

  8. Why is tuning B&B parameters important?
     • Save time
     • Solve more problems
     • Find better solutions

  9. B&B in the real world: a delivery company routes trucks daily and uses integer programming to select routes. Demand changes every day, so the company solves hundreds of similar optimizations. Using this set of typical problems, can we learn the best parameters?

  10. Model: an application-specific distribution generates sample IPs $(B^{(1)}, \mathbf{c}^{(1)}, \mathbf{d}^{(1)}), \dots, (B^{(n)}, \mathbf{c}^{(n)}, \mathbf{d}^{(n)})$, and the algorithm designer uses them to choose B&B parameters. How to use samples to find the best B&B parameters for my domain?

  11. Model (same diagram as slide 10): this model has been studied in applied communities [Hutter et al. '09].

  12. Model (same diagram as slide 10): this model has been studied from a theoretical perspective [Gupta and Roughgarden '16, Balcan et al. '17].

  13. Model
      1. Fix a set of B&B parameters to optimize.
      2. Receive sample problems $(B^{(1)}, \mathbf{c}^{(1)}, \mathbf{d}^{(1)})$, $(B^{(2)}, \mathbf{c}^{(2)}, \mathbf{d}^{(2)})$, ... from an unknown distribution.
      3. Find parameters with the best performance on the samples. "Best" could mean smallest search tree, for example.

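  14. Questions to address
      • How to find parameters that are best on average over the samples $(B^{(1)}, \mathbf{c}^{(1)}, \mathbf{d}^{(1)})$, $(B^{(2)}, \mathbf{c}^{(2)}, \mathbf{d}^{(2)})$, ...?
      • Will those parameters have high performance in expectation, that is, on a new problem $(B, \mathbf{c}, \mathbf{d})$?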
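The first question on slide 14 amounts to picking, from a set of candidate parameters, the one with the lowest average cost on the samples. The sketch below shows that selection step only. The names `best_parameter` and `toy_cost`, the numeric "instances", and the candidate grid are all hypothetical stand-ins; a real cost would come from running B&B and measuring, for example, the size of the search tree.

```python
from statistics import mean


def best_parameter(samples, candidates, cost):
    """Return the candidate with the smallest average cost over the samples."""
    return min(candidates, key=lambda p: mean(cost(s, p) for s in samples))


def toy_cost(instance, p):
    # Hypothetical stand-in for "size of the B&B tree built with parameter p
    # on this instance". It is NOT a real solver measurement.
    return (instance - p) ** 2 + 1


samples = [0.2, 0.4, 0.6]                  # hypothetical stand-ins for sampled IPs
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]   # hypothetical parameter grid
chosen = best_parameter(samples, candidates, toy_cost)
print(chosen)
```

This only covers a finite candidate grid. Whether the chosen parameter also performs well in expectation is the second question on the slide, and the sketch says nothing about it.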

  15. Outline
      1. Introduction
      2. Branch-and-Bound
      3. Learning algorithms
      4. Experiments
      5. Conclusion and Future Directions

  16. Example IP: maximize $(40, 60, 10, 10, 3, 20, 60) \cdot \mathbf{y}$ subject to $(40, 50, 30, 10, 10, 40, 30) \cdot \mathbf{y} \le 100$, $\mathbf{y} \in \{0,1\}^7$.
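Because the example IP has only seven binary variables, it can be checked by enumerating all $2^7 = 128$ assignments. The sketch below does exactly that. The function name `brute_force` is an illustrative choice, and this enumeration is not how B&B works; it is a baseline that stops being practical as the number of variables grows.

```python
from itertools import product

values = [40, 60, 10, 10, 3, 20, 60]      # objective coefficients from slide 16
weights = [40, 50, 30, 10, 10, 40, 30]    # constraint coefficients from slide 16
capacity = 100


def brute_force(values, weights, capacity):
    """Try every 0/1 assignment and keep the best feasible one."""
    best_value, best_y = None, None
    for y in product((0, 1), repeat=len(values)):
        if sum(w * yi for w, yi in zip(weights, y)) <= capacity:
            value = sum(v * yi for v, yi in zip(values, y))
            if best_value is None or value > best_value:
                best_value, best_y = value, y
    return best_value, best_y


best_value, best_y = brute_force(values, weights, capacity)
print(best_value, best_y)
```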

  17. LP relaxation of the example IP: the optimal solution is $(\tfrac{1}{2}, 1, 0, 0, 0, 0, 1)$ with objective value 140.

  18. B&B on the example, step 1: choose a leaf of the tree. The tree so far is only the root, with LP solution $(\tfrac{1}{2}, 1, 0, 0, 0, 0, 1)$ and value 140.
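The root value of 140 can be reproduced without an LP solver, because the example has a single constraint with positive coefficients. In that special case the LP relaxation is a fractional knapsack, which a greedy fill by value-to-weight ratio solves. The sketch below relies on that assumption and would not work for a general IP. The name `lp_relaxation` and the `fixed` argument (variables already set by branching, 0-based indices) are illustrative.

```python
from fractions import Fraction

values = [40, 60, 10, 10, 3, 20, 60]
weights = [40, 50, 30, 10, 10, 40, 30]
capacity = 100


def lp_relaxation(values, weights, capacity, fixed):
    """LP relaxation of a single-constraint 0/1 problem with positive data.

    `fixed` maps a 0-based variable index to 0 or 1. Returns (value, solution),
    or None if the fixed variables already violate the constraint.
    """
    cap = Fraction(capacity) - sum(weights[j] for j, bit in fixed.items() if bit == 1)
    if cap < 0:
        return None
    solution = [Fraction(fixed.get(j, 0)) for j in range(len(values))]
    # Fill free variables in order of decreasing value-to-weight ratio.
    free = sorted((j for j in range(len(values)) if j not in fixed),
                  key=lambda j: Fraction(values[j], weights[j]), reverse=True)
    for j in free:
        take = min(Fraction(1), cap / weights[j])
        solution[j] = take
        cap -= take * weights[j]
        if cap == 0:
            break
    value = sum(v * s for v, s in zip(values, solution))
    return value, solution


root_value, root_solution = lp_relaxation(values, weights, capacity, {})
print(root_value, root_solution)
```

When several variables share the same ratio, an LP can have more than one optimal solution. The stable sort here happens to pick the one shown on the slide.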

  19. B&B on the example, step 2: branch on a variable. Branching on $y_1$ at the root (value 140) creates two children:
      • $y_1 = 0$: LP solution $(0, 1, 0, 1, 0, \tfrac{1}{4}, 1)$, value 135
      • $y_1 = 1$: LP solution $(1, \tfrac{3}{5}, 0, 0, 0, 0, 1)$, value 136

  20. B&B on the example: tree unchanged from slide 19.

  21. B&B on the example: branching on $y_2$ at the leaf $y_1 = 1$ (value 136) creates two children:
      • $y_2 = 0$: LP solution $(1, 0, 0, 1, 0, \tfrac{1}{2}, 1)$, value 120
      • $y_2 = 1$: LP solution $(1, 1, 0, 0, 0, 0, \tfrac{1}{3})$, value 120

  22. B&B on the example: tree unchanged from slide 21.

  23. B&B on the example: branching on $y_6$ at the leaf $y_1 = 0$ (value 135) creates two children:
      • $y_6 = 0$: LP solution $(0, 1, \tfrac{1}{3}, 1, 0, 0, 1)$, value 133.3
      • $y_6 = 1$: LP solution $(0, \tfrac{3}{5}, 0, 0, 0, 1, 1)$, value 116

  24. B&B on the example: tree unchanged from slide 23.

  25. B&B on the example: branching on $y_3$ at the leaf $y_1 = 0$, $y_6 = 0$ (value 133.3) creates two children:
      • $y_3 = 0$: LP solution $(0, 1, 0, 1, 1, 0, 1)$, value 133
      • $y_3 = 1$: LP solution $(0, \tfrac{4}{5}, 1, 0, 0, 0, 1)$, value 118

  26. B&B on the example, step 3: fathom a leaf if
      i. its LP relaxation solution is integral,
      ii. its LP relaxation is infeasible, or
      iii. its LP relaxation solution isn't better than the best-known integral solution.
      The tree is unchanged from slide 25.

  27. B&B on the example: the leaf $y_1 = 0$, $y_6 = 0$, $y_3 = 0$, with LP solution $(0, 1, 0, 1, 1, 0, 1)$ and value 133, is labeled "Integral".

  28. B&B on the example: same steps and tree as slide 26.
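The three steps (choose a leaf, branch on a variable, fathom) can be put together in a short sketch that replays the example. Several choices here are assumptions rather than anything the slides specify: the leaf with the highest LP value is expanded first, the first fractional variable is the branching variable, and the LP relaxation uses the greedy fractional-knapsack shortcut, which is valid only because the example has one constraint with positive coefficients. All function names are illustrative.

```python
import heapq
from fractions import Fraction
from itertools import count

VALUES = [40, 60, 10, 10, 3, 20, 60]
WEIGHTS = [40, 50, 30, 10, 10, 40, 30]
CAPACITY = 100


def lp_relaxation(values, weights, capacity, fixed):
    """Greedy LP relaxation for one constraint with positive data (assumption)."""
    cap = Fraction(capacity) - sum(weights[j] for j, bit in fixed.items() if bit == 1)
    if cap < 0:
        return None  # infeasible
    solution = [Fraction(fixed.get(j, 0)) for j in range(len(values))]
    free = sorted((j for j in range(len(values)) if j not in fixed),
                  key=lambda j: Fraction(values[j], weights[j]), reverse=True)
    for j in free:
        take = min(Fraction(1), cap / weights[j])
        solution[j] = take
        cap -= take * weights[j]
        if cap == 0:
            break
    return sum(v * s for v, s in zip(values, solution)), solution


def branch_and_bound(values, weights, capacity):
    """Best-first B&B. Returns (best value, best solution, nodes created)."""
    best_value, best_y = None, None
    tie = count()  # tie-breaker so the heap never compares dicts
    root_value, root_solution = lp_relaxation(values, weights, capacity, {})
    heap = [(-root_value, next(tie), {}, root_solution)]
    nodes = 1
    while heap:
        neg_bound, _, fixed, solution = heapq.heappop(heap)  # step 1: choose leaf
        bound = -neg_bound
        if best_value is not None and bound <= best_value:
            continue  # fathom (iii): not better than the best-known solution
        fractional = [j for j, s in enumerate(solution) if s.denominator != 1]
        if not fractional:
            best_value, best_y = bound, [int(s) for s in solution]
            continue  # fathom (i): LP solution is integral
        j = fractional[0]  # step 2: branch (first fractional variable)
        for bit in (0, 1):
            child_fixed = {**fixed, j: bit}
            child = lp_relaxation(values, weights, capacity, child_fixed)
            nodes += 1
            if child is None:
                continue  # fathom (ii): LP relaxation is infeasible
            heapq.heappush(heap, (-child[0], next(tie), child_fixed, child[1]))
    return best_value, best_y, nodes


best_value, best_y, nodes = branch_and_bound(VALUES, WEIGHTS, CAPACITY)
print(best_value, best_y, nodes)
```

On this instance every LP solution has at most one fractional variable, so the branching choice is forced and the run creates the same nine nodes as the tree on the slides. Instances with several constraints typically offer several fractional candidates, and that is where the choice of variable matters.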

  29. B&B
      1. Choose leaf of tree
      2. Branch on a variable
      3. Fathom leaf if:
         i. LP relaxation solution is integral
         ii. LP relaxation is infeasible
         iii. LP relaxation solution isn't better than the best-known integral solution
      This talk: how to choose which variable to branch on? (Assume every other aspect of B&B is fixed.)

  30. Variable selection policies can have a huge effect on tree size.

  31. Outline
      1. Introduction
      2. Branch-and-Bound
         a. Algorithm Overview
         b. Variable Selection Policies
      3. Learning algorithms
      4. Experiments
      5. Conclusion and Future Directions

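  32. Variable selection policies (VSPs). Score-based VSP: at leaf $R$, branch on the variable $y_j$ maximizing $\mathrm{score}(R, j)$. Example: at the leaf with LP solution $(1, \tfrac{3}{5}, 0, 0, 0, 0, 1)$ and value 136, branching on $y_2$ gives the child $y_2 = 0$ with LP solution $(1, 0, 0, 1, 0, \tfrac{1}{2}, 1)$, value 120, and the child $y_2 = 1$ with LP solution $(1, 1, 0, 0, 0, 0, \tfrac{1}{3})$, value 120. Many options! Little is known about which to use when.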
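A score-based policy is an argmax over candidate variables. The sketch below shows that structure with one well-known scoring rule, "most fractional" branching, which prefers the variable whose LP value is closest to 1/2. That rule is used purely as an illustration and is not claimed to be one of the rules studied in the talk. Restricting candidates to fractional variables, the 0-based indices, and the names `select_variable` and `most_fractional` are also assumptions, and the leaf is represented only by its LP solution.

```python
from fractions import Fraction


def select_variable(leaf_solution, score):
    """Return the index j of the fractional variable maximizing score(leaf, j)."""
    candidates = [j for j, yj in enumerate(leaf_solution) if yj.denominator != 1]
    if not candidates:
        return None  # LP solution is integral: nothing to branch on
    return max(candidates, key=lambda j: score(leaf_solution, j))


def most_fractional(leaf_solution, j):
    # Illustrative score: largest when y_j is exactly 1/2.
    return Fraction(1, 2) - abs(leaf_solution[j] - Fraction(1, 2))


# Hypothetical LP solution with three fractional coordinates.
leaf = [Fraction(1, 4), Fraction(1), Fraction(1, 2), Fraction(0), Fraction(9, 10)]
chosen = select_variable(leaf, most_fractional)
print(chosen)
```

Swapping in a different `score` function changes the policy without touching the selection loop, which is what makes the score a natural thing to tune.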

  33. Variable selection policies. For an IP instance $R$:
      • Let $d_R$ be the objective value of its LP relaxation.
      • Let $R_j^-$ be $R$ with $y_j$ set to 0, and let $R_j^+$ be $R$ with $y_j$ set to 1.
      Example: $R$ is the IP maximize $(40, 60, 10, 10, 3, 20, 60) \cdot \mathbf{y}$ subject to $(40, 50, 30, 10, 10, 40, 30) \cdot \mathbf{y} \le 100$, $\mathbf{y} \in \{0,1\}^7$. Its LP solution is $(\tfrac{1}{2}, 1, 0, 0, 0, 0, 1)$, so $d_R = 140$.

  34. Variable selection policies (same definitions and example as slide 33, extended with the children for $j = 1$):
      • $R_1^-$ ($y_1 = 0$): LP solution $(0, 1, 0, 1, 0, \tfrac{1}{4}, 1)$, so $d_{R_1^-} = 135$.
      • $R_1^+$ ($y_1 = 1$): LP solution $(1, \tfrac{3}{5}, 0, 0, 0, 0, 1)$, so $d_{R_1^+} = 136$.
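As a small worked step using only the numbers in the example, the parent value can be compared with each child value. Reading these differences as how much each branch tightens the LP bound is an interpretation added here for illustration; the slides above define the three quantities but not the differences.

```latex
d_R - d_{R_1^-} = 140 - 135 = 5, \qquad d_R - d_{R_1^+} = 140 - 136 = 4
```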
