Geometrically Coupled Monte Carlo Sampling

  1. Geometrically Coupled Monte Carlo Sampling. Mark Rowland, Krzysztof Choromanski, François Chalus, Aldo Pacchiano, Tamas Sarlos, Richard E. Turner, Adrian Weller

  2. Geometrically Coupled Monte Carlo Sampling. Central goal: unbiased Monte Carlo estimation. Can we do better than i.i.d.? Key contribution: K-optimality. Optimise the objective (formula not preserved in this transcript) over the joint distribution of the samples. This leads to a multi-marginal transport problem, which is often analytically solvable. INNOVATION + ASSISTANCE
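
Why a coupling can beat i.i.d. sampling follows from a standard variance decomposition. The notation below is assumed for illustration and is not taken from the slide: $f$ is the integrand and $X_1, \dots, X_N$ are samples that each have the target marginal distribution.

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} f(X_i)\right)
  = \frac{1}{N}\operatorname{Var}\bigl(f(X_1)\bigr)
  + \frac{1}{N^{2}}\sum_{i \neq j}\operatorname{Cov}\bigl(f(X_i), f(X_j)\bigr)
```

The estimator stays unbiased for any coupling because the marginals are fixed. Independent samples make the second term zero, so a coupling that makes the covariances negative on balance has lower variance than i.i.d. sampling.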

  3. GCMC in Robotics - Policy Search - An Overview: isotropic distribution

  7. GCMC in Robotics - Policy Search - An Overview: antithetic pair, isotropic distribution

  8. GCMC in Robotics - Policy Search - An Overview: antithetic pair, isotropic distribution. Typical approach to Monte Carlo sampling: ● independent antithetic pairs ● coupled samples of equal lengths

  10. GCMC in Robotics - Policy Search - An Overview: antithetic pair, isotropic distribution. Typical approach to Monte Carlo sampling: ● independent antithetic pairs ● coupled samples of equal lengths. GCMC: ● orthogonal directions of different antithetic pairs ● correlated unequal lengths within a pair ● variance reduction
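
The "orthogonal directions of different antithetic pairs" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `orthogonal_antithetic_samples` is hypothetical, the isotropic distribution is taken to be a standard Gaussian, and lengths are drawn independently per pair (the correlated-length coupling is not included here).

```python
import numpy as np

def orthogonal_antithetic_samples(n_pairs, dim, rng):
    """Return 2 * n_pairs perturbations of shape (2 * n_pairs, dim).

    Rows 0..n_pairs-1 point in mutually orthogonal directions; rows
    n_pairs..2*n_pairs-1 are their negations (the antithetic partners).
    Each row is marginally a standard Gaussian vector.
    """
    if n_pairs > dim:
        raise ValueError("at most `dim` mutually orthogonal directions exist")
    # QR of a Gaussian matrix gives a random orthogonal matrix; the sign
    # correction makes its distribution uniform over rotations.
    g = rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(g)
    q = q * np.sign(np.diag(r))
    directions = q[:, :n_pairs].T  # rows are orthonormal
    # Norm of a dim-dimensional standard Gaussian: chi distribution.
    lengths = np.sqrt(rng.chisquare(dim, size=n_pairs))
    eps = directions * lengths[:, None]
    return np.concatenate([eps, -eps], axis=0)

samples = orthogonal_antithetic_samples(4, 6, np.random.default_rng(0))
```

Because a uniformly random direction scaled by an independent chi-distributed length is a standard Gaussian vector, each sample keeps the intended marginal and only the joint distribution changes.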

  11. GCMC for Policy Search - Details

  12. GCMC for Policy Search

  14. GCMC for Policy Search. Towards smooth relaxations: Gaussian smoothings

  15. GCMC for Policy Search. Towards smooth relaxations: Gaussian smoothing gradient
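
For reference, the standard definition of a Gaussian smoothing and its gradient can be written as below. The symbols are assumptions for illustration, since the slide's formulas are not preserved in the transcript: $F$ is the objective, $\theta \in \mathbb{R}^d$ the policy parameters, and $\sigma > 0$ the smoothing scale.

```latex
F_\sigma(\theta) = \mathbb{E}_{\varepsilon \sim \mathcal{N}(0, I_d)}\bigl[F(\theta + \sigma\varepsilon)\bigr],
\qquad
\nabla F_\sigma(\theta)
  = \frac{1}{\sigma}\,\mathbb{E}_{\varepsilon}\bigl[F(\theta + \sigma\varepsilon)\,\varepsilon\bigr]
  = \frac{1}{2\sigma}\,\mathbb{E}_{\varepsilon}\bigl[\bigl(F(\theta + \sigma\varepsilon) - F(\theta - \sigma\varepsilon)\bigr)\,\varepsilon\bigr]
```

The last equality uses the symmetry of the Gaussian: $\varepsilon$ and $-\varepsilon$ have the same distribution. It is the form that antithetic-pair estimators approximate by sampling.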

  17. Coupled antithetic pairs for Monte Carlo gradient estimation. Baseline gradient estimator with antithetic pairs (Salimans et al. 2017)
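
A baseline of this kind is commonly written as an average of finite differences along antithetic Gaussian perturbations. The sketch below follows that common form; the slide's exact formula is not preserved here, so the function name `antithetic_es_gradient`, the scaling, and the toy linear objective are assumptions for illustration.

```python
import numpy as np

def antithetic_es_gradient(F, theta, sigma, eps):
    """Estimate the gradient of the Gaussian smoothing of F at theta.

    eps has shape (N, dim); each row eps[i] is used as the antithetic
    pair (+eps[i], -eps[i]).
    """
    f_plus = np.array([F(theta + sigma * e) for e in eps])
    f_minus = np.array([F(theta - sigma * e) for e in eps])
    # Average of (F(theta + s*e) - F(theta - s*e)) / (2*s) * e over rows.
    weighted = (f_plus - f_minus)[:, None] * eps
    return weighted.sum(axis=0) / (2.0 * sigma * len(eps))

# Toy check on a linear objective, whose true gradient is `a`.
a = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
eps = rng.standard_normal((20000, 3))
grad = antithetic_es_gradient(lambda th: float(a @ th), np.zeros(3), 0.1, eps)
```

With independent Gaussian rows in `eps` the estimate is unbiased for the smoothed gradient; the perturbations can be swapped for coupled samples with the same marginals without changing the function.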

  19. Coupled antithetic pairs for Monte Carlo gradient estimation. Baseline gradient estimator with antithetic pairs (Salimans et al. 2017). Antithetic inverse lengths coupling estimator (Rowland, Choromanski et al. 2018)

  20. Coupled antithetic pairs for Monte Carlo gradient estimation. Baseline gradient estimator with antithetic pairs (Salimans et al. 2017). Antithetic inverse lengths coupling estimator (Rowland, Choromanski et al. 2018). Formula annotations: coupled lengths (marked twice)

  21. Coupled antithetic pairs for Monte Carlo gradient estimation. Baseline gradient estimator with antithetic pairs (Salimans et al. 2017). Antithetic inverse lengths coupling estimator (Rowland, Choromanski et al. 2018). Formula annotation: coupled lengths
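
One way to realise "correlated unequal lengths within a pair" is to give the partner the opposite direction and the complementary quantile of the length distribution. The sketch below is an interpretation under assumptions, not the slide's formula (which is not preserved): the function name `inverse_length_antithetic_pairs` is hypothetical, the marginal is taken to be a standard Gaussian, and SciPy's chi distribution is used for the norm.

```python
import numpy as np
from scipy.stats import chi

def inverse_length_antithetic_pairs(n_pairs, dim, rng):
    """Sample pairs (x, y): y points opposite to x, with coupled length.

    If the length of x sits at quantile u of the chi(dim) distribution,
    the length of y sits at quantile 1 - u, so a long x is paired with
    a short y and vice versa.
    """
    x = rng.standard_normal((n_pairs, dim))
    r = np.linalg.norm(x, axis=1)
    # isf(q) is the inverse CDF evaluated at 1 - q.
    r_partner = chi.isf(chi.cdf(r, df=dim), df=dim)
    y = -x * (r_partner / r)[:, None]
    return x, y

x, y = inverse_length_antithetic_pairs(1000, 5, np.random.default_rng(0))
```

Since the direction of `x` is uniform and independent of its length, and the complementary quantile of a chi-distributed length is again chi-distributed, `y` is also marginally a standard Gaussian vector.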

  22. Experimental results: Minitaur learning how to walk with antithetic coupled samples + linear policies. Panels: N=8, N=16, N=48, N=54, N=64, N=96

  23. Thank you!
