Geometrically Coupled Monte Carlo Sampling
Mark Rowland, Krzysztof Choromanski, François Chalus, Aldo Pacchiano, Tamas Sarlos, Richard E. Turner, Adrian Weller
Central goal: unbiased Monte Carlo estimation. Can we do better than i.i.d.?
Key contribution: K-optimality. Optimise the objective [formula not preserved] over the joint distribution of the samples. This leads to a multi-marginal transport problem, which is often analytically solvable.
INNOVATION + ASSISTANCE
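To make the question "can we do better than i.i.d.?" concrete, here is a minimal sketch comparing an i.i.d. estimator with an antithetic one for a one-dimensional Gaussian expectation. The integrand `f`, the sample size, and the function names are assumptions chosen for illustration; they do not come from the slides. Both estimators are unbiased because every sample is still marginally N(0, 1); only the joint distribution of the samples changes.

```python
import math
import random
import statistics


def f(x):
    # Hypothetical integrand for illustration: E[f(X)] = exp(0.5) for X ~ N(0, 1).
    return math.exp(x)


def iid_estimate(rng, n):
    # Average f over n independent standard normal draws.
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n)) / n


def antithetic_estimate(rng, n):
    # Average f over n // 2 antithetic pairs (x, -x).
    # Each sample is marginally N(0, 1), so the estimator stays unbiased.
    total = 0.0
    for _ in range(n // 2):
        x = rng.gauss(0.0, 1.0)
        total += f(x) + f(-x)
    return total / n


def compare(n=16, trials=5000, seed=0):
    # Repeat both estimators many times and summarise their mean and variance.
    rng = random.Random(seed)
    iid = [iid_estimate(rng, n) for _ in range(trials)]
    anti = [antithetic_estimate(rng, n) for _ in range(trials)]
    return {
        "iid_mean": statistics.fmean(iid),
        "anti_mean": statistics.fmean(anti),
        "iid_var": statistics.variance(iid),
        "anti_var": statistics.variance(anti),
    }
```

For this particular integrand the antithetic coupling lowers the variance at the same sample budget. That is not guaranteed in general: for an even integrand such as x², the two members of a pair give the same value, so the coupling hurts.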
GCMC in Robotics - Policy Search - An Overview
[Figure labels: antithetic pair; isotropic distribution]
Typical approach to Monte Carlo sampling:
● Independent antithetic pairs
● Coupled samples of equal lengths
GCMC:
● orthogonal directions of different antithetic pairs
● correlated unequal lengths within a pair
● variance reduction
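The "orthogonal directions of different antithetic pairs" idea can be sketched as follows. This is an illustrative construction under stated assumptions, not necessarily the exact sampler used by the authors: directions come from Gram-Schmidt applied to Gaussian vectors, each direction gets an independent chi(d)-distributed length so that every sample is marginally N(0, I), and each sample is returned with its mirror image. The function names are hypothetical. Lengths within a pair are equal here; correlated unequal lengths are a separate refinement.

```python
import math
import random


def gram_schmidt(vectors):
    # Orthonormalise linearly independent vectors, removing projections
    # onto the already accepted basis vectors one at a time.
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def orthogonal_antithetic_samples(rng, d, m):
    # Return m antithetic pairs (x, -x) in R^d whose directions are
    # mutually orthogonal across pairs. Requires m <= d.
    if m > d:
        raise ValueError("at most d mutually orthogonal directions exist in R^d")
    gaussians = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    directions = gram_schmidt(gaussians)
    pairs = []
    for u in directions:
        # The norm of a fresh d-dimensional Gaussian vector is chi(d)-distributed,
        # which keeps each sample marginally Gaussian.
        length = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d)))
        x = [length * ui for ui in u]
        pairs.append((x, [-xi for xi in x]))
    return pairs
```

For example, `orthogonal_antithetic_samples(random.Random(0), 5, 3)` returns three pairs in five dimensions; more than d pairs would need several independent orthogonal blocks.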
GCMC for Policy Search - Details
GCMC for Policy Search
Towards smooth relaxations: Gaussian smoothings, Gaussian smoothing gradient
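For reference, the standard definitions behind the labels "Gaussian smoothing" and "Gaussian smoothing gradient" are given below. The notation is assumed, since the slide formulas are not preserved: F is the objective (for policy search, the return as a function of the policy parameters), θ ∈ R^d are the parameters, and σ > 0 is the smoothing scale.

```latex
F_\sigma(\theta)
  = \mathbb{E}_{\varepsilon \sim \mathcal{N}(0, I_d)}
    \big[ F(\theta + \sigma \varepsilon) \big],
\qquad
\nabla F_\sigma(\theta)
  = \frac{1}{\sigma}\,
    \mathbb{E}_{\varepsilon \sim \mathcal{N}(0, I_d)}
    \big[ F(\theta + \sigma \varepsilon)\, \varepsilon \big].
```

The gradient is itself an expectation over Gaussian perturbations, so it can be estimated by Monte Carlo sampling without differentiating F. This is where the choice of coupling between samples enters.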
Coupled antithetic pairs for Monte Carlo gradient estimation
Baseline gradient estimator with antithetic pairs (Salimans et al. 2017): [formula not preserved]
Antithetic inverse lengths coupling estimator (Rowland, Choromanski et al. 2018): [formula not preserved; annotated "coupled lengths"]
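Since the slide formulas are lost, the following is a hedged sketch of what the two estimators might look like, not a reproduction of the authors' expressions. The baseline uses the usual antithetic difference (F(θ + σε) − F(θ − σε)) ε / (2σ). For the coupled version, the sketch assumes the partner sample has the opposite direction and length F_R⁻¹(1 − F_R(‖ε‖)), where F_R is the CDF of the Gaussian norm. To stay within the standard library it is restricted to d = 2, where F_R(r) = 1 − exp(−r²/2) has a closed-form inverse. All function names are hypothetical.

```python
import math
import random


def sample_gaussian_2d(rng):
    # One draw from N(0, I) in two dimensions.
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def inverse_length_partner(eps):
    # Flip the direction and map the length r to F^{-1}(1 - F(r)), where
    # F(r) = 1 - exp(-r^2 / 2) is the chi(2) (Rayleigh) CDF. The partner is
    # again marginally N(0, I), but a long sample gets a short partner.
    r = math.hypot(eps[0], eps[1])
    u = min(math.exp(-0.5 * r * r), 1.0 - 2.0 ** -53)  # u = 1 - F(r), kept below 1
    r_new = math.sqrt(-2.0 * math.log1p(-u))
    scale = -r_new / r
    return (scale * eps[0], scale * eps[1])


def baseline_antithetic_gradient(F, theta, sigma, n_pairs, rng):
    # Antithetic-pair estimator of the Gaussian smoothing gradient:
    # average of (F(theta + s*eps) - F(theta - s*eps)) * eps / (2 * s).
    g = [0.0, 0.0]
    for _ in range(n_pairs):
        eps = sample_gaussian_2d(rng)
        plus = F((theta[0] + sigma * eps[0], theta[1] + sigma * eps[1]))
        minus = F((theta[0] - sigma * eps[0], theta[1] - sigma * eps[1]))
        for k in range(2):
            g[k] += (plus - minus) * eps[k]
    return [gk / (2.0 * n_pairs * sigma) for gk in g]


def inverse_length_gradient(F, theta, sigma, n_pairs, rng):
    # Assumed form: average of F(theta + s*e) * e / s over each sample and its
    # inverse-length partner. Unbiased because both are marginally N(0, I).
    g = [0.0, 0.0]
    for _ in range(n_pairs):
        eps = sample_gaussian_2d(rng)
        for e in (eps, inverse_length_partner(eps)):
            value = F((theta[0] + sigma * e[0], theta[1] + sigma * e[1]))
            for k in range(2):
                g[k] += value * e[k]
    return [gk / (2.0 * n_pairs * sigma) for gk in g]
```

Both functions take any callable `F` on 2-tuples and target the same smoothed gradient; whether the coupled lengths reduce variance depends on F, and this sketch makes no claim about the Minitaur setting.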
Experimental results: Minitaur Learning How to Walk with antithetic coupled samples + linear policies
[Figure panels: N=8, N=16, N=48, N=54, N=64, N=96]
Thank you!