PORTFOLIO METHODS IN UNCERTAIN CONTEXTS

Resampling methods: Partial conclusion

Conclusion:
- Adaptation of Newton's algorithm for noisy fitness (∇f and Hf approximated by finite differences + resamplings) → leads to fast convergence rates + recovers many rates in one algorithm + generic framework (but no proved application besides the quadratic surrogate model).
- Non-adaptive methods lead to log-log convergence (math + experiments); in ES, N_scale = ⌈d^{−2} exp(4n/(5d))⌉ is ok (slope(SR) = −1/2) for both ES and DE (nb: −1 possible with large mutation + small inheritance).
- In progress: adaptive resampling methods might be merged with bounds on resampling numbers ⇒ unclear benefit for the moment.

17 / 77
Outline

1 Motivation
2 Noisy Optimization
   - Optimization criteria for black-box noisy optimization
   - Optimization methods
   - Resampling methods
   - Pairing
3 Portfolio and noisy optimization
   - Portfolio: state of the art
   - Relationship between portfolio and noisy optimization
   - Portfolio of noisy optimization methods
   - Conclusion
4 Adversarial portfolio
   - Adversarial bandit (Adversarial Framework, State-of-the-art)
   - Contribution for computing Nash Equilibrium
      - Sparsity: sparse NE can be computed faster
      - Parameter-free adversarial bandit for large-scale problems
   - Application to robust optimization (power systems)
   - Application to games
   - Conclusion
5 Conclusion

18 / 77
Variance reduction techniques

Monte Carlo [Hammersley and Handscomb, 1964, Billingsley, 1986]:

   Ê_ω f(x, ω) = (1/n) Σ_{i=1}^{n} f(x, ω_i) → E_ω f(x, ω).   (7)

Quasi Monte Carlo [Cranley and Patterson, 1976, Niederreiter, 1992, Wang and Hickernell, 2000, Mascagni and Chi, 2004]: use samples aimed at being as uniform as possible over the domain.

19 / 77
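The plain Monte Carlo estimator in (7) can be sketched as follows (a minimal illustration; the quadratic fitness and the Gaussian noise model are assumptions made only for this demo):

```python
import random

def mc_estimate(f, x, n, rng):
    """Plain Monte Carlo: average n i.i.d. noisy evaluations of f at x."""
    return sum(f(x, rng.gauss(0.0, 1.0)) for _ in range(n)) / n

# Hypothetical noisy fitness: f(x, w) = x^2 + w, so E_w f(x, w) = x^2.
f = lambda x, w: x * x + w

rng = random.Random(42)
est = mc_estimate(f, 2.0, 100000, rng)
# est approaches 4.0 at the usual O(1/sqrt(n)) Monte Carlo rate
```

The error decays as O(1/√n), which is exactly why the variance reduction techniques of the next slides matter.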
Variance reduction techniques: white-box

Antithetic variates: ensure some regularity of the sampling by using symmetries:

   Ê_ω f(x, ω) = (1/n) Σ_{i=1}^{n/2} ( f(x, ω_i) + f(x, −ω_i) ).

Importance sampling: instead of sampling ω with density dP, we sample ω′ with density dP′:

   Ê_ω f(x, ω) = (1/n) Σ_{i=1}^{n} (dP(ω_i)/dP′(ω_i)) f(x, ω_i).

Control variates: instead of estimating E_ω f(x, ω) directly, we use

   E_ω f(x, ω) = E_ω ( f(x, ω) − g(x, ω) )  [A]  +  E_ω g(x, ω)  [B],

where A is estimated by Monte Carlo (low variance if g ≈ f) and B is known.

20 / 77
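Antithetic variates can be sketched in a few lines (the fitness below, linear in the noise, is an assumption chosen so that each pair (ω, −ω) cancels the noise exactly; for general f the cancellation is only partial):

```python
import random

def antithetic_estimate(f, x, n, rng):
    """Antithetic variates: pair each draw w with its mirror -w (n even)."""
    total = 0.0
    for _ in range(n // 2):
        w = rng.gauss(0.0, 1.0)
        total += f(x, w) + f(x, -w)
    return total / n

# Hypothetical noisy fitness, linear in the noise: f(x, w) = x^2 + w.
f = lambda x, w: x ** 2 + w

rng = random.Random(0)
est = antithetic_estimate(f, 3.0, 10, rng)
# each pair contributes (9 + w) + (9 - w) = 18, so est is 9.0 exactly
```

For noise entering linearly, the antithetic estimator has zero variance; in general it helps whenever f is close to linear in ω.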
Variance reduction techniques: grey-box

Common random numbers (CRN), or pairing: use the same samples ω_1, ..., ω_n for all the population x_{n,1}, ..., x_{n,λ}. With Seed_n = {seed_{n,1}, ..., seed_{n,m_n}}, E_ω f(x_{n,k}, ω) is then approximated as

   (1/m_n) Σ_{i=1}^{m_n} f(x_{n,k}, seed_{n,i}).

Different forms of pairing:
- Seed_n is the same for all n;
- m_n increases and the sets Seed_n are nested, i.e. ∀n, m_{n+1} ≥ m_n and ∀i ≤ m_n, seed_{n,i} = seed_{n+1,i};
- all individuals in an offspring use the same seeds, and seeds are 100% changed between offspring.

21 / 77
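A minimal sketch of why CRN helps (the quadratic noisy fitness is an assumption for the demo): with shared seeds, most of the noise cancels when comparing two candidates, so the paired difference has far lower variance than the unpaired one.

```python
import random

def noisy_f(x, seed):
    """Hypothetical noisy fitness: (x - w)^2 where w is drawn from the seed."""
    w = random.Random(seed).gauss(0.0, 1.0)
    return (x - w) ** 2

seeds = list(range(200))
x1, x2 = 0.1, 0.2

# Paired (CRN): the same seeds are reused for both candidates.
paired = [noisy_f(x1, s) - noisy_f(x2, s) for s in seeds]
# Unpaired: each candidate gets independent seeds.
unpaired = [noisy_f(x1, s) - noisy_f(x2, 1000 + s) for s in seeds]

def var(v):
    m = sum(v) / len(v)
    return sum((u - m) ** 2 for u in v) / len(v)

# var(paired) is roughly two orders of magnitude below var(unpaired)
```

This is exactly the "pairing" used for all the population within one generation.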
Pairing: Partial conclusion

No details, just our conclusion:
- "almost" black-box
- easy to implement
- applicable to most applications

On the realistic problem, pairing provided a great improvement, but there are counterexamples in which it is detrimental.

22 / 77
Portfolio of optimization algorithms

- Usually: Portfolio → Combinatorial Optimization (SAT competition)
- Recently: Portfolio → Continuous Optimization [Baudiš and Pošík, 2014]
- This work: Portfolio → Noisy Optimization

→ Portfolio = choosing, online, between several algorithms

24 / 77
Why portfolio in Noisy Optimization?

- Stochastic problem
- limited budget (time or total number of evaluations)
- target: anytime convergence to the optimum
- black-box²

How to choose a suitable solver?
Algorithm Portfolios: select automatically the best in a finite set of solvers.

² Image from http://ethanclements.blogspot.fr/2010/12/postmodernism-essay-question.html

26 / 77
Portfolio of noisy optimization methods: proposal

- A finite number of given noisy optimization solvers, "orthogonal"
- Unfair distribution of budget
- Information sharing (not very helpful here...)

→ Performs almost as well as the best solver

28 / 77
Portfolio of noisy optimization methods: NOPA

Algorithm 1 Noisy Optimization Portfolio Algorithm (NOPA).
 1: Input noisy optimization solvers Solver_1, Solver_2, ..., Solver_M
 2: Input a lag function LAG: N+ → N+
 3: Input a non-decreasing integer sequence r_1, r_2, ... ▷ periodic comparisons
 4: Input a non-decreasing integer sequence s_1, s_2, ... ▷ number of resamplings
 5: n ← 1 ▷ number of selections
 6: m ← 1 ▷ NOPA's iteration number
 7: i* ← null ▷ index of recommended solver
 8: x* ← null ▷ recommendation
 9: while budget is not exhausted do
10:    if m ≥ r_n then
11:       i* = argmin_{i ∈ {1,...,M}} Ê_{s_n}[ f(x̃_{i,LAG(r_n)}) ] ▷ algorithm selection
12:       n ← n + 1
13:    else
14:       for i ∈ {1, ..., M} do
15:          apply one evaluation to Solver_i
16:       end for
17:       m ← m + 1
18:    end if
19:    x* = x̃_{i*,m} ▷ update recommendation
20: end while

29 / 77
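The selection loop of Algorithm 1 can be sketched as follows. This is a simplified Python illustration, not the actual NOPA: it drops the lag (recommendations are compared at the current iteration), uses a fixed comparison period standing in for r_n, and the toy solvers, whose recommendation error decays as c/√n, are made up for the demo.

```python
import math
import random

class ToySolver:
    """Hypothetical solver: after n evaluations it recommends c / sqrt(n)."""
    def __init__(self, c):
        self.c, self.n = c, 0
    def step(self):                 # spend one (noisy) evaluation
        self.n += 1
    def recommendation(self):       # current recommended point (1-d)
        return self.c / math.sqrt(max(self.n, 1))

def noisy_value(x, rng):
    """Noisy fitness of a recommendation: x^2 plus Gaussian noise."""
    return x * x + rng.gauss(0.0, 0.05)

def nopa_sketch(solvers, budget, compare_every=50, resamplings=100, rng=None):
    rng = rng or random.Random(0)
    best = 0
    for m in range(1, budget + 1):
        for s in solvers:           # fair budget: one evaluation per solver
            s.step()
        if m % compare_every == 0:  # periodic comparison (the r_n schedule)
            scores = [sum(noisy_value(s.recommendation(), rng)
                          for _ in range(resamplings)) / resamplings
                      for s in solvers]          # Ê over s_n resamplings
            best = min(range(len(solvers)), key=scores.__getitem__)
    return best

winner = nopa_sketch([ToySolver(5.0), ToySolver(1.0)], budget=500)
# the portfolio settles on the faster solver (index 1)
```

Resampling the recommendations (the `resamplings` averages) is what makes the comparison reliable despite the noise.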
Portfolio of noisy optimization methods: compare solvers early

Lag function LAG(n) ≤ n: we compare the solvers' old recommendations x_{i,LAG(n)} (for each i ∈ {1, ..., M}, x_{i,LAG(n)} may or may not equal x_{i,n}).

Why this lag?
- algorithms' ranking is usually stable → no use comparing the very last recommendations
- it's much cheaper to compare old points: comparing good (i.e. recent) points means comparing points with similar fitness, and comparing points with similar fitness is very expensive.

30 / 77
Theorem with fair budget distribution

Assume that each solver i ∈ {1, ..., M} has simple regret SR_{i,n} = (1 + o(1)) C_i n^{α_i} (as usual) and constant noise variance. Then for some universal r_n, s_n, LAG_n, almost surely there exists n_0 such that, for n ≥ n_0:
- the portfolio always chooses an optimal solver (optimal α_i and C_i);
- the portfolio uses ≤ M · r_n (1 + o(1)) evaluations ⇒ M times more than the best solver.

Interpretation:
- Negligible comparison budget (thanks to lag).
- On classical log-log graphs, the portfolio should perform similarly to the best solver, within the log(M) shift (proved).

31 / 77
INOPA: introducing an unfair budget

NOPA: same budget for all solvers. Remark: we compare old recommendations (LAG(n) << n):
- they were known long ago, before spending all this budget
- therefore, except for the selected solver, most of the budget is wasted :(

⇒ Lazy evaluation paradigm: evaluate f(·) only when you need it for your output.
⇒ Improved NOPA (INOPA): unfair budget distribution. Use only LAG(r_n) evaluations (negligible) on the sub-optimal solvers.

log(M′) shift with M′ the number of optimal solvers (proved).

32 / 77
Experiments: Unimodal case

Noisy Optimization Algorithms (NOAs):
- SA-ES: Self-Adaptive Evolution Strategy
- Fabian's algorithm: a first-order method using gradients estimated by finite differences [Dvoretzky et al., 1956, Fabian, 1967]
- Noisy Newton's algorithm: a second-order method using a Hessian matrix also approximated by finite differences (our contribution in CNO)

Solvers     | z = 0 (constant var) | z = 1 (linear var) | z = 2 (quadratic var)
RSAES       |  .114 ± .002         |   .118 ± .003      |   .113 ± .003
Fabian1     | −.838 ± .003         | −1.011 ± .003      | −1.016 ± .003
Fabian2     |  .108 ± .003         | −1.339 ± .003      | −2.481 ± .003
Newton      | −.070 ± .003         |  −.959 ± .092      | −2.503 ± .285
NOPA no lag | −.377 ± .048         |  −.978 ± .013      | −2.106 ± .003
NOPA        | −.747 ± .003         |  −.937 ± .005      | −2.515 ± .095
INOPA       | −.822 ± .003         | −1.359 ± .027      | −3.528 ± .144

Table: Slope(SR) for f(x) = ||x||² + ||x||^z N in dimension 15. Computation time = 40 s.

33 / 77
Experiments: Stochastic unit commitment problem

Solver  | d = 45       | d = 63       | d = 105     | d = 125
RSAES   |  .485 ± .071 |  .870 ± .078 | .550 ± .097 | .274 ± .097
Fabian1 | 1.339 ± .043 | 1.895 ± .040 | 1.075 ± .047 | .769 ± .047
Fabian2 |  .394 ± .058 |  .521 ± .083 | .436 ± .097 | .307 ± .097
Newton  |  .749 ± .101 | 1.138 ± .128 | .590 ± .147 | .312 ± .147
INOPA   |  .394 ± .059 |  .547 ± .080 | .242 ± .101 | .242 ± .101

Table: Stochastic unit commitment problem (minimization). Computation time = 320 s.

What's more: given the same budget, an INOPA of identical solvers can outperform its mono-solvers.

34 / 77
Portfolio and noisy optimization: Conclusion

Main conclusion: portfolios are also great in noisy optimization (because in noisy optimization, with lag, the comparison cost is small).
- We show mathematically and empirically a log(M) shift when using M solvers, on a classical log-log scale.
- Bound improved to a log(M′) shift, with M′ = number of optimal solvers, with unfair distribution of budget (INOPA).

Take-home messages:
- portfolio = little overhead
- unfair budget = no overhead if "orthogonal" portfolio (orthogonal → M′ = 1)
- We mathematically confirmed the idea of orthogonality found in [Samulowitz and Memisevic, 2007].

36 / 77
Framework: Zero-sum matrix games

Game defined by matrix M:
- I choose (privately) i
- Simultaneously, you choose (privately) j
- I earn M_{i,j}; you earn −M_{i,j}

So this is zero-sum.

         | rock | paper | scissors
rock     | 0.5  | 0     | 1
paper    | 1    | 0.5   | 0
scissors | 0    | 1     | 0.5

Table: Example of 1-sum matrix game: rock-paper-scissors (row player's reward; the two players' rewards sum to 1).

38 / 77
Framework: Nash Equilibrium (NE)

Definition (Nash Equilibrium). Zero-sum matrix game M:
- My strategy = probability distribution on rows = x
- Your strategy = probability distribution on columns = y
- Expected reward = x^T M y

There exist x*, y* such that

   ∀x, y:  x^T M y* ≤ x*^T M y* ≤ x*^T M y.   (8)

(x*, y*) is a Nash Equilibrium (not necessarily unique).

Definition (Approximate ε-Nash Equilibria). (x*, y*) such that

   x^T M y* − ε ≤ x*^T M y* ≤ x*^T M y + ε.   (9)

Example: the NE of rock-paper-scissors is unique: (1/3, 1/3, 1/3).

39 / 77
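The example can be checked numerically (a minimal sketch; the ±1/0 payoff encoding of the zero-sum version of rock-paper-scissors is an assumption): a strategy is a Nash component iff no pure deviation of the opponent beats it, which for the uniform strategy means every column yields the game value 0.

```python
# Zero-sum rock-paper-scissors: M[i][j] = row player's payoff.
M = [[ 0, -1,  1],
     [ 1,  0, -1],
     [-1,  1,  0]]

x_star = [1/3, 1/3, 1/3]   # candidate Nash strategy for the row player

def value_vs_pure(x, j):
    """Expected payoff x^T M e_j of mixed strategy x vs pure column j."""
    return sum(x[i] * M[i][j] for i in range(3))

# Against every pure response the value is 0, so no deviation helps:
payoffs = [value_vs_pure(x_star, j) for j in range(3)]
# payoffs == [0.0, 0.0, 0.0]
```

This is the left inequality of (8) specialized to pure strategies y.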
Methods for computing Nash Equilibrium

Algorithm                            | Complexity        | Exact solution? | Confidence | Time
LP [von Stengel, 2002]               | O(K^α), α > 6     | yes             | 1          | constant
[Grigoriadis and Khachiyan, 1995]    | O(K log(K)/ε²)    | no              | 1          | random
[Grigoriadis and Khachiyan, 1995],   | O(log²(K)/ε²)     | no              | 1          | random
  with K/log(K) processors           |                   |                 |            |
EXP3 [Auer et al., 1995]             | O(K log(K)/ε²)    | no              | 1 − δ      | constant
Inf [Audibert and Bubeck, 2009]      | O(K log(K)/ε²)    | no              | 1 − δ      | constant
Our algorithm (if NE is k-sparse)    | O(k^{3k} K log K) | yes             | 1 − δ      | constant

Table: State of the art for computing Nash Equilibrium for ESMG M_{K×K}.

41 / 77
Adversarial bandit algorithm Exp3.P

Algorithm 2 Exp3.P: variant of Exp3. η and γ are two parameters.
 1: Input η ∈ R ▷ how much the distribution becomes peaked
 2: Input γ ∈ (0, 1] ▷ exploration rate
 3: Input a time horizon (computational budget) T ∈ N+ and the number of arms K ∈ N+
 4: Output a Nash-optimal policy p
 5: y ← 0
 6: for i ← 1 to K do ▷ initialization
 7:    ω_i ← exp( (ηγ/3) √(T/K) )
 8: end for
 9: for t ← 1 to T do
10:    for i ← 1 to K do
11:       p_i ← (1 − γ) ω_i / Σ_{j=1}^{K} ω_j + γ/K
12:    end for
13:    Generate i_t according to (p_1, p_2, ..., p_K)
14:    Compute reward R_{i_t,t}
15:    for i ← 1 to K do
16:       if i == i_t then
17:          R̂_i ← R_{i_t,t} / p_i
18:       else
19:          R̂_i ← 0
20:       end if
21:       ω_i ← ω_i exp( (γ/(3K)) ( R̂_i + η/(p_i √(TK)) ) )
22:    end for
23: end for
24: Return probability distribution (p_1, p_2, ..., p_K)

42 / 77
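As a rough illustration, here is a stripped-down Exp3-style loop in Python. This simplification omits Exp3.P's exploration bonus η/(p_i √(TK)) and its biased initialization (lines 7 and 21 of Algorithm 2), and the two deterministic arms are made up for the demo.

```python
import math
import random

def exp3_sketch(arms, T, gamma=0.1, rng=None):
    """Simplified Exp3 (no Exp3.P bonus term): returns the final
    probability distribution over the K arms after T rounds."""
    rng = rng or random.Random(0)
    K = len(arms)
    w = [1.0] * K
    for _ in range(T):
        tot = sum(w)
        p = [(1 - gamma) * wi / tot + gamma / K for wi in w]
        i = rng.choices(range(K), weights=p)[0]   # draw i_t ~ p
        r = arms[i]()                             # observed reward in [0, 1]
        w[i] *= math.exp(gamma * r / (p[i] * K))  # importance-weighted update
    tot = sum(w)
    return [(1 - gamma) * wi / tot + gamma / K for wi in w]

# Hypothetical bandit: arm 0 always pays 1, arm 1 always pays 0.
p = exp3_sketch([lambda: 1.0, lambda: 0.0], T=300)
# the distribution concentrates on the good arm: p[0] > p[1]
```

In the game-solving use of this section, the reward of arm i at round t is instead M_{i,j_t} against the opponent's (adversarial) choice j_t.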
Sparse Nash Equilibria (1/2)

Considering x* a Nash-optimal policy for ZSMG M_{K×K}: assume that x* is unique and has at most k non-zero components (sparsity). Let us show that x* is "discrete" (remark: a Nash equilibrium is the solution of a linear programming problem):

⇒ x* is also a NE of a k × k submatrix M′_{k×k}
⇒ x* = solution of an LP in dimension k
⇒ x* = solution of k linear equations with coefficients in {−1, 0, 1}
⇒ x* = inverse matrix × vector
⇒ x* = obtained by "cofactors / determinant of matrix"
⇒ x* has denominator at most k^{k/2}, by the Hadamard determinant bound [Hadamard, 1893], [Brenner and Cummings, 1972].

44 / 77
Sparse Nash Equilibria (2/2)

Computation of sparse Nash Equilibria, under the assumption that the Nash is sparse:
- x* is rational with "small" denominator (previous slide!)
- So let us compute an ε-Nash (with ε small enough!) (sublinear time!)
- And let us compute its closest approximation with "small denominator" (Hadamard)

Two new algorithms for exact Nash:
- Rounding-EXP3: switch to the closest approximation
- Truncation-EXP3: remove small components and work on the remaining submatrix (exact solving)

(requested precision ≃ k^{−3k/2} only ⇒ complexity k^{3k} K log K)

45 / 77
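The rounding step can be sketched with Python's `fractions` module (the ε-Nash values below are made up; the denominator bound k^{k/2} is the Hadamard-type bound from the previous slide):

```python
from fractions import Fraction

# Suppose an eps-Nash for rock-paper-scissors (support size k = 3) came out as:
approx = [0.337, 0.329, 0.334]

# Round each coordinate to the closest rational with denominator <= k^(k/2).
k = 3
bound = int(k ** (k / 2))          # floor(3^1.5) = 5
exact = [Fraction(a).limit_denominator(bound) for a in approx]
# exact recovers [1/3, 1/3, 1/3], the true Nash strategy
assert sum(exact) == 1
```

`limit_denominator` returns the closest fraction under the bound, which is exactly the "closest approximation with small denominator" step of Rounding-EXP3.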
Our proposal: Parameter-free adversarial bandit

No details here; in short:
- We compare various existing parametrizations of EXP3
- We select the best
- We add sparsity as follows: for a budget of T rounds of EXP3,

   threshold = max_{i ∈ {1, ..., m}} (T x_i)^α / T

⇒ we get a parameter-free bandit for adversarial problems.

47 / 77
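The truncation rule above can be sketched as follows (the frequency vector is made up for the demo; the rule is the slide's threshold formula):

```python
def truncate(x, T, alpha):
    """Zero out components of the EXP3 frequency vector x that fall below
    max_i (T * x_i)**alpha / T, then renormalize the survivors."""
    threshold = max((T * xi) ** alpha for xi in x) / T
    kept = [xi if xi >= threshold else 0.0 for xi in x]
    s = sum(kept)
    return [xi / s for xi in kept]

# Hypothetical EXP3 output after T = 1000 rounds:
x = [0.50, 0.49, 0.01]
sparse = truncate(x, T=1000, alpha=0.7)
# the 1%-mass arm is cut (threshold ~ 0.078); the two main arms are renormalized
```

Larger α lowers the threshold's contrast and keeps fewer arms, which is the sparsity/robustness trade-off studied in the power-systems experiments.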
Application to robust optimization (power systems)

[Figure: a simulator maps (policy, scenario) pairs to a reward R(k, s); policies (e.g. maintain a connection, create a new connection, ...) are evaluated against scenarios (e.g. technological breakthrough, CO2 penalization) for average cost and robustness.]

Examples of scenario: CO2 penalization, gas curtailment in Eastern Europe, technological breakthrough.
Examples of policy: massive nuclear power plant building, massive renewable energies.

49 / 77
Nash-planning for scenario-based decision making

Decision tool         | Extraction of policies | Extraction of critical scenarios | Computational cost        | Interpretation
Wald                  | One                    | One per policy                   | K × S                     | Nature decides later, minimizing our reward
Savage                | One                    | One per policy                   | K × S                     | Nature decides later, maximizing our regret
Scenarios             | Handcrafted            | Handcrafted                      | K′ × S′                   | Human expertise
Nash (our proposal)   | Nash-optimal           | Nash-optimal                     | (K + S) × log(K + S) (*)  | Nature decides privately, before us

Table: Comparison between several tools for decision under uncertainty. K = |K| and S = |S|. (*) Improved if the NE is sparse, by our previous result; in this case sparsity performs very well.

Nash ⇒ fast selection of scenarios and options: sparsity both speeds up the NE computation and makes the output more readable (smaller matrix).

50 / 77
Application to power investment problem: Testcase and parameterization

We consider (a big toy problem):
- 3^10 investment policies (k)
- 3^9 scenarios (s)
- reward: (k, s) → R(k, s)

We use Nash Equilibria:
- for their principled nature (Nature decides first and privately! that's reasonable, right?) and low computational cost in large-scale settings;
- we compute the equilibria thanks to EXP3 (tuned)...
- ... with sparsity, for improving the precision and reducing the number of pure strategies in our recommendation (unreadable matrix otherwise!).

51 / 77
Application to power investment problem: Sparse-Nash algorithm

Algorithm 3 The Sparse-Nash algorithm for solving decision-under-uncertainty problems.
Input: a family K of possible decisions k (investment policies).
Input: a family S of scenarios s.
Input: a mapping (k, s) → R_{k,s}, providing the rewards.
Run truncated Exp3.P on R; get a probability distribution on K (support = key options) and a probability distribution on S (support = critical scenarios).
Emphasize the policy with highest probability.

52 / 77
Application to power investment problem: Results

AVERAGE SPARSITY LEVEL OVER 3^10 = 59049 ARMS
α    | T = K      | T = 10K    | T = 50K      | T = 100K    | T = 500K    | T = 1000K
0.1  | 13804 ± 52 | non-sparse | non-sparse   | non-sparse  | non-sparse  | non-sparse
0.3  | 2810 ± 59  | non-sparse | non-sparse   | non-sparse  | non-sparse  | non-sparse
0.5  | 396 ± 16   | non-sparse | non-sparse   | 59049 ± 197 | 49819 ± 195 | non-sparse
0.7  | 43 ± 3     | 58925 ± 27 | 55383 ± 1507 | 46000 ± 278 | 9065 ± 160  | non-sparse
0.9  | 4 ± 0      | 993 ± 64   | 797 ± 42     | 504 ± 25    | 98 ± 5      | 52633 ± 523
0.99 | 1 ± 0      | 2 ± 0      | 3 ± 0        | 2 ± 0       | 2 ± 0       | 7 ± 1

ROBUST SCORE: WORST REWARD AGAINST PURE STRATEGIES
α    | T = K     | T = 10K   | T = 50K   | T = 100K  | T = 500K  | T = 1000K
NT   | 4.922e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.1  | 4.948e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.3  | 5.004e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.5  | 5.059e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.242e-01 | 4.938e-01
0.7  | 5.054e-01 | 4.928e-01 | 4.965e-01 | 5.031e-01 | 5.317e-01 | 4.938e-01
0.9  | 4.281e-01 | 5.137e-01 | 5.151e-01 | 5.140e-01 | 5.487e-01 | 4.960e-01
0.99 | 3.634e-01 | 4.357e-01 | 4.612e-01 | 4.683e-01 | 5.242e-01 | 5.390e-01
Pure | 3.505e-01 | 3.946e-01 | 4.287e-01 | 4.489e-01 | 5.143e-01 | 4.837e-01

Table: Average sparsity level and robust score. α is the truncation parameter; T is the budget.

53 / 77
Application to power investment problem: summary

- Define long-term scenarios (plenty!)
- Build simulator R(k, s)
- Classical solution (Savage): min_{k∈K} max_{s∈S} regret(k, s)
- Our proposal (Nash): automatically select a submatrix

Our proposed tool has the following advantages:
- Natural extraction of interesting policies and critical scenarios: α = .7 provides stable (and proved) results, but the extracted submatrix becomes easily readable (small enough) with larger values of α.
- Faster than the Wald or Savage methodologies.

Take-home messages: we get a fast criterion, faster than Wald's or Savage's criteria, with a natural interpretation, and more readable ⇒ but a stochastic recommendation!

54 / 77
Two parts:
- Seeds matter: **choose** your seeds!
- More tricky but worth the effort: position-specific seeds! (towards a better asymptotic behavior of MCTS?)

56 / 77
Optimizing random seeds: Correlations

Figure: Success rate per seed (ranked) in 5x5 Domineering, with standard deviations on the y-axis: the seed has a significant impact.

Fact: the random seed matters!

57 / 77
Optimizing random seeds: State-of-the-art

Stochastic algorithms randomly select their pseudo-random seed. We propose to choose the seed(s), and to combine them.

State-of-the-art for combining random seeds:
- [Nagarajan et al., 2015] combines several AIs
- [Gaudel et al., 2010] uses Nash methods for combining several opening books
- [Saint-Pierre and Teytaud, 2014] constructs several AIs from a single stochastic one and combines them by the BestSeed and Nash approaches

58 / 77
Trick: present results with one white seed per column and one black seed per row

[Figure: K × K matrix M with K random seeds for Black (rows) and K random seeds for White (columns); the row player gets M_{i,j}, the column player gets 1 − M_{i,j}.]

Figure: One black seed per row, one white seed per column.

59 / 77
Propositions: Nash & BestSeed

- Nash: combines rows (more robust; we will see later)
- BestSeed: just pick the best row / best column

60 / 77
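A minimal sketch of BestSeed on a toy win-rate matrix (the matrix values are made up; Nash would instead compute a mixed strategy over the rows, as in the bandit sketches earlier):

```python
# Hypothetical K x K matrix: M[i][j] = win rate of black seed i vs white seed j.
M = [[0.4, 0.5, 0.3],
     [0.7, 0.6, 0.8],
     [0.5, 0.4, 0.6]]

def best_seed(M):
    """BestSeed: pick the row (black seed) with the highest average win rate."""
    means = [sum(row) / len(row) for row in M]
    return max(range(len(M)), key=means.__getitem__)

i_star = best_seed(M)   # here row 1, with average win rate 0.7
```

Picking a single row is cheap but exploitable; mixing rows (Nash) trades a little average performance for robustness against an adapting opponent.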
Better than square matrices: rectangle methods

Remark: for choosing a row, if #rows = #cols, then #rows is more critical than #cols; so for a given budget, increase #rows and decrease #cols (same budget!).

Figure: Left: square K × K matrix of a game; right: K × K_t rectangles of a game (K >> K_t).

61 / 77
Does it work? Experiments on Domineering

The opponent uses seeds which have never been used during the learning of the portfolio (cross-validation).

Figure: Results for Domineering, with the BestSeed (left) and the Nash (right) approach, against the baseline (K′ = 1) and the exploiter (K′ > 1; an opponent who "learns" very well). K_t = 900 in all experiments.

BestSeed performs well against the original algorithm (K′ = 1), but poorly against the exploiter (K′ > 1). Nash outperforms the original algorithm both w.r.t. K′ = 1 (all cases) and K′ > 1 (most cases).

62 / 77
Beyond cross-validation: experiments with transfer in the game of Go

Learning: BestSeed is applied to GnuGo, with MCTS and a budget of 400 simulations.
Test: against "classical" GnuGo, i.e. the non-MCTS version of GnuGo.

Opponent                 | BestSeed     | Randomized seed
GnuGo-classical level 1  | 1. (±0)      | .995 (±0)
GnuGo-classical level 2  | 1. (±0)      | .995 (±0)
GnuGo-classical level 3  | 1. (±0)      | .99 (±0)
GnuGo-classical level 4  | 1. (±0)      | 1. (±0)
GnuGo-classical level 5  | 1. (±0)      | 1. (±0)
GnuGo-classical level 6  | 1. (±0)      | 1. (±0)
GnuGo-classical level 7  | .73 (±.013)  | .061 (±.004)
GnuGo-classical level 8  | .73 (±.013)  | .106 (±.006)
GnuGo-classical level 9  | .73 (±.013)  | .095 (±.006)
GnuGo-classical level 10 | .73 (±.013)  | .07 (±.004)

Table: Performance of "BestSeed" and "randomized seed" against "classical" GnuGo.

Previous slide: we win against the AI which we have trained against (but with different seeds!). This slide: we improve the winning rate against another AI.

63 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing random seeds: Partial conclusion Conclusion: Seed optimization (NOT position-specific) can be seen as a simple and effective tool for building an opening book with no development effort, no human expertise, no database storage. "Rectangle" provides significant improvements. The online computational overhead of the methods is negligible. The boosted AIs significantly outperform the baselines. BestSeed performs well, but can be overfitted ⇒ the strength of Nash. Further work: the use of online bandit algorithms for dynamically choosing K / K_t. Note: The BestSeed and Nash algorithms are not new. The algorithm and analysis of rectangles are new. The analysis of the impact of seeds is new. The applications to Domineering, Atari-Go and Breakthrough are new. 64 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Two parts: Seeds matter: CHOOSE your seeds! Trickier but worth the effort: position-specific seeds! (towards a better asymptotic behavior of MCTS?) 65 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing position-based random seeds: Tsumego Tsumego (by Yoji Ojima, Zen's author) Input: a Go position Question: is this situation a win for White? Output: yes or no Why so important? At the heart of many game algorithms In Go, EXPTIME-complete [Robson, 1983] 66 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Classical algorithms Monte Carlo (MC) [Bruegmann, 1993, Cazenave, 2006, Cazenave and Borsboom, 2007] Monte Carlo Tree Search (MCTS) [Bouzy, 2004, Coulom, 2006] Nested MC [Cazenave, 2009] Voting scheme among MCTS [Gavin et al.] ⇒ here: a weighted voting scheme among MCTS 67 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Evaluation of the game value Algorithm 4 Evaluation of the game value.
1: Input: current state s
2: Input: a policy π_B for Black, depending on a seed in N+
3: Input: a policy π_W for White, depending on a seed in N+
4: for i ∈ {1, ..., K} do
5:   for j ∈ {1, ..., K} do
6:     M_{i,j} ← outcome of the game starting in s, with π_B playing as Black with seed b(i) and π_W playing as White with seed w(j)
7:   end for
8: end for
9: Compute weights p for Black and q for White for the matrix M (either BestSeed, Nash, or other)
10: Return p^T M q  ◮ approximate value of the game M
68 / 77
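Step 9's Nash weights can be approximated by fictitious play, a simple iterative scheme for zero-sum matrix games where each player best-responds to the other's empirical history. This is an illustrative sketch under that assumption, not the talk's implementation:

```python
def nash_value(M, iters=2000):
    """Approximate Nash weights (p, q) and the value p^T M q of the
    zero-sum matrix game M (row player maximizes) via fictitious play."""
    K, L = len(M), len(M[0])
    row_counts, col_counts = [0] * K, [0] * L
    row_payoff = [0.0] * K  # cumulative payoff of each row vs column history
    col_payoff = [0.0] * L  # cumulative payoff of each column vs row history
    i = j = 0
    for _ in range(iters):
        row_counts[i] += 1
        col_counts[j] += 1
        for a in range(K):
            row_payoff[a] += M[a][j]
        for b in range(L):
            col_payoff[b] += M[i][b]
        i = max(range(K), key=row_payoff.__getitem__)  # row best response
        j = min(range(L), key=col_payoff.__getitem__)  # column best response
    p = [c / iters for c in row_counts]
    q = [c / iters for c in col_counts]
    value = sum(p[a] * M[a][b] * q[b] for a in range(K) for b in range(L))
    return value, p, q

value, p, q = nash_value([[1, 0], [0, 1]])  # matching-pennies-like game
assert abs(value - 0.5) < 0.1               # Nash value is 1/2
```

Fictitious play is slow but parameter-free; an exact linear-programming solver, or the sublinear bandit algorithms discussed earlier in the talk, can replace it for large matrices.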
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Classical case (MC/MCTS): unpaired Monte Carlo averaging Figure: Left: unpaired case (classical estimate by averaging), using K*K independent (Black seed, White seed) pairs; right: paired case: K Black seeds b(1), ..., b(K) against K White seeds w(1), ..., w(K), yielding a K × K matrix M where M_{i,j} is the outcome of b(i) vs w(j); the row player gets M_{i,j}, the column player gets 1 − M_{i,j}. 69 / 77
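The two estimators differ only in how seeds are drawn: the paired estimate reuses K seeds per side (all K² cells of a K × K matrix), while the unpaired one draws K² fresh seed pairs. A toy sketch of the structure (the `play` function below is a hypothetical stand-in for a full MCTS-vs-MCTS game from state s, not the talk's code):

```python
import random

def play(black_seed, white_seed):
    """Stand-in for one game from state s: 1 if Black wins, else 0.
    (Toy model: Black wins about 60% of the time.)"""
    rng = random.Random(black_seed * 1_000_003 + white_seed)
    return 1 if rng.random() < 0.6 else 0

def paired_estimate(K, rng):
    b = [rng.randrange(10**9) for _ in range(K)]  # K Black seeds
    w = [rng.randrange(10**9) for _ in range(K)]  # K White seeds
    return sum(play(bi, wj) for bi in b for wj in w) / (K * K)

def unpaired_estimate(K, rng):
    return sum(play(rng.randrange(10**9), rng.randrange(10**9))
               for _ in range(K * K)) / (K * K)
```

The point of pairing is not this uniform average itself: once the K × K matrix is in hand, its rows and columns can be reweighted (BestSeed, Nash) instead of averaged uniformly.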
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Experiments: Applied methods and setting
Compared methods for approximating v(s):
Three methods use K^2 independent batches of M MCTS-simulations, using the matrix of seeds:
  Nash reweighting = Nash value
  BestSeed reweighting = intersection of best row / best column
  Paired MC estimate = average of the matrix
One unpaired method: the classical MC estimate (the average of K^2 random MCTS runs)
Baseline: a single long MCTS (= state of the art!) → the only one which is not K^2-parallel
Parameter setting: GnuGo-MCTS [Bayer et al., 2008]
  setting A: 1 000 simulations per move
  setting B: 80 000 simulations per move
70 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Experiments: Average results over 50 Tsumego problems Figure: Average over 50 Tsumego problems; two panels (x-axis: #simulations, i.e. submatrix size N^2; y-axis: % correct answers; curves: Nash, Paired, Best, Unpaired, MCTS(1)). (a) setting A: 1 000 simulations per move. (b) setting B: 80 000 simulations per move. MCTS(1): one single MCTS run using all the budget. Setting A (small budget): MCTS(1) outperforms the weighted average of 81 MCTS runs (but we are more parallel!). Setting B (large budget): we outperform MCTS(1) and all other methods by far ⇒ consistent with the limited scalability of MCTS for huge numbers of simulations. 71 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing position-based random seeds: Partial conclusion Main conclusion: a novel way of evaluating game values using a Nash equilibrium (theoretical validation & experiments on 50 Tsumego problems). The Nash or BestSeed predictor requires far fewer simulations for finding accurate results + is sometimes consistent whereas the original MC is not! We outperformed: the average of MCTS runs sharing the budget; a single MCTS using all the budget → for M large enough, our weighted averaging of 81 single MCTS runs with M simulations each is better than one MCTS run with 81 M simulations :) Take-home messages: We classify positions ("black wins" vs "white wins"). We use a WEIGHTED average of K^2 MCTS runs of M simulations. Our approach outperforms all tested voting schemes among K^2 MCTS estimates of M simulations, and a pure MCTS of K^2 × M simulations, when M is large and K^2 = 81. 72 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Conclusion Motivation 1 Noisy Optimization 2 Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing Portfolio and noisy optimization 3 Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion Adversarial portfolio 4 Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 73 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Conclusion A work on sparsity, at the core of ZSMG (zero-sum matrix games). A parameter-free adversarial bandit, obtained by tuning (no details provided in this talk) + sparsity. Applications of ZSMG: Nash + sparsity → faster + more readable robust decision making. Random seeds = new MCTS variants? Validated as opening-book learning (Go, Atari-Go, Domineering, Breakthrough, Draughts, Phantom-Go...). Position-specific seeds validated on Tsumego. 74 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Conclusion Conclusion & Further work Noisy opt: an algorithm recovering most (but not all: Fabian's rate!) existing results, extended to other surrogate models. ES/DE with resamplings have good rates for linear/quadratic variance and/or robust criteria (UR); in other cases resamplings are not sufficient for optimal rates ("mutate large, inherit small" + huge population and/or surrogate models...). Portfolio: application to noisy optimization; great benefits with several solvers of a given model. Towards wider applications: portfolios of models? Adversarial portfolio: successful use of sparsity; parameter-free bandits? MCTS and seeds: room for 5 PhD theses... if there is funding for it :-) Most works here → ROBUSTNESS by COMBINATION (robust to solvers, to models, to parameters, to seeds...). 76 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Conclusion Thanks for your attention ! Thanks to all the collaborators from Artelys, INRIA, CNRS, Univ. Paris-Saclay, Univ. Paris-Dauphine, Univ. du Littoral, NDHU ... 77 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references I Audibert, J.-Y. and Bubeck, S. (2009). Minimax policies for adversarial and stochastic bandits. In Proceedings of the Annual Conference on Learning Theory (COLT). Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. (1995). Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pages 322–331. IEEE Computer Society Press, Los Alamitos, CA. Auger, A. (2005). Convergence results for the (1,λ)-SA-ES using the theory of φ-irreducible Markov chains. Theoretical Computer Science, 334(1):35–69. Baudiš, P. and Pošík, P. (2014). Online black-box algorithm portfolios for continuous optimization. In Parallel Problem Solving from Nature – PPSN XIII, pages 40–49. Springer. 78 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references II Bayer, A., Bump, D., Daniel, E. B., Denholm, D., Dumonteil, J., Farnebäck, G., Pogonyshev, P., Traber, T., Urvoy, T., and Wallin, I. (2008). GNU Go 3.8 documentation. Technical report, Free Software Foundation. Billingsley, P. (1986). Probability and Measure. John Wiley and Sons. Bouzy, B. (2004). Associating shallow and selective global tree search with Monte Carlo for 9x9 Go. In 4th Computer and Games Conference, Ramat-Gan. Brenner, J. and Cummings, L. (1972). The Hadamard maximum determinant problem. Amer. Math. Monthly, 79:626–630. Bruegmann, B. (1993). Monte-Carlo Go (unpublished draft, http://www.althofer.de/bruegmann-montecarlogo.pdf). 79 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references III Bubeck, S., Munos, R., and Stoltz, G. (2011). Pure exploration in finitely-armed and continuous-armed bandits. Theoretical Computer Science, 412(19):1832–1852. Cazenave, T. (2006). A Phantom-Go program. In van den Herik, H. J., Hsu, S.-C., Hsu, T.-S., and Donkers, H. H. L. M., editors, Proceedings of Advances in Computer Games, volume 4250 of Lecture Notes in Computer Science, pages 120–125. Springer. Cazenave, T. (2009). Nested Monte-Carlo search. In Boutilier, C., editor, IJCAI, pages 456–461. Cazenave, T. and Borsboom, J. (2007). Golois wins Phantom Go tournament. ICGA Journal, 30(3):165–166. 80 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references IV Coulom, R. (2006). Efficient selectivity and backup operators in Monte-Carlo tree search. In P. Ciancarini and H. J. van den Herik, editors, Proceedings of the 5th International Conference on Computers and Games, Turin, Italy, pages 72–83. Cranley, R. and Patterson, T. (1976). Randomization of number theoretic methods for multiple integration. SIAM J. Numer. Anal., 13(6):904–914. Dupač, V. (1957). O Kiefer-Wolfowitzově aproximační methodě. Časopis pro pěstování matematiky, 082(1):47–75. Dvoretzky, A., Kiefer, J., and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Annals of Mathematical Statistics, 33:642–669. Fabian, V. (1967). Stochastic approximation of minima with improved asymptotic speed. Annals of Mathematical Statistics, 38:191–200. 81 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references V Gaudel, R., Hoock, J.-B., Pérez, J., Sokolovska, N., and Teytaud, O. (2010). A principled method for exploiting opening books. In International Conference on Computers and Games, pages 136–144, Kanazawa, Japan. Gavin, C., Stewart, S., and Drake, P. Result aggregation in root-parallelized computer Go. Grigoriadis, M. D. and Khachiyan, L. G. (1995). A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters, 18(2):53–58. Hadamard, J. (1893). Résolution d'une question relative aux déterminants. Bull. Sci. Math., 17:240–246. Hammersley, J. and Handscomb, D. (1964). Monte Carlo Methods. Methuen & Co. Ltd., London, page 40. 82 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references VI Heidrich-Meisner, V. and Igel, C. (2009). Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 401–408, New York, NY, USA. ACM. Jebalia, M. and Auger, A. (2008). On multiplicative noise models for stochastic search. In Rudolph, G., et al., editors, Conference on Parallel Problem Solving from Nature (PPSN X), volume 5199, pages 52–61, Berlin, Heidelberg. Springer Verlag. Liu, J., Saint-Pierre, D. L., Teytaud, O., et al. (2014). A mathematically derived number of resamplings for noisy optimization. In Genetic and Evolutionary Computation Conference (GECCO 2014). Mascagni, M. and Chi, H. (2004). On the scrambled Halton sequence. Monte-Carlo Methods Appl., 10(3):435–442. 83 / 77
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references VII Mnih, V., Szepesvári, C., and Audibert, J.-Y. (2008). Empirical Bernstein stopping. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 672–679, New York, NY, USA. ACM. Nagarajan, V., Marcolino, L. S., and Tambe, M. (2015). Every team deserves a second chance: Identifying when things go wrong (student abstract version). In 29th Conference on Artificial Intelligence (AAAI 2015), Texas, USA. Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods. Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart. Robson, J. M. (1983). The complexity of Go. In IFIP Congress, pages 413–417. 84 / 77