DM841 Discrete Optimization
Part 2 – Heuristics: Experimental Analysis
Marco Chiarandini
Department of Mathematics & Computer Science, University of Southern Denmark
Outline
1. Inferential Statistics
   Statistical Tests
   Experimental Designs
   Applications to Our Scenarios
2. Race: Sequential Testing
3. Algorithm Selection
Inferential Statistics
◮ We work with samples (instances, solution quality).
◮ But we want sound conclusions: generalization over a given population (all runs, all possible instances).
◮ Thus we need statistical inference.
[Diagram: inference from a random sample X_n, drawn from a population P(x, θ), via a statistical estimator θ̂ of the population parameter θ.]
Since the analysis is based on finite-sized sampled data, statements like "the cost of solutions returned by algorithm A is smaller than that of algorithm B" must be completed by "at a level of significance of 5%".
A Motivating Example
◮ There is a competition, and two stochastic algorithms A1 and A2 are submitted.
◮ We run both algorithms once on n instances. On each instance either A1 wins (+), or A2 wins (−), or they tie (=).
Questions:
1. If we have only 10 instances and algorithm A1 wins 7 times, how confident are we in claiming that algorithm A1 is the best?
2. How many instances and how many wins should we observe to gain a confidence of 95% that algorithm A1 is the best?
A Motivating Example
◮ p: probability that A1 wins on each instance (+)
◮ n: number of runs without ties
◮ Y: number of wins of algorithm A1
If each run is independent and consistent:
Y ∼ B(n, p):   Pr[Y = y] = C(n, y) p^y (1 − p)^(n−y)
[Figure: binomial probability mass function with trials n = 30 and probability of success p = 0.5.]
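The mass function above can be evaluated directly; a minimal sketch in Python (the function name is mine):

```python
from math import comb

def binom_pmf(y, n, p):
    """Pr[Y = y] for Y ~ B(n, p): C(n, y) * p^y * (1 - p)^(n - y)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Probability that A1 wins exactly 7 of 10 tie-free runs when p = 0.5:
print(binom_pmf(7, 10, 0.5))  # 120/1024 ≈ 0.1172
```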
1. If we have only 10 instances and algorithm A1 wins 7 times, how confident are we in claiming that algorithm A1 is the best?
Under these conditions, we can check how unlikely the observed situation is if in fact p(+) ≤ p(−). If p(+) = 0.5 (i.e., p(+) = p(−)), then the chance that algorithm A1 wins 7 or more times out of 10 is 17.2%: quite high!
[Figure: binomial probability mass function Pr[Y = y], p = 0.5, number of successes y on the x-axis.]
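The 17.2% is the upper tail of the binomial distribution; a sketch of the computation (function name is mine):

```python
from math import comb

def binom_tail(y, n, p):
    """Upper tail Pr[Y >= y] for Y ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y, n + 1))

# Chance of 7 or more wins out of 10 under p = 0.5:
print(binom_tail(7, 10, 0.5))  # 176/1024 = 0.171875, the 17.2% above
```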
2. How many instances and how many wins should we observe to gain a confidence of 95% that algorithm A1 is the best?
To answer this question, we compute the 95%-quantile, i.e., the smallest y such that Pr[Y ≥ y] < 0.05 with p = 0.5, at different values of n:
 n | 10 11 12 13 14 15 16 17 18 19 20
 y |  9  9 10 10 11 12 12 13 13 14 15
This is an application example of the sign test, a special case of the binomial test in which p = 0.5.
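The table can be reproduced by scanning the binomial upper tail for each n; a minimal sketch (function names are mine):

```python
from math import comb

def binom_tail(y, n, p=0.5):
    """Upper tail Pr[Y >= y] for Y ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y, n + 1))

def critical_wins(n, alpha=0.05):
    """Smallest y with Pr[Y >= y] < alpha under p = 0.5 (sign test critical value)."""
    return next(y for y in range(n + 1) if binom_tail(y, n) < alpha)

print([critical_wins(n) for n in range(10, 21)])
# reproduces the table: [9, 9, 10, 10, 11, 12, 12, 13, 13, 14, 15]
```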
Statistical Tests
General procedure:
◮ Assume that data are consistent with a null hypothesis H0 (e.g., sample data are drawn from distributions with the same mean value).
◮ Use a statistical test to compute how likely this is to be true, given the data collected. This "likely" is quantified as the p-value.
◮ Do not reject H0 if the p-value is larger than a user-defined threshold called the level of significance α.
◮ Otherwise (p-value < α), H0 is rejected in favor of an alternative hypothesis, H1, at a level of significance of α.
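For the sign test of the motivating example, this procedure is a one-liner; a sketch of the decision rule (names are mine), applied to the 7-of-10 case:

```python
from math import comb

def sign_test_p(wins, n, p=0.5):
    """One-sided p-value: Pr[Y >= wins] under H0 that p = 0.5."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(wins, n + 1))

alpha = 0.05
p_value = sign_test_p(7, 10)
print(f"p-value = {p_value:.4f}")  # 0.1719
print("reject H0" if p_value < alpha else "do not reject H0")
```

With p-value ≈ 0.17 > α = 0.05, we cannot claim A1 is the best at this level of significance, matching the slide's conclusion.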
Inferential Statistics
Two kinds of errors may be committed when testing hypotheses:
α = P(type I error) = P(reject H0 | H0 is true)
β = P(type II error) = P(fail to reject H0 | H0 is false)
General rule:
1. specify the type I error, or level of significance, α
2. seek the test with a suitably large statistical power, i.e., 1 − β = P(reject H0 | H0 is false)
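For the sign test, both error probabilities can be computed exactly; a sketch assuming a one-sided test with n = 10 and critical value y = 9 (from the earlier table), and an illustrative true win probability p = 0.75 (my choice):

```python
from math import comb

def binom_tail(y, n, p):
    """Upper tail Pr[Y >= y] for Y ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y, n + 1))

n, y_crit = 10, 9
alpha = binom_tail(y_crit, n, 0.5)    # type I error: reject when p is really 0.5
power = binom_tail(y_crit, n, 0.75)   # 1 - beta: reject when p is really 0.75
print(alpha, power)  # ≈ 0.0107 and ≈ 0.244: a small sample gives low power
```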
Theorem (Central Limit Theorem)
If X_n is a random sample from an arbitrary distribution with mean µ and variance σ², then the average X̄_n is asymptotically normally distributed, i.e.,
X̄_n ≈ N(µ, σ²/n)   or   z = (X̄_n − µ) / (σ/√n) ≈ N(0, 1)
◮ Consequences:
  ◮ allows inference from a sample
  ◮ allows us to model errors in measurements: X = µ + ε
◮ Issues:
  ◮ n should be large enough
  ◮ µ and σ must be known
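A quick simulation of the theorem, using an exponential distribution as a deliberately skewed example (the distribution and constants are my choices):

```python
import random
import statistics

random.seed(1)
n = 50  # sample size
# 5000 replications of the sample mean of n exponential(rate=1) draws
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(5000)]

# The sample means cluster around mu = 1 with spread ~ sigma/sqrt(n) = 1/sqrt(50),
# even though the underlying distribution is far from normal
print(statistics.mean(means), statistics.stdev(means))
```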
[Figure: density of a Weibull distribution (shape 1.4), and histograms of the standardized sample mean z = (X̄ − µ)/(σ/√n) for samples of size n = 1, 5, 15, 50, each repeated 100 times: as n grows, the distribution of the sample mean approaches a normal.]
Hypothesis Testing and Confidence Intervals
A test of hypothesis determines how likely a sampled estimate θ̂ is to occur under some assumptions on the parameter θ of the population:
Pr( µ − z₁ σ/√n ≤ X̄ ≤ µ + z₂ σ/√n ) = 1 − α
A confidence interval contains all those values that a parameter θ is likely to assume with probability 1 − α: Pr(θ̂₁ < θ < θ̂₂) = 1 − α, e.g.
Pr( X̄ − z₁ σ/√n ≤ µ ≤ X̄ + z₂ σ/√n ) = 1 − α
[Figure: sample means X̄₁, X̄₂, X̄₃ and their intervals around µ.]
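A sketch of the z-based interval above, assuming σ is known and taking z₁ = z₂ = 1.96 for α = 0.05 (the numeric data are made up):

```python
from math import sqrt

def z_interval(xbar, sigma, n, z=1.96):
    """Two-sided (1 - alpha) CI for mu with known sigma; z = 1.96 gives alpha = 0.05."""
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half

lo, hi = z_interval(xbar=52.3, sigma=4.0, n=25)
print(lo, hi)  # ≈ (50.732, 53.868)
```

H0: µ = µ0 would not be rejected at α = 0.05 exactly when µ0 falls inside this interval, which is the duality described on the next slides.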
The Procedure of Test of Hypothesis
1. Specify the parameter θ and the test hypotheses, e.g. θ = µ₁ − µ₂ with
   H0: θ = 0
   H1: θ ≠ 0
2. Obtain P(θ | θ = 0), the null distribution of θ.
3. Compare θ̂ with the α/2-quantiles (for two-sided tests) of P(θ | θ = 0), and reject H0 or not according to whether θ̂ is larger or smaller than this value.
The Confidence Intervals Procedure
1. Specify the parameter θ and the test hypotheses, e.g. θ = µ₁ − µ₂ with
   H0: θ = 0
   H1: θ ≠ 0
2. Obtain P(θ, θ = 0), the null distribution of θ, in correspondence of the observed estimate θ̂ of the sample X.
3. Determine (θ̂−, θ̂+) such that Pr{ θ̂− ≤ θ ≤ θ̂+ } = 1 − α.
4. Do not reject H0 if θ = 0 falls inside the interval (θ̂−, θ̂+); otherwise reject H0.
[Figure: two populations N(µ₁, σ) and N(µ₂, σ) with sample summaries (X̄₁, S_X₁) and (X̄₂, S_X₂), and the interval (θ̂−, θ̂+) around θ̂ relative to θ = 0.]
The Confidence Intervals Procedure (continued)
For θ = µ₁ − µ₂ the null distribution is obtained through the statistic
T = ( (X̄₁ − X̄₂) − (µ₁ − µ₂) ) / √( S²_X₁/n₁ + S²_X₂/n₂ )
which follows a Student's t distribution; alternatively, the null distribution of θ* = X̄*₁ − X̄*₂ can be obtained by resampling. The remaining steps are as before: determine (θ̂−, θ̂+) such that Pr{ θ̂− ≤ θ ≤ θ̂+ } = 1 − α, and do not reject H0 if θ = 0 falls inside the interval (θ̂−, θ̂+); otherwise reject H0.
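The T statistic above can be computed directly; a sketch with made-up data (the degrees of freedom and p-value would come from the t distribution, omitted here):

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(x1, x2):
    """T = ((X1bar - X2bar) - 0) / sqrt(S1^2/n1 + S2^2/n2) under H0: mu1 = mu2."""
    return (mean(x1) - mean(x2)) / sqrt(variance(x1) / len(x1) +
                                        variance(x2) / len(x2))

x1 = [5.2, 4.8, 5.5, 5.0, 5.3]  # e.g. solution costs from algorithm A1
x2 = [4.1, 4.5, 4.0, 4.4, 4.2]  # e.g. solution costs from algorithm A2
print(t_statistic(x1, x2))  # ≈ 6.04, far in the tail: H0 would be rejected
```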
Kolmogorov-Smirnov Tests
The test compares empirical cumulative distribution functions.
[Figure: two empirical CDFs F₁(x) and F₂(x) plotted together.]
It uses the maximal difference between the two curves, sup_x |F₁(x) − F₂(x)|, and assesses how likely this value is under the null hypothesis that the two curves come from the same data.
The test can be used as a two-sample or one-sample test (in the latter case, to test against a theoretical distribution: goodness of fit).
The test can be done in R with ks.test.
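The statistic itself is simple to compute; a sketch in Python (R's ks.test also supplies the p-value, which is the hard part and is omitted here):

```python
def ks_statistic(x1, x2):
    """D = sup_x |F1(x) - F2(x)| between two empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    # the supremum is attained at one of the observed data points
    points = sorted(set(x1) | set(x2))
    return max(abs(ecdf(x1, x) - ecdf(x2, x)) for x in points)

print(ks_statistic([1, 2, 3], [2, 3, 4]))  # 1/3: one curve leads the other by one step
```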
Parametric vs Nonparametric
Parametric assumptions:
◮ independence
◮ homoscedasticity
◮ normality N(µ, σ)
Nonparametric assumptions:
◮ independence
◮ homoscedasticity
Nonparametric tests:
◮ rank-based tests
◮ permutation tests (exact or conditional Monte Carlo)
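A minimal exact permutation test on the difference of means, to make the last bullet concrete (toy data and names are mine; for larger samples one would sample the relabellings, i.e., conditional Monte Carlo):

```python
from itertools import combinations
from statistics import mean

def perm_test(x1, x2):
    """Exact one-sided permutation p-value for H0: both samples share a distribution."""
    observed = mean(x1) - mean(x2)
    pooled = x1 + x2
    count = total = 0
    # enumerate every relabelling of the pooled data into two groups
    for idx in combinations(range(len(pooled)), len(x1)):
        g1 = [pooled[i] for i in idx]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
        count += (mean(g1) - mean(g2)) >= observed
        total += 1
    return count / total

print(perm_test([7, 9, 8], [3, 4, 2]))  # 1/20 = 0.05: only the observed split is as extreme
```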