Evolutionary Computation plus Dynamic Programming for the Bi-Objective Travelling Thief Problem
Junhua Wu, Sergey Polyakovskiy, Markus Wagner, Frank Neumann
Frank.Neumann@Adelaide.edu.au
Project page: https://cs.adelaide.edu.au/~optlog/research/ttp.php (or google “travelling thief Adelaide”)
Tuesday, July 17, 10:40-12:20, Conference Room D (3F)
The Travelling Thief Problem (TTP)
Composed by merging the Travelling Salesman Problem and the Knapsack Problem.
[Figure: example graph with cities numbered 0 to 8]
[Figure: the example graph with additional numeric labels and a knapsack]
The Travelling Thief Problem (TTP)
Goal: visit each city exactly once, maximising the total profit $P$ such that the total weight does not exceed the knapsack capacity $W$, where $P$ is defined as:
$$P = \sum_{i=1}^{m} p_i x_i - R \sum_{i=1}^{n} t_{i,i+1}$$
where $x_i \in \{0, 1\}$ depending on whether item $i$ is picked (1) or not (0), and $t_{i,j}$ is defined as:
$$t_{i,j} = \frac{d(\Pi_i, \Pi_j)}{v_{\max} - \frac{v_{\max} - v_{\min}}{W}\, W_{\Pi_i}}$$
where $\Pi_i$ is the city at tour position $i$ in tour $\Pi$, and $W_{\Pi_i}$ is the current weight of the knapsack at city $\Pi_i$.
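The objective above can be evaluated for one tour and one packing plan roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `ttp_profit`, the data layout, the plain Euclidean distance, and the wrap-around from the last city back to the first are all assumptions made for illustration, and `rent` stands for the factor multiplying the travel time.

```python
from math import dist

def ttp_profit(tour, coords, items, picked, capacity, v_min, v_max, rent):
    """Return (sum of picked profits) - rent * (total travel time).

    tour:   list of city ids in visiting order (assumed to return to tour[0])
    coords: dict mapping city id -> (x, y)             (hypothetical layout)
    items:  list of (profit, weight, city) tuples      (hypothetical layout)
    picked: list of 0/1 flags, one per item
    """
    # Weight and profit collected at each city under the packing plan.
    weight_at = {city: 0.0 for city in tour}
    total_profit = 0.0
    for (profit, weight, city), flag in zip(items, picked):
        if flag:
            weight_at[city] += weight
            total_profit += profit

    slowdown = (v_max - v_min) / capacity  # speed lost per unit of weight
    load = 0.0
    travel_time = 0.0
    for pos, city in enumerate(tour):
        load += weight_at[city]  # current knapsack weight when leaving this city
        if load > capacity:
            raise ValueError("packing plan exceeds the knapsack capacity")
        nxt = tour[(pos + 1) % len(tour)]
        travel_time += dist(coords[city], coords[nxt]) / (v_max - slowdown * load)
    return total_profit - rent * travel_time
```

Because the speed depends on the weight already collected, the same set of items can yield different profits under different tours, which is what couples the two sub-problems.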
The Bi-Objective TTP
A natural extension: maximise the reward for a given weight of collected items, or determine the least weight subject to bounds imposed on the reward.
• Objective one: profit P as defined before
• Objective two: total accumulated weight
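With reward maximised and weight minimised, one solution dominates another when it is at least as good in both objectives and differs in at least one. The sketch below filters a set of (reward, weight) points down to the non-dominated ones; the names `dominates` and `non_dominated` and the tuple layout are illustrative assumptions, and the quadratic filter is chosen for readability rather than speed.

```python
def dominates(a, b):
    """True if point a = (reward, weight) dominates point b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def non_dominated(points):
    """Return the non-dominated points, sorted by increasing weight."""
    unique = set(points)
    front = [p for p in unique if not any(dominates(q, p) for q in unique)]
    return sorted(front, key=lambda p: p[1])

if __name__ == "__main__":
    candidates = [(0, 0), (5, 2), (4, 3), (8, 5), (8, 6)]
    print(non_dominated(candidates))
```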
Packing-While-Travelling (PWT) • …
[Figure: for tour π_1, packing plans ρ_1 -> (z_k, w_k) for k = 1..5; axes: Weight, Total Reward]
(The “natural” approach would be the following.)
[Figure: solutions (π_1, ρ_1), ..., (π_5, ρ_5) and points (z_1, w_1), ..., (z_5, w_5); axes: Weight, Total Reward]
Solving the Bi-Obj. TTP
• TSP solvers: CONCORDE (CON), ACO, LKH and LKH2
• Many single-objective TTP heuristics take a good TSP tour as a starting point. What does this mean here?
[Figure: Total Reward against Weight for eil76_n75_uncorr_01.ttp, inver over. Max reward: 4791.466; corresponding tour length: 586.]
[Figure: Total Reward plot with legend entries ACO, CON, INV, LKH and LKH2, each for Bounded01, Bounded06, SimilarWeights01, SimilarWeights06, Uncorrelated01 and Uncorrelated06]
Indicators
Def 3.2: Given q different DP fronts, let $\phi$ denote a set of possible unique solution points derived by $\tau_1, \dots, \tau_q$. Then $\omega$ is a Pareto front formed by the points of $\phi$, and $\omega$ is named the surface of $\phi$.
Given a tour $\pi$ with DP front $\tau_\pi$ and its corresponding solution set $T_\pi$:
• Surface Contribution: number of objective vectors contributed by $T_\pi$ to the surface
• Hypervolume: volume covered by $T_\pi$ w.r.t. (0, C)
• Loss of Contribution:
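The surface, the surface contribution and the hypervolume can be sketched for two-dimensional (reward, weight) points as below. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, C in the reference point (0, C) is taken to be the knapsack capacity, and the "loss" variants are left out because the slide does not spell out their formula.

```python
def surface(fronts):
    """Pareto front of the union of all fronts (max reward, min weight)."""
    union = {p for front in fronts for p in front}
    result, best = [], float("-inf")
    # Scan by increasing weight; keep a point only if it improves the reward.
    for reward, weight in sorted(union, key=lambda p: (p[1], -p[0])):
        if reward > best:
            result.append((reward, weight))
            best = reward
    return result

def surface_contribution(front, fronts):
    """Number of surface points that also belong to the given front."""
    on_surface = set(surface(fronts))
    return sum(1 for p in set(front) if p in on_surface)

def hypervolume(points, capacity):
    """Area dominated by the points relative to (reward=0, weight=capacity)."""
    area, best = 0.0, 0.0
    for reward, weight in sorted(points, key=lambda p: (p[1], -p[0])):
        if reward > best and weight < capacity:
            # New horizontal strip between the previous best reward and this one.
            area += (reward - best) * (capacity - weight)
            best = reward
    return area
```

A front can have a large hypervolume yet contribute few points to the surface, and vice versa, so the two indicators rank tours differently.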
Parent Selection Mechanisms
• Rank-Based Selection (RBS), Fitness-Proportionate Selection (FPS), Tournament Selection (TS), Arbitrary Selection (AS), Uniformly-at-Random Selection (UAR)
Crossover and Mutation Operators
• TSP-only: multi-point crossover, 2-opt mutation, jump
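Three of the selection mechanisms and the 2-opt mutation listed above can be sketched in a few lines. The sketch uses textbook definitions, which may differ in detail from the variants used in the study; it assumes that a higher indicator score makes a tour a better parent, that scores are non-negative, and that the first city of a tour stays fixed under mutation. All function names are hypothetical.

```python
import random

def uniform_selection(population, rng):
    """UAR: every individual is equally likely to be chosen."""
    return rng.choice(population)

def fitness_proportionate_selection(population, scores, rng):
    """FPS: selection probability proportional to the (non-negative) score."""
    if sum(scores) <= 0:
        return rng.choice(population)  # fall back to UAR if all scores are zero
    return rng.choices(population, weights=scores, k=1)[0]

def tournament_selection(population, scores, rng, size=2):
    """TS: sample `size` distinct individuals and return the best-scored one."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]

def two_opt_mutation(tour, rng):
    """Reverse a random segment of the tour, keeping tour[0] in place."""
    i, j = sorted(rng.sample(range(1, len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

if __name__ == "__main__":
    rng = random.Random(1)
    tours = [[0, 1, 2, 3, 4], [0, 2, 1, 4, 3], [0, 4, 3, 2, 1]]
    scores = [1.0, 3.0, 0.5]
    parent = tournament_selection(tours, scores, rng)
    print(two_opt_mutation(parent, rng))
```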
Experimental Study
• 2 indicators × 8 parent selection strategies
• TTP instances from the classes eil51, eil76, eil101; three knapsack types
Assessment
• 30 repetitions, Welch’s t-test with UAR as a baseline (like Student's t-test, but more reliable when the two samples have unequal variances and unequal sample sizes)
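For reference, the Welch statistic and its Welch-Satterthwaite degrees of freedom can be computed with the standard library alone. The sketch below is a generic illustration with a hypothetical function name, not the evaluation script used for the study; it assumes each sample has at least two values and that the samples are not both constant.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

if __name__ == "__main__":
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(t, df)
```

Turning the statistic into a p-value requires the Student t distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test.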
[Figure: bar charts for the indicators Loss of Hypervolume and Loss of Surface Contribution, each with panels for total reward and hypervolume. Bars are grouped by parent selection strategy (AS-BST, AS-EXT, FPS, RBS-EXP, RBS-HAR, RBS-IQ, TS) and knapsack type (Bounded, SimilarWeights, Uncorrelated). Note: bars are sums of log-scaled p-values.]
Comparison of bi-objective approaches with single-objective MA2B
MA2B by El Yafrani and Ahiod [GECCO’16]
Bi-objective configuration: Fitness-Proportionate Selection with Loss of Hypervolume and Loss of Surface Contribution
Summary
• Bi-Objective TTP: profit vs. weight
• Dynamic programming provides provably optimal trade-off fronts for a given tour
• Indicator-based EA with a population of tours: with “loss of surface contribution” and “loss of hypervolume”
• Best bi-objective approaches beat single-objective state-of-the-art