Online linear optimization and adaptive routing


  1. Online linear optimization and adaptive routing Baruch Awerbuch, Robert Kleinberg

  2. Motivation ● Overlay network routing – Send a packet from source to target using the route with minimum delay ● The total route delay is revealed ● [Graph example: a small network with source s, target r, and per-edge delays]

  3. Using previous algorithms ● We can use EXP3, treating each route as an arm. Since we have n! routes, our regret will be $O(\sqrt{K\, G_{\max} \ln K}) \to O(\sqrt{n!\, \ln n!})$ ● We have seen online shortest paths with full information: $E[\text{cost}] \le (1+\epsilon)\,\text{mincost}_T + O(mn\log n/\epsilon)$

  4. Problem definition ● G = (V, E) – a directed graph ● For each j = 1, …, T the adaptive adversary selects a cost for each edge, $c_j : E \to [0,1]$ ● The algorithm selects a path of length ≤ H ● It receives the cost of the entire path ● Goal: minimize the difference between the algorithm's expected total cost and the cost of the best single path from source to target (a sketch of this protocol follows below)
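A minimal sketch of this interaction protocol in Python; the adversary/algorithm interfaces and all names here are illustrative, not from the paper:

```python
def play(T, algorithm, adversary):
    """Bandit shortest-path protocol (slide 4): each round the adaptive
    adversary fixes edge costs c_j : E -> [0, 1], the learner picks a path
    of at most H edges, and only the total path cost is revealed."""
    total = 0.0
    for j in range(1, T + 1):
        c_j = adversary(j)                    # hidden per-edge costs
        path = algorithm.choose_path(j)       # list of edges, length <= H
        revealed = sum(c_j[e] for e in path)  # bandit feedback: one number
        algorithm.observe(path, revealed)
        total += revealed
    return total
```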

  5. Regret $O\!\big(H^2\, (mH \log\Delta\, \log(mHT))^{1/3}\, T^{2/3}\big)$

  6. Pre-processing ● We will transform the graph G into a leveled directed acyclic graph $\tilde{G} = (\tilde{V}, \tilde{E})$ ● Start by building G × {0, 1, …, H} – vertex set V × {0, 1, …, H} – an edge $e_i$ from (u, i−1) to (v, i) for every e = (u, v) in E ● The graph $\tilde{G}$ is then obtained by deleting paths that do not reach r (see the sketch below)
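A sketch of this pre-processing step under the stated construction; the function name and adjacency-list representation are my own choices, and I prune by keeping only vertices on some s-to-r path in the layered graph:

```python
from collections import defaultdict

def build_leveled_dag(edges, s, r, H):
    """Build the leveled DAG G~: vertex (v, i), edge ((u, i-1), (v, i)) for
    every e = (u, v) in E, then keep only the part lying on some path from
    (s, 0) to a copy of r (paths that do not reach r are deleted)."""
    out = defaultdict(list)
    for i in range(1, H + 1):
        for (u, v) in edges:
            out[(u, i - 1)].append((v, i))

    # Forward pass: vertices reachable from (s, 0).
    reach, stack = {(s, 0)}, [(s, 0)]
    while stack:
        x = stack.pop()
        for y in out[x]:
            if y not in reach:
                reach.add(y)
                stack.append(y)

    # Backward pass: vertices from which some reachable copy of r is reachable.
    rev = defaultdict(list)
    for x, ys in out.items():
        for y in ys:
            rev[y].append(x)
    stack = [(r, i) for i in range(H + 1) if (r, i) in reach]
    coreach = set(stack)
    while stack:
        y = stack.pop()
        for x in rev[y]:
            if x not in coreach:
                coreach.add(x)
                stack.append(x)

    keep = reach & coreach
    return {x: [y for y in ys if y in keep] for x, ys in out.items() if x in keep}
```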

  7. Main idea ● We can traverse the graph by querying BEX for probabilities on the outgoing edges until we reach r ● To do so we need to feed BEX with information on all experts ● We will run in phases; in each phase we estimate the cost of every expert, and at the end of each phase we update BEX ● We will feed BEX the total path cost (a possible instantiation of BEX is sketched below)
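The deck treats BEX as a black-box best-expert algorithm; here is a minimal sketch assuming a multiplicative-weights (Hedge-style) instantiation over one vertex's outgoing edges — the class name and interface are illustrative:

```python
import math
import random

class BEX:
    """Best-expert algorithm over the outgoing edges of one vertex.
    A Hedge-style instantiation; the slides leave BEX abstract."""

    def __init__(self, edges, epsilon):
        self.edges = list(edges)           # experts = outgoing edges
        self.epsilon = epsilon
        self.weights = {e: 1.0 for e in self.edges}

    def distribution(self):
        total = sum(self.weights.values())
        return {e: w / total for e, w in self.weights.items()}

    def sample(self):
        p = self.distribution()
        return random.choices(self.edges, weights=[p[e] for e in self.edges])[0]

    def update(self, costs):
        # costs: estimated phase cost per edge (the c~_phi of slide 14)
        for e, c in costs.items():
            self.weights[e] *= math.exp(-self.epsilon * c)
```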

  8. Sampling experts ● We can sample the experts according to the distribution BEX returns (based on the costs of previous phases) ● The problem – we might ignore some edges that could become better in later phases ● We will therefore add some exploration steps in each phase

  9. Exploration ● Occurs with probability δ ● Choose an edge e = (u, v) uniformly at random ● Construct a path by joining prefix(u), e, and suffix(v) (sketched below)
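Putting slides 8–9 together, one round's path choice might look like the sketch below; `prefix` and `suffix` stand for the samplers defined on the following slides, passed in here as single-argument closures (my framing, not the paper's):

```python
import random

def choose_path(edges, s, delta, prefix, suffix):
    """One routing round: with probability delta explore a uniformly random
    edge e = (u, v), building prefix(u) + [e] + suffix(v); otherwise exploit
    by sampling a whole s -> r path as suffix(s)."""
    if random.random() < delta:
        u, v = random.choice(edges)
        return prefix(u) + [(u, v)] + suffix(v)
    return suffix(s)
```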

  10. Suffix ● suffix(v) returns the distribution on v – r paths ● Implementation – choose an edge according to the BEX probabilities, traverse it, and repeat until r is reached (a sketch follows below) ● Why can't the choice be uniformly random? [Graph example with edge costs, e.g. a cost-1000 edge next to cost-1 edges, showing that uniform random edge choices can be forced onto expensive routes]
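A sketch of the suffix sampler as described on this slide; `bex_at`, mapping each vertex of the leveled DAG to its BEX instance, is an illustrative name:

```python
def suffix(v, r, bex_at):
    """Sample a v -> r path by repeatedly drawing an outgoing edge from the
    BEX distribution at the current vertex (slide 10). Terminates within H
    steps on the leveled DAG, since every edge advances one level."""
    path, cur = [], v
    while cur != r:
        e = bex_at[cur].sample()   # e = (cur, next_vertex)
        path.append(e)
        cur = e[1]
    return path
```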

  11. Prefix ● prefix(v) returns the distribution on s – v paths ● Let suffix(u | v) be the distribution on u – v paths ● It is obtained by sampling from suffix(u) conditioned on the event that the path passes through v.

  12. Prefix ● Sample from suffix(s | v) with probability $(1-\delta)\,\Pr(v \in \text{suffix}(s)) / P_\phi(v)$ ● For each $e = (q, u)$ from $\tilde{E}$, with probability $(\delta/\tilde{m})\,\Pr(v \in \text{suffix}(u)) / P_\phi(v)$: sample from suffix(u | v), prepend e, and then prepend a sample from prefix(q) ● Here $P_\phi(v)$ is the probability that v is contained in the suffix of a path in phase φ (a sketch follows below)
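A sketch of this mixture sampler, reusing the `suffix` sampler above; `p_in_suffix(u, v)` supplying $\Pr(v \in \text{suffix}(u))$, the rejection-sampling implementation of suffix(u | v), and all helper names are my own assumptions:

```python
import random

def up_to(path, v, start):
    """Edges of `path` (which starts at `start`) up to the first arrival in v."""
    if start == v:
        return []
    out = []
    for e in path:
        out.append(e)
        if e[1] == v:
            break
    return out

def suffix_given(u, v, r, bex_at):
    """suffix(u | v): suffix(u) conditioned on passing through v, sampled by
    rejection (slide 11). Assumes Pr(v in suffix(u)) > 0."""
    while True:
        path = suffix(u, r, bex_at)
        if u == v or any(e[1] == v for e in path):
            return path

def prefix(v, s, r, edges, delta, bex_at, p_in_suffix):
    """Sample an s -> v path by the mixture of slide 12. Dividing every
    weight by P_phi(v) is implicit: random.choices normalizes its weights."""
    if v == s:
        return []
    m = len(edges)
    options = [("exploit", (1 - delta) * p_in_suffix(s, v))]
    options += [(e, (delta / m) * p_in_suffix(e[1], v)) for e in edges]
    pick = random.choices([o for o, _ in options],
                          weights=[w for _, w in options])[0]
    if pick == "exploit":
        return up_to(suffix_given(s, v, r, bex_at), v, s)
    q, u = pick
    return (prefix(q, s, r, edges, delta, bex_at, p_in_suffix)
            + [pick] + up_to(suffix_given(u, v, r, bex_at), v, u))
```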

  13. Updating costs ● Phase length $\tau = \lceil 2mH\log(mHT)/\delta \rceil$ ● In each phase we sum the costs for an edge only on rounds where the edge was not part of the path chosen by prefix ● The reason is that we cannot control the probability with which those prefix edges were chosen

  14. Updating costs ● At the end of each phase, for all $e \in \tilde{E}$:
      $\mu_\phi(e) \leftarrow E\big[\textstyle\sum_{j \in \tau_\phi} \chi_j(e)\big]$
      $\tilde{c}_\phi(e) \leftarrow \big(\textstyle\sum_{j \in \tau_\phi} \chi_j(e)\, c_j(\pi_j)\big) / \mu_\phi(e)$
      where $\phi = 1, \ldots, \lceil T/\tau \rceil$ and $j = \tau(\phi-1)+1,\ \tau(\phi-1)+2,\ \ldots,\ \tau\phi$ (a sketch of this update follows below)
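A sketch of this end-of-phase update, feeding the estimates into the per-vertex BEX instances from the earlier sketch; the log format and `mu` argument are my own framing (the slide defines $\mu_\phi(e)$ as an exact expectation; an empirical count would be a practical stand-in):

```python
def phase_update(bex_at, phase_log, mu):
    """End-of-phase update (slides 13-14). phase_log holds, per round j of
    the phase, a pair (counted_edges, path_cost): the edges e with
    chi_j(e) = 1 (on the path and not chosen by prefix) and the revealed
    total cost c_j(pi_j). mu[e] plays the role of mu_phi(e)."""
    sums = {}
    for counted_edges, path_cost in phase_log:
        for e in counted_edges:
            sums[e] = sums.get(e, 0.0) + path_cost
    # c~_phi(e) = sum_j chi_j(e) c_j(pi_j) / mu_phi(e), fed to BEX at e's tail.
    by_vertex = {}
    for e, total in sums.items():
        by_vertex.setdefault(e[0], {})[e] = total / mu[e]
    for v, costs in by_vertex.items():
        bex_at[v].update(costs)
```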

  15. Algorithm analysis ● Let
      $C^-(v) = \sum_{j=1}^{T} E[c_j(\text{prefix}(v))]$
      $C^+(v) = \sum_{j=1}^{T} E[c_j(\text{suffix}(v))]$
      $OPT(v) = \min_{\text{paths } \pi : v \to r} \sum_{j=1}^{T} c_j(\pi)$

  16. Algorithm analysis ● We know that for BEX, against any fixed expert k with per-step costs bounded by M,
      $\sum_{j=1}^{t} \sum_{i=1}^{K} p_j(i)\, c_j(i) \le \sum_{j=1}^{t} c_j(k) + O(\epsilon M t + \log K/\epsilon)$
      ● Let $p_\phi$ be the probability distribution supplied by BEX(v) during phase φ; then
      $\sum_{\phi=1}^{t} \sum_{e \in \Delta(v)} p_\phi(e)\, \tilde{c}_\phi(e) \le \sum_{\phi=1}^{t} \tilde{c}_\phi(e_0) + O(\epsilon H t + H\log\Delta/\epsilon)$

  17. Algorithm analysis ● We used the fact that the cost M of a phase is smaller than 3H with high probability. By a Chernoff bound, with
      $\tau = 2mH\log(mHT)/\delta, \qquad \mu_\phi > \delta\tau/(mH) = 2\log(mHT),$
      $\Pr\big(\sum_{j \in \tau_\phi} \chi_j \ge 3 \cdot 2\log(mHT)\big) \le e^{-\frac{2}{3} \cdot 2\log(mHT)} \le \frac{1}{mHT}$

  18. Algorithm analysis ● Now, applying a union bound over all phases, this low-probability event contributes at most HT/(mHT) < 1 to the total cost, so we can ignore it

  19. Algorithm analysis ● Expanding $\tilde{c}_\phi$ (Eq. 12):
      $\sum_{\phi=1}^{t} \sum_{j \in \tau_\phi} \sum_{e \in \Delta(v)} p_\phi(e)\, \chi_j(e)\, c_j(\pi_j)/\mu_\phi(e) \le \sum_{\phi=1}^{t} \sum_{j \in \tau_\phi} \chi_j(e_0)\, c_j(\pi_j)/\mu_\phi(e_0) + O\big(\epsilon H t + \tfrac{H\log\Delta}{\epsilon}\big)$

  20. Algorithm analysis ● Claim 3.2. For every path $\pi : s \to v$,
      $\Pr(\pi \subseteq \pi_j \mid \chi_j(e) = 1) = \Pr(\text{prefix}(v) = \pi)$

  21. Algorithm analysis ● Proof of claim 3.2:
      $\chi_j(e) = 1 \;\Rightarrow\; e \in \pi_j^0 \,\vee\, e \in \pi_j^+$
      $\Pr(\pi \subseteq \pi_j \mid e \in \pi_j^0) = \Pr(\text{prefix}(v) = \pi)$
      $\Pr(\pi \subseteq \pi_j \mid e \in \pi_j^+) = \Pr(\text{prefix}(v) = \pi)$
      ● The first equality holds by definition; let's prove the second

  22. Algorithm analysis ● e is sampled independently of the path preceding v, so
      $\Pr(\pi \subseteq \pi_j \mid e \in \pi_j^+) = \Pr(\pi \subseteq \pi_j \mid v \in \pi_j^+)$
      $\Pr(\pi \subseteq \pi_j \mid v \in \pi_j^+) = \dfrac{\Pr(\pi \subseteq \pi_j \,\cap\, v \in \pi_j^+)}{\Pr(v \in \pi_j^+)}$
      where the numerator equals
      $(1-\delta)\,\Pr(v \in \text{suffix}(s))\,\Pr(\pi = \text{suffix}(s \mid v)) + \dfrac{\delta}{\tilde{m}} \sum_{e = (q,u) \in \tilde{E}} \Pr(v \in \text{suffix}(u))\, \Pr(\pi = \text{prefix}(q) \cup \{e\} \cup \text{suffix}(u \mid v))$
      $= \Pr(v \in \pi_j^+)\, \Pr(\pi = \text{prefix}(v))$

  23. Algorithm analysis ● Claim 3.3. If e = (v, w) then
      $E[\chi_j(e)\, c_j(\pi_j)] = (\mu(e)/\tau)\,\big(A_j(v) + B_j(w) + c_j(e)\big)$
      where $A_j(v) = E[c_j(\text{prefix}(v))]$ and $B_j(w) = E[c_j(\text{suffix}(w))]$
      ● This follows from claim 3.2, since the portion of the path preceding e is distributed as prefix(v)

  24. Algorithm analysis ● Taking the expectation of Eq. 12, the left side will become
      $\sum_{\phi=1}^{t} \sum_{j \in \tau_\phi} \sum_{e \in \Delta(v)} \tfrac{1}{\tau}\, p_\phi(e)\big(A_j(v) + B_j(w) + c_j(e)\big) = \tfrac{1}{\tau} \sum_{j=1}^{T} \sum_{e \in \Delta(v)} p_\phi(e)\big(A_j(v) + B_j(w) + c_j(e)\big)$
      ● The right side will become
      $\tfrac{1}{\tau} \sum_{j=1}^{T} \big(A_j(v) + B_j(w_0) + c_j(e_0)\big)$

  25. Algorithm analysis ● After removing $A_j(v)$ from both sides and noticing that
      $\sum_{e \in \Delta(v)} p_\phi(e)\,\big(B_j(w) + c_j(e)\big) = E[c_j(\text{suffix}(v))]$
      ● the left side will become
      $\tfrac{1}{\tau} \sum_{j=1}^{T} E[c_j(\text{suffix}(v))] = C^+(v)/\tau$

  26. Algorithm analysis ● The right side will become
      $C^+(w_0)/\tau + \tfrac{1}{\tau}\sum_{j=1}^{T} c_j(e_0) + O\big(\epsilon H t + \tfrac{H\log\Delta}{\epsilon}\big)$
      ● Thus we have derived the local performance guarantee (Eq. 13):
      $C^+(v) \le C^+(w_0) + \sum_{j=1}^{T} c_j(e_0) + O\big(\epsilon H T + \tau \tfrac{H\log\Delta}{\epsilon}\big)$

  27. Global performance guarantee ● Claim 3.4.
      $C^+(v) \le OPT(v) + h(v)\, O\big(\epsilon H T + \tau \tfrac{H\log\Delta}{\epsilon}\big)$
      ● To prove it we can use the following observation:
      $OPT(v) = \min_{e_0 = (v, w_0)} \Big\{ \sum_{j=1}^{T} c_j(e_0) + OPT(w_0) \Big\}$

  28. Global performance guarantee ● Proof – by induction on h(v), using the local performance guarantee ● Let us write $F = O\big(\epsilon H T + \tau \tfrac{H\log\Delta}{\epsilon}\big)$ ● Now rewrite the claim and Eq. 13:
      $C^+(v) \le OPT(v) + h(v)\, F$
      $C^+(v) \le C^+(w_0) + \sum_{j=1}^{T} c_j(e_0) + F$

  29. Global performance guarantee ● h(v) = 1: we must show
      $C^+(v) \le OPT(v) + F = \sum_{j=1}^{T} c_j(e_0) + OPT(r) + F \quad \forall e_0 = (v, r)$
      i.e. (since $OPT(r) = 0$)
      $C^+(v) \le \sum_{j=1}^{T} c_j(e_0) + F \quad \forall e_0 = (v, r)$
      ● This is true by the local performance guarantee

  30. Global performance guarantee ● h(v) = k + 1:
      $C^+(v) \le C^+(v_k) + \sum_{j=1}^{T} c_j(e_{k+1}) + F \le \sum_{j=1}^{T} c_j(e_{k+1}) + OPT(v_k) + kF + F = OPT(v_{k+1}) + (k+1)F$

  31. Regret ● Theorem 3.5. The algorithm suffers regret
      $O\big(H^2\, (mH \log\Delta\, \log(mHT))^{1/3}\, T^{2/3}\big)$
      ● The exploration steps contribute δTH ● The exploitation contributes $C^+(s) - OPT(s)$ ● Also $\tau = 2mH\log(mHT)/\delta$ ● Substituting in claim 3.4 we get total exploitation cost
      $C^+(s) - OPT(s) = O\big(\epsilon H^2 T + \tfrac{2mH^3 \log\Delta\, \log(mHT)}{\epsilon\delta}\big)$

  32. Regret
      $\text{Regret} \le O\big(\delta T H + \epsilon H^2 T + \tfrac{2mH^3 \log\Delta\, \log(mHT)}{\epsilon\delta}\big)$
      ● We can assign
      $\epsilon = \delta = (2mH \log\Delta\, \log(mHT))^{1/3}\, T^{-1/3}$
      and we get the desired regret
      $O\big(H^2\, (mH \log\Delta\, \log(mHT))^{1/3}\, T^{2/3}\big)$
      (a quick balancing check follows below)
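For completeness, a quick check that this choice balances the bound, with the H-factors as reconstructed above (constants treated loosely, as the slides do):

```latex
\text{With } \epsilon = \delta = x:\qquad
\delta T H + \epsilon H^2 T + \frac{2mH^3 \log\Delta \log(mHT)}{\epsilon\delta}
\;\le\; 2H^2 T x + \frac{2mH^3 \log\Delta \log(mHT)}{x^2}.
```

Setting $x = (2mH\log\Delta\log(mHT))^{1/3}\, T^{-1/3}$ makes both terms equal to $H^2\,(2mH\log\Delta\log(mHT))^{1/3}\, T^{2/3}$, which matches Theorem 3.5 up to constants.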
