Renewal Approximation in the Online Increasing Subsequence Problem Alexander Gnedin, Amirlan Seksenbayev (Queen Mary, University of London)
Ulam's problem

Example: 3 1 6 7 2 5 4

Ulam 1961: What is the expected length of the longest increasing subsequence of a random permutation of $t$ integers?

Instead of a permutation, one can consider i.i.d. random marks $X_1, \dots, X_t$ sampled from a continuous distribution (let it be uniform on $[0,1]$).
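For a concrete feel of the quantity in Ulam's question, the length of the longest increasing subsequence can be computed by patience sorting. The sketch below is only an illustration; the function name `lis_length` is a hypothetical helper, not something from the talk, and it is applied to the permutation shown on the slide.

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    # tails[k] holds the smallest possible last element of an
    # increasing subsequence of length k + 1 seen so far.
    tails = []
    for x in seq:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)      # x extends the longest subsequence found so far
        else:
            tails[pos] = x       # x gives a smaller tail for length pos + 1
    return len(tails)

example = [3, 1, 6, 7, 2, 5, 4]
print(lis_length(example))  # 3, e.g. the subsequence 1, 6, 7
```

The same routine applies to i.i.d. uniform marks, since only the relative order of the marks matters.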
Hammersley 1972: Suppose the marks arrive by a Poisson process on $[0,t]$, so the sample size is random with the Poisson($t$) distribution. An increasing subsequence $(x_1, s_1), \dots, (x_k, s_k)$ of marks/arrival times is a chain in two dimensions: $x_1 < \dots < x_k$, $s_1 < \dots < s_k$.
Hammersley: since only the area of the rectangle matters, the maximum length $M(t)$ satisfies
$$M((a+b)^2) \ge M(a^2) + M(b^2),$$
hence by subadditivity $M(t) \sim c\sqrt{t}$, $t \to \infty$ (in probability and in the mean).
Logan and Shepp 1977, Vershik and Kerov 1977: $\mathbb{E} M(t) \sim 2\sqrt{t}$.

Baik, Deift and Johansson 1999:
$$\frac{M(t) - 2\sqrt{t}}{t^{1/6}} \xrightarrow{d} \text{Tracy-Widom distribution}.$$

D. Romik 2014: The Surprising Mathematics of Longest Increasing Subsequences.
The online selection problem

Samuels and Steele 1981: The marks are revealed to the observer one by one as they arrive. Each time a mark is observed, it can be selected or rejected, with the decision becoming immediately final. What is the maximum expected length, $v(t)$, of an increasing subsequence that can be selected by a nonanticipating online strategy?
Subadditivity yields $v(t) \sim c\sqrt{t}$ but gives no clue about the constant (of course, $c \le 2$). A selection strategy can be identified with a sequence of stopping times embedded in the Poisson process, such that the corresponding marks increase. For instance, the greedy strategy, selecting every consecutive record (i.e. a mark bigger than all seen so far), yields a sequence of expected length
$$\int_0^t \frac{1 - e^{-s}}{s}\,ds \sim \log t, \qquad t \to \infty.$$
This is too far from optimality!
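To see how slowly the greedy strategy grows, the integral for its expected length can be evaluated numerically. The sketch below uses a plain midpoint rule; the function name and the step count are illustrative choices. It relies on the standard identity that the integral equals $\log t + \gamma + E_1(t)$, with $\gamma$ the Euler constant and $E_1$ the exponential integral, so for large $t$ it is essentially $\log t + \gamma$.

```python
import math

EULER_GAMMA = 0.5772156649015329

def expected_greedy_length(t, steps=100_000):
    """Midpoint-rule approximation of the integral of (1 - exp(-s)) / s over [0, t]."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h                 # midpoint of the k-th subinterval
        total += (1.0 - math.exp(-s)) / s
    return total * h

for t in (10, 100, 1000):
    greedy = expected_greedy_length(t)
    # Columns: t, greedy expected length, log t + gamma, and sqrt(2t) for comparison
    print(t, round(greedy, 4), round(math.log(t) + EULER_GAMMA, 4), round(math.sqrt(2 * t), 4))
```

The last column is the order of magnitude attainable by good online strategies, which makes the logarithmic growth of the greedy rule visible.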
The principal asymptotics

Samuels and Steele 1981:
$$v(t) \sim \sqrt{2t}, \qquad t \to \infty,$$
achieved by the strategy with constant acceptance window
$$0 < x - y < \sqrt{2/t},$$
where $(x,s) \in [0,1] \times [0,t]$ is the current arrival, and $y$ is the last mark selected before time $s$.

Comparing with the offline asymptotics $2\sqrt{t}$ in Ulam's problem, the factor $\sqrt{2}$ quantifies the advantage of a prophet over a nonclairvoyant decision maker selecting in real time.
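The constant-window strategy is easy to simulate. The following is a minimal Monte Carlo sketch, assuming unit-rate Poisson arrivals on $[0,t]$ with uniform marks and starting from $y = 0$; the function names, the number of runs and the seed are illustrative assumptions.

```python
import math
import random

def constant_window_length(t, rng):
    """One run: number of marks selected with the window 0 < x - y < sqrt(2/t)."""
    window = math.sqrt(2.0 / t)
    y = 0.0        # last selected mark
    s = 0.0        # current time
    count = 0
    while True:
        s += rng.expovariate(1.0)      # next arrival of the unit-rate Poisson process
        if s > t:
            return count
        x = rng.random()               # uniform mark of the arrival
        if 0.0 < x - y < window:
            y = x
            count += 1

def mean_constant_window_length(t, runs=2000, seed=0):
    """Average selected length over independent runs."""
    rng = random.Random(seed)
    return sum(constant_window_length(t, rng) for _ in range(runs)) / runs

t = 200.0
print(mean_constant_window_length(t), math.sqrt(2 * t))
```

For moderate $t$ the simulated mean sits somewhat below $\sqrt{2t}$, since the first-order asymptotics says nothing about lower-order terms.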
The optimality equation

The maximal expected length satisfies the dynamic programming equation
$$v'(t) = \int_0^1 \big(v(t(1-x)) + 1 - v(t)\big)_+\,dx, \qquad v(0) = 0.$$
Under the optimal strategy, $(x,s)$ is accepted iff
$$0 < \frac{x-y}{1-y} < \varphi^*\big((t-s)(1-y)\big),$$
where $y$ is the last selection and $\varphi^*(t)$ is the solution $x$ to $v(t(1-x)) + 1 - v(t) = 0$ (for $t > v^{\leftarrow}(1) = 1.345\ldots$, where $v^{\leftarrow}$ is the inverse function of $v$). A strategy of this kind, with some control function $\varphi$ defining a variable acceptance window, will be called self-similar.
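The optimality equation can be explored numerically. The sketch below is one possible discretisation, not the authors' method: substituting $s = t(1-x)$ turns the right-hand side into $\frac{1}{t}\int_0^t (v(s) + 1 - v(t))_+\,ds$, which is evaluated by the trapezoid rule inside a forward Euler scheme. The function name and the step size are assumptions made for illustration.

```python
import math

def solve_value_function(t_max, h=0.005):
    """Forward Euler for v'(t) = (1/t) * integral_0^t (v(s) + 1 - v(t))_+ ds, v(0) = 0."""
    n = int(round(t_max / h))
    ts = [k * h for k in range(n + 1)]
    v = [0.0] * (n + 1)
    for k in range(n):
        if k == 0:
            slope = 1.0   # at t = 0 the integrand equals 1, so v'(0) = 1
        else:
            vk = v[k]
            vals = [max(v[j] + 1.0 - vk, 0.0) for j in range(k + 1)]
            # trapezoid rule on the grid ts[0..k]
            integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            slope = integral / ts[k]
        v[k + 1] = v[k] + h * slope
    return ts, v

ts, v = solve_value_function(8.0)
for t in (1.345, 2.0, 4.0, 8.0):
    k = round(t / 0.005)
    # Columns: t, numerical v(t), upper bound sqrt(2t)
    print(t, round(v[k], 4), round(math.sqrt(2 * t), 4))
```

The value at $t = 1.345$ should come out close to 1, in line with $v^{\leftarrow}(1) = 1.345\ldots$ quoted above; a finer step would be needed for more digits.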
The tightest known bounds

$$\sqrt{2t} - \log(1 + \sqrt{2t}) + c_0 < v(t) < \sqrt{2t}$$

The upper bound: Baryshnikov and Gnedin 2000, by comparing with the bin-packing problem: maximise the expected number of choices subject to the (mean-value!) constraint that the expectation of the sum of selected marks is at most 1. In this problem the strategy choosing every $x < \sqrt{2/t}$ is exactly optimal.

The lower bound: Bruss and Delbaen 2001, using concavity of $v(t)$ and the optimality equation.
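The value of the relaxed bin-packing problem can be checked in two lines. This is a sketch assuming $t \ge 2$, so that the threshold $\sqrt{2/t}$ does not exceed 1; marks arrive at unit rate on $[0,t]$ and are uniform on $[0,1]$.

```latex
\begin{align*}
\mathbb{E}\,\#\{\text{selected marks}\}
  &= t \cdot \mathbb{P}\big(X < \sqrt{2/t}\big) = t\sqrt{2/t} = \sqrt{2t},\\
\mathbb{E}\sum \{\text{selected marks}\}
  &= t \int_0^{\sqrt{2/t}} x\,dx = t \cdot \frac{1}{2}\cdot\frac{2}{t} = 1.
\end{align*}
```

So the threshold rule meets the mean-value constraint with equality and selects $\sqrt{2t}$ marks on average, which is the upper bound for $v(t)$.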
The asymptotic expansion

Let $L_\varphi(t)$ be the length of the selected subsequence under the strategy with control function $\varphi$; in particular, $v(t) = \mathbb{E} L_{\varphi^*}(t)$.

Theorem. The expected length under the optimal strategy is
$$v(t) = \sqrt{2t} - \frac{1}{12}\log t + c^* + \frac{\sqrt{2}}{144\sqrt{t}} + O(t^{-1}),$$
and the variance is
$$\mathrm{Var}(L_{\varphi^*}(t)) = \frac{\sqrt{2t}}{3} + \frac{1}{72}\log t + c_1 + O(t^{-1/2}\log t).$$
The optimal strategy is self-similar with
$$\varphi^*(t) = \sqrt{\frac{2}{t}} - \frac{1}{3t} + O(t^{-3/2}).$$
The constants $c^*$, $c_1$ are unknown.
Theorem. For every self-similar selection strategy with
$$\varphi(t) = \sqrt{\frac{2}{t}} + O(t^{-1})$$
the expected length of the increasing subsequence is within $O(1)$ of the optimum, and the CLT holds:
$$\frac{\sqrt{3}\,\big(L_\varphi(t) - \sqrt{2t}\big)}{(2t)^{1/4}} \xrightarrow{d} N(0,1).$$

Bruss and Delbaen 2004, Arlotto et al 2015 proved the CLT for the optimal strategy using concavity of $v(t)$ and martingale methods. Our approach relies on a renewal approximation to the 'remaining area process'.
Linearisation

With $z = \sqrt{t}$ as the size parameter and a change of variables, the equation for the expected length under a self-similar strategy becomes
$$u'(z) = 4\int_0^z \big(u(z-y) + 1 - u(z)\big)_+ (1 - y/z)\,dy.$$
This is a special case of the renewal-type equation
$$u'_{\theta,r}(z) = 4\int_0^{\theta(z)} \big(u_{\theta,r}(z-y) + r(z) - u_{\theta,r}(z)\big)(1 - y/z)\,dy$$
with given reward function $r(z)$ and control function $0 < \theta(z) \le z$ related to a self-similar strategy via
$$\varphi(z^2) = 1 - \left(1 - \frac{\theta(z)}{z}\right)^2.$$
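The change of variables can be verified directly. The following is a sketch, assuming $u(z) := v(z^2)$ and writing $\xi$ for the relative mark increment, so that a selection turns size $z$ into $z\sqrt{1-\xi}$.

```latex
\begin{align*}
u'(z) &= 2z\,v'(z^2)
       = 2z\int_0^1 \big(u(z\sqrt{1-\xi}) + 1 - u(z)\big)_+\,d\xi .\\
\intertext{Substituting $z - y = z\sqrt{1-\xi}$, i.e. $1-\xi = (1-y/z)^2$ and
$d\xi = \tfrac{2}{z}(1-y/z)\,dy$, with $y$ running over $[0,z]$:}
u'(z) &= 4\int_0^z \big(u(z-y) + 1 - u(z)\big)_+ (1-y/z)\,dy .\\
\intertext{The acceptance window $\xi < \varphi(z^2)$ corresponds to $y < \theta(z)$, where}
z - \theta(z) &= z\sqrt{1-\varphi(z^2)}
\quad\Longleftrightarrow\quad
\varphi(z^2) = 1 - \Big(1 - \frac{\theta(z)}{z}\Big)^2 .
\end{align*}
```

In the new variable a selection decreases the size by $y$ additively, which is what makes the equation of renewal type.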
The admissible rectangle

Change of variables: the last selection so far $(y,s)$ $\mapsto$ $z = \sqrt{(t-s)(1-y)}$.
Piecewise deterministic Markov process

For a given control function $0 < \theta(z) \le z$, a PDMP $Z$ on $[0,\infty)$ is defined by the following rules:
(i) $Z$ decreases with unit speed until absorption at 0;
(ii) $Z$ jumps at probability rate $4\lambda(z)$, where $\lambda(z) := \theta(z) - \dfrac{\theta^2(z)}{2z}$;
(iii) a jump takes $Z$ from $z$ to $z - y$, with $y$ having density $(1 - y/z)/\lambda(z)$ for $y \in [0, \theta(z)]$.

The number of jumps $N_\theta(z)$ of $Z$ starting from $z = \sqrt{t}$ is equal to $L_\varphi(t)$, the length of the increasing subsequence under a self-similar strategy.
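The three rules above translate into a short simulation. The sketch below is an illustration under stated assumptions: it uses the constant control $\theta(z) = \min(1/\sqrt{2}, z)$ as a stand-in rather than the optimal $\theta^*$, generates jump epochs by thinning a Poisson stream of rate $4/\sqrt{2}$ (valid because $\lambda(z) \le \theta(z)$), and samples the jump size by inverting the distribution function $(y - y^2/(2z))/\lambda(z)$. All function names are hypothetical.

```python
import math
import random

def count_jumps(z0, theta_const, rng):
    """Number of jumps of the PDMP started at z0 with control theta(z) = min(theta_const, z)."""
    z = z0
    jumps = 0
    while True:
        # Drift down at unit speed until the next candidate epoch (rate 4 * theta_const).
        z -= rng.expovariate(4.0 * theta_const)
        if z <= 0.0:
            return jumps                       # absorbed at 0
        theta = min(theta_const, z)
        lam = theta - theta * theta / (2.0 * z)
        # Thinning: keep the candidate with probability lambda(z) / theta_const.
        if rng.random() < lam / theta_const:
            u = rng.random()
            # Solve y - y^2 / (2z) = u * lambda(z) for y in [0, theta].
            y = z * (1.0 - math.sqrt(max(0.0, 1.0 - 2.0 * u * lam / z)))
            z -= y
            jumps += 1

def mean_jumps(z0, runs=300, seed=1):
    rng = random.Random(seed)
    theta_const = 1.0 / math.sqrt(2.0)
    return sum(count_jumps(z0, theta_const, rng) for _ in range(runs)) / runs

z0 = 50.0
print(mean_jumps(z0), math.sqrt(2.0) * z0)
```

With $z_0 = \sqrt{t}$ the simulated mean can be compared with $\sqrt{2}\,z_0 = \sqrt{2t}$, the leading term for strategies of this type.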
Asymptotic version of de Bruijn's method for DE's

The operator
$$I_{\theta,r}\,g(z) := 4\int_0^{\theta(z)} \big(g(z-y) + r(z) - g(z)\big)_+ (1 - y/z)\,dy$$
has shift and monotonicity properties that imply

Lemma. If for large enough $z$,
(a) $g'(z) > I_{\theta,r}\,g(z)$, then $\limsup_{z\to\infty} (u_{\theta,r}(z) - g(z)) < \infty$;
(b) $g'(z) < I_{\theta,r}\,g(z)$, then $\liminf_{z\to\infty} (u_{\theta,r}(z) - g(z)) > -\infty$.

Example. For $g(z) = \alpha z$ in the optimality equation, (a) holds for $\alpha > \sqrt{2}$ and (b) holds for $\alpha < \sqrt{2}$, whence $u(z) \sim \sqrt{2}\,z$. Iterating twice,
$$u(z) = \sqrt{2}\,z - \frac{1}{6}\log z + O(1), \qquad z \to \infty.$$
But the method does not capture the $O(1)$-remainder.
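The example with a linear test function can be worked out explicitly. This is a sketch assuming $r \equiv 1$ and $\theta(z) = z$ (the optimality equation), for $z > 1/\alpha$.

```latex
\begin{align*}
I\,g(z) &= 4\int_0^{z} (1 - \alpha y)_+ (1 - y/z)\,dy
         = 4\int_0^{1/\alpha} (1 - \alpha y)(1 - y/z)\,dy \\
        &= 4\left(\frac{1}{2\alpha} - \frac{1}{6\alpha^2 z}\right)
         = \frac{2}{\alpha} - \frac{2}{3\alpha^2 z},
\qquad g'(z) = \alpha .
\end{align*}
```

Hence $g' > I g$ for large $z$ when $\alpha > 2/\alpha$, that is $\alpha > \sqrt{2}$, and $g' < I g$ for large $z$ when $\alpha < \sqrt{2}$, which is how the constant $\sqrt{2}$ emerges.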
Let $U(z_0, dz)$ be the occupation measure on $[0, z_0]$ for the sequence of jump points of $Z$ starting from $z_0$ and controlled by the optimal $\theta^*(z)$. The density is
$$U(z_0, dz) = 4\lambda(z)\,p(z_0, z)\,dz,$$
where $p(z_0, z)$ is the probability that $z$ is a drift point.

Lemma. There exists a pointwise limit
$$p(z) := \lim_{z_0 \to \infty} p(z_0, z)$$
such that $\lim_{z\to\infty} p(z) = 1/2$ and, for some $a, b > 0$,
$$|p(z_0, z) - p(z)| < a\,e^{-b(z_0 - z)}, \qquad 0 < z < z_0.$$

The proof is by coupling: two independent $Z$-processes starting from $z_1$ and $z_2$ (where $z_1 < z_2$) with high probability visit the same drift point close to $z_1$.
The 'mean reward' for $Z$ starting from $z > 0$ has the representation
$$u_{\theta^*,r}(z) = \int_0^z r(y)\,U(z, dy).$$

Corollary. For integrable $r(z)$,
$$u_{\theta^*,r}(z) \to 4\int_0^\infty r(y)\,\lambda(y)\,p(y)\,dy, \qquad z \to \infty.$$
If $r(z) = O(z^{-\beta})$ with $\beta > 1$, then the convergence rate is $O(z^{-\beta+1})$.

This allows us to obtain the asymptotic expansions of the moments of $N_\theta(z)$ and of the length of the selected sequence $L_\varphi(t)$ under self-similar strategies. In particular, the second moment $w(z) = \mathbb{E}\,(N_{\theta^*}(z))^2$ satisfies
$$w'(z) = 4\int_0^{\theta^*(z)} \big(w(z-y) - w(z) + 1 + 2u(z-y)\big)(1 - y/z)\,dy, \qquad w(0) = 0.$$
A renewal approximation to Z

The range of $Z$ is an alternating sequence of drift intervals and gaps skipped by jumps. Let $D_z$ be the size of a generic drift interval and $J_z$ that of a jump. From
$$\theta^*(z) = \frac{1}{\sqrt{2}} + \frac{1}{12 z} + O(z^{-2})$$
it follows that, as $z \to \infty$, $4\lambda(z) \to 2\sqrt{2}$ and
$$D_z \xrightarrow{d} \frac{E}{2\sqrt{2}}, \qquad J_z \xrightarrow{d} \frac{U}{\sqrt{2}},$$
where $E$ and $U$ are independent Exponential(1) and Uniform-$[0,1]$ random variables. Away from 0, the jump points of $Z$ can be approximated by a decreasing renewal process with cycle size
$$D_z + J_{z - D_z} \xrightarrow{d} \frac{E}{2\sqrt{2}} + \frac{U}{\sqrt{2}} =: H.$$
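The limiting cycle size $H$ already explains the constants in the CLT. The sketch below computes the mean and variance of $H$ from the moments of Exponential(1) and Uniform-$[0,1]$, and uses the standard renewal asymptotics (count over length $z$ has mean about $z/\mu$ and variance about $z\sigma^2/\mu^3$) as an assumption; the simulation and all names are illustrative.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

# H = E / (2*sqrt(2)) + U / sqrt(2), with E ~ Exponential(1), U ~ Uniform[0,1] independent.
MEAN_H = 1.0 / (2.0 * SQRT2) + 0.5 / SQRT2     # equals 1/sqrt(2)
VAR_H = 1.0 / 8.0 + (1.0 / 12.0) / 2.0         # equals 1/6

# Renewal asymptotics per unit of z: mean rate 1/mu, variance rate sigma^2 / mu^3.
MEAN_RATE = 1.0 / MEAN_H                       # sqrt(2):   mean ~ sqrt(2) * z = sqrt(2t)
VAR_RATE = VAR_H / MEAN_H ** 3                 # sqrt(2)/3: variance ~ sqrt(2t) / 3

def renewal_count(z, rng):
    """Number of complete cycles of size H that fit into [0, z]."""
    total, count = 0.0, 0
    while True:
        total += rng.expovariate(1.0) / (2.0 * SQRT2) + rng.random() / SQRT2
        if total > z:
            return count
        count += 1

rng = random.Random(2)
z = 100.0
counts = [renewal_count(z, rng) for _ in range(400)]
sample_mean = sum(counts) / len(counts)
print(sample_mean, MEAN_RATE * z, VAR_RATE * z)
```

With $z = \sqrt{t}$ these rates reproduce the leading terms $\sqrt{2t}$ for the mean and $\sqrt{2t}/3$ for the variance of the selected length.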
CLT by stochastic comparison

Van Cutsem and Ycart 1994, Haas and Miermont 2011, Alsmeyer and Marynych 2016: limit theorems for absorption times (or jump counts) for decreasing Markov chains on $\mathbb{N}$.

Adapting the stochastic comparison method of Van Cutsem and Ycart, we squeeze
$$(1 + c/z)^{-1} H <_{st} D_z + J_{z - D_z} <_{st} (1 - c/z)^{-1} H$$
for $z > \underline{z}$, where $\underline{z} = \omega\sqrt{\overline{z}}$ and $\omega$ is a large parameter. Accordingly, the number of jumps of $Z$ within $[\underline{z}, \overline{z}]$ is squeezed between two renewal processes which satisfy the CLT.

It is important that the cycle size of $Z$ is within $O(z^{-1})$ of the limit; with a slower convergence rate $O(z^{-1/2+\epsilon})$ the normal approximation may fail.
Fluctuations of the shape of selected increasing sequence

Let $Y(s)$ be the last mark selected by the optimal strategy by time $s \in [0,t]$.

Theorem. For $t \to \infty$,
$$\big(t^{1/4}(Y(\tau t) - \tau)\big)_{\tau \in [0,1]} \Rightarrow \text{Brownian bridge}$$
in the Skorohod topology on $D[0,1]$.