The Quintet Poisson–Mellin–Newton–Rice–Laplace
Brigitte Vallée, CNRS and Université de Caen
Semaine Alea, CIRM, March 2016
Plan of the talk
◮ General framework.
  ◮ Two probabilistic models: the Bernoulli model and the Poisson model.
  ◮ Description of the tools: the Poisson transform and the Poisson sequence.
◮ Two paths from the Poisson model to the Bernoulli model.
  ◮ Both use the Mellin transform.
◮ The first path: the Depoissonization path, with the Poisson transform.
  ◮ Uses the inverse Mellin transform and the saddle-point method.
  ◮ Needs: sufficient conditions for depoissonization, which are well studied.
◮ The second path: the Newton–Rice path, with the Poisson sequence.
  ◮ Uses Newton interpolation and the Rice integral.
  ◮ Needs: tameness conditions, which are less studied and seem more restrictive.
◮ Study of sufficient conditions for tameness,
  ◮ using the inverse Laplace transform.
Part I – General framework
◮ Two probabilistic models: the Bernoulli model and the Poisson model.
◮ Description of the tools: the Poisson transform and the Poisson sequence.
General framework.
Begin with (elementary) data. Consider algorithms that take finite sequences of data as inputs.
If $X$ is the set of data, then the set of inputs is $X^\star = \bigcup_{n \ge 0} X^n$.

      Context    (elementary) data           input                  Study
(1)   source     a symbol from an alphabet   a (finite) word        entropy
(2)   text       an (infinite) word          a sequence of words    dictionary
(3)   geometry   a point                     a sequence of points   convex hull

Probabilistic studies.
◮ The set $X$ is endowed with a probability $P$.
◮ The set $X^N$ is endowed with the probability $P_{[N]}$.
In cases (2) and (3), the data are very often drawn independently with $P$. This is not so in case (1), where the successive symbols may be strongly dependent.
Two probabilistic models.
The space of inputs is the set $X^\star$ of the finite sequences of elements of $X$. There are two main probabilistic models on the set $X^\star$.
◮ The Bernoulli model $\mathcal{B}_n$, where the cardinality $N$ is fixed and equal to $n$ (then $n \to \infty$). The Bernoulli model is more natural in algorithmics.
◮ The Poisson model $\mathcal{P}_z$ of parameter $z$, where the cardinality $N$ is a random variable that follows a Poisson law of parameter $z$,
$$\Pr[N = n] = e^{-z} \frac{z^n}{n!}$$
(then $z \to \infty$). The Poisson model has nice probabilistic properties, notably independence properties ⟹ it is easier to deal with.
⟹ A first study in the Poisson model, followed by a return to the Bernoulli model.
Costs of interest.
A variable (or a cost) $R : X^\star \to \mathbb{N}$ describes the behaviour of the algorithm on the input, for instance:
◮ $R(x)$ is the path length of a tree [trie or digital search tree (dst)] built on the sequence $x := (x_1, \ldots, x_n)$ of words $x_i$;
◮ $R(x)$ is the number of vertices of the convex hull built on the sequence $x = (x_1, \ldots, x_n)$ of points $x_i$;
◮ $R(w)$ is a function of the probability $p_w$ of the finite prefix $w$, with the word $w$ viewed as a sequence $w := (w_1, \ldots, w_n)$ of symbols $w_i$.
Our final aim is the analysis of $R$ in the model $\mathcal{B}_n$.
◮ We begin with the analysis in the (easier) Poisson model $\mathcal{P}_z$.
◮ We then wish to return to the (more realistic) Bernoulli model.
Average-case analysis of a cost $R$ defined on $X^\star$
◮ Final aim: study the sequence $n \mapsto r(n)$, where $r(n) := E_{[n]}[R]$ is the expectation in the Bernoulli model $\mathcal{B}_n$.
◮ Consider the expectation $E_z[R]$ in the Poisson model $\mathcal{P}_z$:
$$E_z[R] = \sum_{n \ge 0} E_z[R \mid N = n] \, P_z[N = n] = \sum_{n \ge 0} E_{[n]}[R] \, P_z[N = n] = e^{-z} \sum_{n \ge 0} r(n) \frac{z^n}{n!}.$$
$E_z[R]$ is the Poisson transform of the sequence $n \mapsto r(n)$.
◮ With (properties of) the Poisson transform $P(z)$ of $n \mapsto r(n)$, return to (the asymptotics of) the sequence $n \mapsto r(n)$.
The Poisson transform and the Poisson sequence
With a sequence $f : n \mapsto f(n)$, we associate
$$P(z) = e^{-z} \sum_{k \ge 0} f(k) \frac{z^k}{k!} = \sum_{k \ge 0} (-1)^k \frac{z^k}{k!} \, p(k).$$
◮ The series $P(z) := P[f](z)$ is the Poisson transform of $n \mapsto f(n)$.
◮ The sequence $k \mapsto p(k)$ is the Poisson sequence of $n \mapsto f(n)$.
  ◮ It is denoted by $\pi[f]$.
  ◮ Its Poisson transform is $P(-z) \, e^{-z}$.
  ◮ Under this form, it is clear that the map $\pi$ is involutive.
◮ Important binomial relation between $f(n)$ and $p(n)$:
$$p(n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} f(k), \qquad \text{and} \qquad f(n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} p(k).$$
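The binomial relation can be checked numerically. The following is a minimal Python sketch, not part of the talk: the helper name `poisson_sequence` and the example sequence `toll` (f(k) = k for k ≥ 2, and 0 otherwise) are assumptions chosen for illustration. It computes the Poisson sequence with exact rational arithmetic and applies the map twice to observe the involution.

```python
from fractions import Fraction
from math import comb


def poisson_sequence(f, n_max):
    """Return [p(0), ..., p(n_max)] with p(n) = sum_{k=0}^{n} (-1)^k C(n,k) f(k)."""
    return [
        sum((-1) ** k * comb(n, k) * f(k) for k in range(n + 1))
        for n in range(n_max + 1)
    ]


def toll(k):
    # Hypothetical example sequence: f(k) = k for k >= 2, and 0 otherwise.
    return Fraction(k) if k >= 2 else Fraction(0)


N_MAX = 8
p = poisson_sequence(toll, N_MAX)

# Applying the same map to the values p(0..N_MAX) should give back f.
f_back = poisson_sequence(lambda k: p[k], N_MAX)
f_values = [toll(k) for k in range(N_MAX + 1)]
```

For this particular sequence, the computed values are p(0) = p(1) = 0 and p(n) = n for n ≥ 2, and `f_back` coincides with `f_values`.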
An instance of application: Toll functions and tries (I).
A source $\mathcal{S}$ on a finite alphabet $\Sigma$; $X^\star$ := {sequences of (infinite) words produced by $\mathcal{S}$}.
[Figure: example of a trie on the alphabet {a, b, c}; the labels shown include abc, cba, bbc, cab.]
The trie $T(x)$ built on $x \in X^\star$ is a tree:
◮ If $|x| = 0$, $T(x) = \emptyset$.
◮ If $|x| = 1$, $x = (x_1)$, $T(x)$ is a leaf labelled by $x_1$.
◮ If $|x| \ge 2$, then $T(x)$ is formed with
  – an internal node $o$,
  – and a sequence of tries $T(x_{[\sigma]})$ for $\sigma \in \Sigma$.
◮ $x_{\langle\sigma\rangle}$ is the subsequence of $x$ formed with the words that begin with $\sigma$.
◮ $x_{[\sigma]}$ is formed with the words of $x_{\langle\sigma\rangle}$ stripped of their initial symbol $\sigma$.
◮ If $x_{[\sigma]} \ne \emptyset$, the edge $o \to T(x_{[\sigma]})$ is labelled with $\sigma$.
An instance of application: Toll functions and tries (II).
A sequence $n \mapsto f(n)$ with $\mathrm{val}(f) = 2$ plays the role of a toll function. With the toll $f$, associate the cost $R$ defined on $X^\star$ by
$$R(x) := \sum_{w \in \Sigma^\star} f(|x_{\langle w \rangle}|).$$
[Figure: example of a trie on the alphabet {a, b, c}; the labels shown include abc, cba, bbc, cab.]
◮ $x_{\langle w \rangle}$ is the subsequence of $x$ formed with the words that begin with the prefix $w$.
◮ $f(|x_{\langle w \rangle}|)$ is the toll "paid" by the subtrie of $T(x)$ whose root is labelled by $w$.
$f(k) = 1$ (for $k \ge 2$) ⟹ $R(x)$ is the number of internal nodes of $T(x)$.
$f(k) = k$ (for $k \ge 2$) ⟹ $R(x)$ is the external path length of $T(x)$.
Another instance (less classical): $f(k) = k \log k$ ⟹ ..... $R(x)$ is the number of symbol comparisons performed by QuickSort on $x$.
What is the mean value of the cost $R(x)$ when $x \in X^n$?
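The definition of the cost can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: each infinite word is replaced by a finite string, and no string is a prefix of another, so that every word is isolated before it ends. The function name `toll_cost` is hypothetical, and the sample words abc, cba, bbc, cab are borrowed from the labels of the slide's figure.

```python
from collections import Counter


def toll_cost(words, f):
    """Return R(x) = sum over prefixes w of f(N_w), where N_w counts the
    words beginning with w. Only prefixes with N_w >= 2 contribute,
    since the toll has valuation 2."""
    counts = Counter()
    for word in words:
        # Every prefix of the word, including the empty prefix (the root).
        for i in range(len(word) + 1):
            counts[word[:i]] += 1
    return sum(f(n) for n in counts.values() if n >= 2)


words = ["abc", "cba", "bbc", "cab"]
internal_nodes = toll_cost(words, lambda k: 1)        # toll f(k) = 1
external_path_length = toll_cost(words, lambda k: k)  # toll f(k) = k
```

On these four words, the internal nodes are the root and the node reached by the prefix c, so `internal_nodes` is 2; the leaves sit at depths 1, 1, 2, 2, so `external_path_length` is 6.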
An instance of application: Toll functions and tries (III).
Recall:
$$R(x) := \sum_{w \in \Sigma^\star} f(|x_{\langle w \rangle}|) = \sum_{w \in \Sigma^\star} f(N_w(x)),$$
where $N_w$ is the number of words which begin with $w$.
What is given?
– the source, with the probabilities $p_w$;
– the toll sequence $n \mapsto f(n)$, its transform $P(z)$ and its sequence $\pi[f]$:
$$P(z) = E_z[f(N)] = e^{-z} \sum_{n \ge 2} f(n) \frac{z^n}{n!} = \sum_{n \ge 2} (-1)^n p(n) \frac{z^n}{n!}.$$
$N$ follows $\mathcal{P}_z$ ⟹ $N_w$ follows $\mathcal{P}_{z p_w}$ ⟹ $E_z[f(N_w)] = P(z p_w)$.
What about $r(n) := E_{[n]}[R]$, its Poisson transform, its Poisson sequence?
$$Q(z) := E_z[R] = \sum_{w \in \Sigma^\star} E_z[f(N_w)] \implies Q(z) = \sum_{w \in \Sigma^\star} P(z p_w),$$
$$Q(z) = e^{-z} \sum_{n \ge 2} r(n) \frac{z^n}{n!} = \sum_{n \ge 2} (-1)^n q(n) \frac{z^n}{n!} \implies q(n) = p(n) \Big[ \sum_{w \in \Sigma^\star} p_w^n \Big].$$
Sequence $f(n)$ and source $\mathcal{S}$ ⟹ $Q(z)$ and $q(n)$. How to return to $r(n)$?
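For small n, the return to r(n) can be carried out directly with the binomial relation between a sequence and its Poisson sequence. The sketch below is an illustration under an extra assumption that the slide does not make: the source is memoryless with symbol probabilities `probs`, so that the sum of $p_w^n$ over all prefixes $w$ is a geometric series equal to $1 / (1 - \sum_\sigma p_\sigma^n)$ for $n \ge 2$. The names `binomial_transform` and `mean_cost` are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_transform(seq, n):
    """Return sum_{k=0}^{n} (-1)^k C(n,k) seq(k)."""
    return sum((-1) ** k * comb(n, k) * seq(k) for k in range(n + 1))


def mean_cost(f, probs, n):
    """Mean cost r(n) in the Bernoulli model for a toll f of valuation 2.

    Assumption: memoryless source, so sum_w p_w^k = 1 / (1 - sum_s p_s^k).
    """
    def p(k):
        # Poisson sequence of the toll f.
        return binomial_transform(f, k)

    def q(k):
        # Poisson sequence of r: q(k) = p(k) * sum_w p_w^k, zero for k < 2.
        if k < 2:
            return Fraction(0)
        lam = sum(Fraction(pr) ** k for pr in probs)
        return p(k) / (1 - lam)

    # The map from a sequence to its Poisson sequence is involutive.
    return binomial_transform(q, n)


def path_toll(k):
    return k if k >= 2 else 0   # toll f(k) = k: external path length


def node_toll(k):
    return 1 if k >= 2 else 0   # toll f(k) = 1: number of internal nodes


half = Fraction(1, 2)
r2_path = mean_cost(path_toll, [half, half], 2)
r3_path = mean_cost(path_toll, [half, half], 3)
r2_nodes = mean_cost(node_toll, [half, half], 2)
```

For the symmetric binary source, the values 4 and 8 obtained for the mean path length with two and three words agree with the classical recurrence for tries. The alternating sum is exact here because of the rational arithmetic; for large n it involves heavy cancellations, which is why the talk turns to analytic methods.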
Part II – Description of the two paths. Generic tools.
Two paths from the Poisson model to the Bernoulli model
◮ Both use the Mellin transform.
Description of the two possible paths.
Begin with a sequence $k \mapsto f(k)$, and consider its Poisson transform $P(z)$ and its Poisson sequence $\pi[f] : n \mapsto p(n)$,
$$P(z) = e^{-z} \sum_{k \ge 0} f(k) \frac{z^k}{k!} = \sum_{n \ge 0} (-1)^n \frac{z^n}{n!} \, p(n).$$
Assume some "knowledge" of the Poisson transform $P(z)$ or of the Poisson sequence $\pi[f]$. There are two paths for returning to the initial sequence.
◮ Depoissonization method (DP)
  ◮ Deal with $P(z)$ and find its asymptotics ($z \to \infty$).
  ◮ Compare the asymptotics of the sequence $f(n)$ ($n \to \infty$) to the asymptotics of $P(n)$.
◮ Rice method (Ri)
  ◮ Deal with the sequence $\pi[f] : n \mapsto p(n)$,
  ◮ and its analytic lifting $\pi[f]$, which is proven to exist.
  ◮ Return to the sequence $n \mapsto f(n)$ via the binomial formula, which is transferred into an integral, the Rice integral.
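The comparison between $f(n)$ and $P(n)$ that underlies the Depoissonization path can be observed numerically. The sketch below is only an illustration, not a proof of any depoissonization condition: it evaluates a truncated Poisson sum in floating point, with a truncation point (`4 * z + extra_terms`) chosen as an assumption, and uses the toll $f(k) = k \log k$ as an example. The function name `poisson_transform` is hypothetical.

```python
from math import exp, lgamma, log


def poisson_transform(f, z, extra_terms=200):
    """Numerically evaluate P(z) = e^{-z} sum_k f(k) z^k / k! for z > 0.

    The sum is truncated at k_max, far in the tail of the Poisson law
    (an assumption that is adequate for sequences of polynomial growth).
    """
    k_max = int(4 * z) + extra_terms
    total = 0.0
    for k in range(k_max + 1):
        # Pr[N = k] under the Poisson law, computed in log space.
        weight = exp(-z + k * log(z) - lgamma(k + 1))
        total += f(k) * weight
    return total


def toll(k):
    # Example toll f(k) = k log k, with f(0) = f(1) = 0.
    return k * log(k) if k >= 2 else 0.0


n = 200
ratio = poisson_transform(toll, n) / toll(n)   # close to 1 for this toll
mass = poisson_transform(lambda k: 1.0, 50.0)  # total probability, close to 1
```

For this toll the ratio $P(n) / f(n)$ is already within one percent of 1 at $n = 200$, which is the kind of statement that the depoissonization conditions make rigorous.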
A first technical condition: Valuation-Degree Condition
Definition. For a nonzero real sequence $n \mapsto f(n)$, define
$$\mathrm{val}(f) := \min\{k \mid f(k) \ne 0\}, \qquad \deg(f) := \inf\{c \mid f(k) = O(k^c)\} = \limsup \left\{ \frac{\log |f(k)|}{\log k} \;\middle|\; k \ge k_0 \right\}.$$
The sequence $n \mapsto f(n)$ satisfies the Valuation-Degree Condition (VD) if and only if $d := \deg(f) < k_0 := \mathrm{val}(f)$.
If $n \mapsto f(n)$ is of polynomial growth, then $\deg(f)$ is finite. In this case, the VD-Condition is not restrictive: replace $f$ by $f^+$, defined by $f^+(n) = 0$ for $n \le d$ and $f^+(n) = f(n)$ for $n > d$.
We always assume the VD-Condition to hold, with a difference $d - k_0$ as small as wished.
A second technical tool: the canonical sequence.
When $\mathrm{val}(f) = k_0$, $P(z)$ is written as
$$P(z) = z^{k_0} Q(z), \qquad Q(z) = e^{-z} \sum_{k \ge 0} g(k) \frac{z^k}{k!} = \sum_{n \ge 0} (-1)^n \frac{z^n}{n!} \, q(n).$$
The sequence $k \mapsto g(k)$ is the canonical sequence associated with $k \mapsto f(k)$:
$$g(k) = \frac{f(k + k_0)}{(k+1) \cdots (k+k_0)} \quad \text{for } k \ge 0.$$
It satisfies the VD-Condition, with $\mathrm{val}(g) = 0$ and $\deg g = d - k_0 < 0$.
It is thus sufficient to consider sequences with $\mathrm{val}(g) = 0$ and $\deg g = d - k_0 < 0$.
There are relations to return to the initial sequence $f(n)$:
◮ between the Poisson sequences $k \mapsto q(k)$ and $k \mapsto p(k)$: comparing the coefficients of $z^{k+k_0}$ in $P(z) = z^{k_0} Q(z)$ gives
$$p(k + k_0) = (-1)^{k_0} (k + k_0) \cdots (k+1) \, q(k) \quad \text{for } k \ge 0,$$
where the sign equals 1 when $k_0$ is even;
◮ between the Poisson transforms: $P(z) = z^{k_0} Q(z)$.
Ex: $f(k) = k \log k$ with $k_0 = 2$ ⟹ $g(k) = \dfrac{f(k+2)}{(k+1)(k+2)} = \dfrac{1}{k+1} \log(k+2)$.
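The relation between the Poisson sequences of $f$ and of its canonical sequence $g$ can be checked on an example. The following Python sketch is an illustration with hypothetical names (`poisson_term`, `canonical`, `rising_product`) and a hypothetical toll $f(k) = k^2$ for $k \ge 3$, so that $k_0 = 3$ and $d = 2$. The check includes the sign factor $(-1)^{k_0}$ produced by the comparison of coefficients in $P(z) = z^{k_0} Q(z)$; this factor equals 1 for even $k_0$, as in the example $k_0 = 2$.

```python
from fractions import Fraction
from math import comb


def poisson_term(seq, n):
    """Return the n-th term of the Poisson sequence: sum_k (-1)^k C(n,k) seq(k)."""
    return sum((-1) ** k * comb(n, k) * seq(k) for k in range(n + 1))


def rising_product(k, k0):
    """Return (k+1)(k+2)...(k+k0)."""
    result = 1
    for j in range(1, k0 + 1):
        result *= k + j
    return result


def canonical(f, k0):
    """Return g with g(k) = f(k + k0) / ((k+1)...(k+k0)); f must be integer valued."""
    return lambda k: Fraction(f(k + k0), rising_product(k, k0))


K0 = 3


def f(k):
    # Hypothetical toll with val(f) = 3 and deg(f) = 2.
    return k * k if k >= K0 else 0


g = canonical(f, K0)
lhs = [poisson_term(f, k + K0) for k in range(6)]
rhs = [(-1) ** K0 * rising_product(k, K0) * poisson_term(g, k) for k in range(6)]
```

The two lists `lhs` and `rhs` coincide; for instance, at $k = 0$ both sides equal $-9$.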