Digital Trees and Memoryless Sources: from Arithmetics to Analysis
Philippe Flajolet, Mathieu Roux, Brigitte Vallée
AofA 2010, Wien
Friday, June 25, 2010

What is a digital tree, aka “TRIE”?
= a data structure for dynamic dictionaries
$A$ = finite alphabet; $W$ = infinite sequences; $E \in W^n \mapsto$ tree.
TOP-DOWN construction: the set $E$ is split into $E_a, \ldots, E_z$ according to initial letter; continue with next letter; stop when elements are separated.
INCREMENTAL construction: start with the empty tree and insert the elements of $E$ one after the other.
Example: E = {a..., bba..., bbb...}
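The incremental construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `insert`, `internal_nodes`, `height`, `random_trie` and the dict-based node layout are hypothetical, and finite strings of equal length stand in for the infinite sequences of $W$ (they are assumed distinct, so some prefix always separates them).

```python
import random

LEAF = "leaf"  # hypothetical tag marking a node that holds a single word


def insert(node, word, depth=0):
    """Insert `word` into the trie rooted at `node` and return the new root.

    A node is None (empty), a (LEAF, word) pair, or a dict mapping letters
    to sub-tries (an internal node). Words are assumed distinct and of
    equal length, so two words always differ at some position.
    """
    if node is None:
        return (LEAF, word)
    if isinstance(node, tuple):
        # Two words meet here: turn the leaf into an internal node
        # and push the old word one level down.
        old = node[1]
        node = {old[depth]: (LEAF, old)}
    letter = word[depth]
    node[letter] = insert(node.get(letter), word, depth + 1)
    return node


def internal_nodes(node):
    """Size of the trie = number of internal nodes."""
    if isinstance(node, dict):
        return 1 + sum(internal_nodes(child) for child in node.values())
    return 0


def height(node):
    """Depth of the deepest leaf."""
    if isinstance(node, dict):
        return 1 + max(height(child) for child in node.values())
    return 0


def random_trie(n, bits=64, seed=0):
    """Trie built on (up to) n uniform random binary strings of `bits` letters."""
    rng = random.Random(seed)
    words = {"".join(rng.choice("01") for _ in range(bits)) for _ in range(n)}
    root = None
    for w in words:
        root = insert(root, w)
    return root
```

For the example set, with `aaa`, `bba`, `bbb` standing in for a..., bba..., bbb..., the trie has three internal nodes, and the resulting shape does not depend on the insertion order, which is why the top-down and incremental constructions agree.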

What does a trie look like?
[Figure: a random trie on n = 500 uniform binary sequences; size = 741 internal nodes; height = 18.]
[Plot: mean size as a function of n; here: uniform data.]

What does a trie look like?
Expected size seems to be asymptotically linear.
Convergence to asymptotic regime seems to be fast.
But... things are not quite as they seem!

Probabilistic model: Memoryless sources
A finite alphabet $A = \{a_1, \ldots, a_r\}$.
Letters drawn independently to form words from $W = A^{\infty}$: $P(a_j) = p_j$.
Words drawn independently: model is $W^n$.
Want fixed number, n items, to build the trie.
Poisson: often use $N = \mathrm{Poisson}(x)$ items: $P(N = n) = e^{-x} \dfrac{x^n}{n!}$.
Expect (± elementarily) $\mathcal{P}(x) \approx$ fixed-$n$, when $x \approx n$.
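A small numerical sketch of why the Poisson model is expected to behave like the fixed-n model: a Poisson(x) number of items is concentrated in a window of width a few times $\sqrt{x}$ around x. The function names `poisson_pmf` and `mass_within` are hypothetical helpers introduced for this illustration only.

```python
import math


def poisson_pmf(n, x):
    """P(N = n) = e^{-x} x^n / n!, computed in log space to avoid overflow.

    Assumes x > 0 and n a non-negative integer.
    """
    return math.exp(-x + n * math.log(x) - math.lgamma(n + 1))


def mass_within(x, width):
    """Probability that a Poisson(x) variable falls in [x - width, x + width]."""
    lo = max(0, math.ceil(x - width))
    hi = math.floor(x + width)
    return sum(poisson_pmf(n, x) for n in range(lo, hi + 1))
```

For instance, `mass_within(500, 4 * math.sqrt(500))` gives the probability that the number of items stays within four standard deviations of x = 500.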

Memoryless sources (Bernoulli)
1965: Knuth & De Bruijn analyse binary tries, with Pr(0) = Pr(1) = 1/2, showing oscillations.
1973: Knuth discusses biased bit models, including golden-section case [Ex 5.2.2-53].
1986: Fayolle-F-Hofri exhibit periodicity criterion, extended by, e.g., Schachinger [2000]; Jacquet-Szpankowski-Tang [2001].
1990-2000: Convergence to asymptotic regime often wrongly assumed to be fast. Caveats by Schachinger (~2000).
2010, this paper: convergence to asymptotic regime is very slow and depends on fine arithmetic properties of probabilistic model.

The periodic case
Definition. The probability vector $(p_1, \ldots, p_r)$ is periodic if all ratios $\dfrac{\log p_j}{\log p_k}$ are rational. (E.g., $\dfrac{\log p_2}{\log p_1} \in \mathbb{Q}$; binary alphabet.)
Theorem (Periodic sources; folklore). Expected size $S_n$ is, with $\Phi$ a smooth periodic function:
$$S_n = \frac{n}{H} + n\,\Phi(\log n) + O(n^{1-A}), \qquad A > 0.$$
⇒ Oscillations ($O(n)$), plus good error term.
• These cases are exceptional: the $p_j$ are algebraic numbers. Such families are a denumerable set; hence have measure 0.
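To make the leading term $n/H$ and the periodicity condition concrete, here is a minimal sketch. It assumes $H$ is the entropy $-\sum_j p_j \log p_j$ in natural logarithms (the slide text does not spell this out), and the helper names are hypothetical. Floating-point arithmetic cannot decide whether a ratio of logarithms is rational, so `log_ratio` is only a numerical hint.

```python
import math


def entropy(probs):
    """Entropy H = -sum p_j log p_j (natural logarithm), assumed to be the H above."""
    return -sum(p * math.log(p) for p in probs)


def leading_term(n, probs):
    """Leading term n / H of the expected trie size."""
    return n / entropy(probs)


def log_ratio(probs, j=1, k=0):
    """Numerical value of log p_j / log p_k (0-based indices)."""
    return math.log(probs[j]) / math.log(probs[k])


# Uniform binary source: n / H = n / log 2.
uniform = [0.5, 0.5]

# Golden-section source: p + p^2 = 1, so log p_2 / log p_1 = 2 (periodic).
p = (math.sqrt(5) - 1) / 2
golden = [p, p * p]
```

With the uniform source and n = 500, `leading_term` gives about 721.3 internal nodes, to be compared with the size 741 of the single random trie reported earlier in the deck.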

The aperiodic case (main result)
Definition. The probability vector $(p_1, \ldots, p_r)$ is aperiodic if at least one ratio $\dfrac{\log p_j}{\log p_k}$ is irrational. (E.g., $\dfrac{\log p_2}{\log p_1} \notin \mathbb{Q}$; binary alphabet.)
Theorem (Aperiodic sources; this paper). Expected size $S_n$ is, for “diophantine sources” (generic case):
$$S_n = \frac{n}{H} + O\!\left(n \exp\!\left(-\sqrt[\theta]{\log n}\right)\right), \qquad \theta > 1.$$
This is better than $n/(\log n)^a$, any $a$; much worse than $n^{1-\epsilon}$, any $\epsilon$.
• For remaining “Liouvillean sources” (rare), error term can come arbitrarily close to $o(n)$.
⇒ No oscillation, but poor error term.
• This case is generic: it has measure 1.
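The comparison between the three relative error scales can be checked numerically. The sketch below works with $L = \log n$ directly to avoid overflow; the function names and the parameter values (θ = 2, a = 2, ε = 0.1) are arbitrary choices made for illustration, not values taken from the talk.

```python
import math


def stretched_exp(L, theta):
    """exp(-(log n)^(1/theta)), the relative error of the Diophantine case."""
    return math.exp(-L ** (1.0 / theta))


def inverse_polylog(L, a):
    """1 / (log n)^a."""
    return L ** (-a)


def inverse_power(L, eps):
    """n^(-eps) = exp(-eps * log n)."""
    return math.exp(-eps * L)


def compare(L, theta=2.0, a=2.0, eps=0.1):
    """Return the three relative error scales at log n = L."""
    return (inverse_power(L, eps), stretched_exp(L, theta), inverse_polylog(L, a))
```

With these parameters the asymptotic ordering $n^{-\epsilon} < \exp(-(\log n)^{1/\theta}) < (\log n)^{-a}$ holds at $\log n = 1000$ but not yet at $n = 10^6$, which gives a feeling for how late the asymptotic regime sets in.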

1. Basics
Fundamental intervals + Mellin = Formal analysis

[Vallée 1997++]
View source model in terms of fundamental intervals: $w \mapsto p_w$.
[Figure labels: (0), (1).]
Size = number of places occupied by at least two prefixes.
Mellinize → ...
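The description of size as the number of places occupied by at least two prefixes can be turned into a direct computation. Under the Poisson model of rate x, the number of words starting with a prefix w is Poisson with mean $x p_w$, so the place w is occupied by at least two words with probability $1 - (1 + x p_w) e^{-x p_w}$; summing over all prefixes gives the expected size. This formula is a standard consequence of the model rather than something quoted from the slides, and the function name, the cutoff `eps`, and the tail approximation are assumptions of this sketch.

```python
import math


def poisson_trie_size(x, probs, eps=1e-3):
    """Expected trie size under the Poisson(x) model for a memoryless source.

    Sums f(x * p_w) over all prefixes w, where f(y) = 1 - (1 + y) e^{-y}.
    Below the cutoff `eps`, f(y) is replaced by y^2 / 2 and the whole
    subtree is summed in closed form (geometric series in sum p_j^2).
    """
    tail = 1.0 / (2.0 * (1.0 - sum(p * p for p in probs)))
    total = 0.0
    stack = [x]  # values x * p_w still to be processed
    while stack:
        y = stack.pop()
        if y < eps:
            total += tail * y * y
            continue
        total += 1.0 - (1.0 + y) * math.exp(-y)
        stack.extend(p * y for p in probs)
    return total
```

For the uniform binary source, `poisson_trie_size(100, [0.5, 0.5])` can be compared with $100/\log 2 \approx 144.3$; for a biased source such as `[1/3, 2/3]` the ratio to $x/H$ can be inspected in the same way. The running time grows roughly like `x / eps`, so the sketch is only meant for small experiments.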

The Mellin transform
$$f(x) \;\overset{\mathcal{M}}{\longmapsto}\; f^{\star}(s) := \int_0^{\infty} f(x)\, x^{s-1}\, dx$$
(It exists in strips of $\mathbb{C}$ determined by growth of $f(x)$ at $0, +\infty$.)
Property 1. Factors harmonic sums:
$$\sum_{(\lambda,\mu)} \lambda\, f(\mu x) \;\overset{\mathcal{M}}{\longmapsto}\; \Big(\sum_{(\lambda,\mu)} \lambda\, \mu^{-s}\Big) \cdot f^{\star}(s).$$
Property 2. Maps asymptotics of $f$ on singularities of $f^{\star}$:
$$f^{\star}(s) \approx \frac{1}{(s - s_0)^m} \;\Longrightarrow\; f(x) \approx x^{-s_0} (\log x)^{m-1}.$$
Singularities?
Proof of P2 is from Mellin inversion + residues:
$$f(x) = \frac{1}{2 i \pi} \int_{c - i\infty}^{c + i\infty} f^{\star}(s)\, x^{-s}\, ds.$$
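The definition and Property 1 can be checked numerically on a simple case. The sketch below uses $f(x) = e^{-x}$, whose Mellin transform is $\Gamma(s)$, and real $s$ only; the function name `mellin`, the change of variable $x = e^u$, and the integration bounds are choices made for this illustration.

```python
import math


def mellin(f, s, lo=-40.0, hi=6.0, steps=46000):
    """Approximate f*(s) = integral_0^inf f(x) x^(s-1) dx for real s > 0.

    The substitution x = e^u turns the integral into
    integral f(e^u) e^(u s) du, evaluated by the trapezoid rule on [lo, hi].
    The bounds suit integrands that decay like e^{-x}.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(math.exp(u)) * math.exp(u * s)
    return total * h


def base(x):
    """f(x) = e^{-x}, with Mellin transform Gamma(s)."""
    return math.exp(-x)


# A small harmonic sum: F(x) = sum of lam * f(mu * x) over the pairs below.
PAIRS = [(2.0, 3.0), (1.0, 0.5), (0.25, 1.5)]


def harmonic_sum(x):
    return sum(lam * base(mu * x) for lam, mu in PAIRS)
```

Property 1 then predicts that `mellin(harmonic_sum, s)` equals `sum(lam * mu ** (-s) for lam, mu in PAIRS) * math.gamma(s)`, so the dependence on the pairs $(\lambda, \mu)$ separates from the base function.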

$\Lambda(s)$
Harmonic sum!
Singularities? Geometry of the poles of $\Lambda(s)$
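As a worked illustration of how the two Mellin properties produce the leading term $n/H$, here is a short derivation for the expected size under the Poisson model. It rests on assumptions not visible in the extracted slide text: the expected size is taken to be the harmonic sum $S(x) = \sum_w f(p_w x)$ with $f(y) = 1 - (1+y)e^{-y}$, and $\Lambda(s)$ is assumed to denote $\sum_w p_w^{s} = 1/(1 - \sum_j p_j^{s})$, so that the Dirichlet series appearing below is $\Lambda(-s)$.

```latex
\begin{align*}
S(x) &= \sum_{w} f(p_w x), \qquad f(y) = 1 - (1+y)e^{-y},\\
f^{\star}(s) &= -(1+s)\,\Gamma(s), \qquad -2 < \Re(s) < 0,\\
S^{\star}(s) &= f^{\star}(s) \sum_{w} p_w^{-s}
  = \frac{-(1+s)\,\Gamma(s)}{1 - \sum_{j} p_j^{-s}}, \qquad -2 < \Re(s) < -1,\\
\text{near } s = -1:\quad
 &-(1+s)\,\Gamma(s) \to 1, \qquad
 1 - \sum_{j} p_j^{-s} \sim -H\,(s+1), \qquad H = -\sum_{j} p_j \log p_j,\\
S^{\star}(s) &\sim \frac{-1}{H\,(s+1)}
 \;\Longrightarrow\; S(x) \sim \frac{x}{H} \quad (x \to \infty).
\end{align*}
```

The last step is Property 2 with $s_0 = -1$ and $m = 1$, the sign coming from closing the inversion contour to the right. The remaining poles of $1/(1 - \sum_j p_j^{-s})$ on or near the line $\Re(s) = -1$ (equivalently, poles of $\Lambda$ near $\Re(s) = 1$ under the assumed convention) are what govern the oscillations and error terms.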

2. Geometry of poles
Poles are associated with simultaneous approximations to logs of probabilities.
Distinguish:
-- Diophantine = badly approximable (generic);
-- Liouvillean = unusually well approximable (rare)

Poles of $\Lambda(s)$ near $\Re(s) = 1$
• Look for $s$: $p_1^{s} + p_2^{s} = 1$, $s = \sigma + it$.
$$p_1^{\sigma} p_1^{it} + p_2^{\sigma} p_2^{it} = 1, \qquad p_1 + p_2 = 1.$$
Implies $p_1^{it} \approx 1$ and $p_2^{it} \approx 1$; i.e., $t \approx \dfrac{2\pi}{\log p_1}\, q_1$ and $t \approx \dfrac{2\pi}{\log p_2}\, q_2$, so that
$$\frac{\log p_2}{\log p_1} \approx \frac{q_2}{q_1}.$$
Pole of $\Lambda(s)$ ⇒ “good” rational approximation to $\dfrac{\log p_2}{\log p_1}$.
For general $(p_1, \ldots, p_r)$, must have a common denominator $q_1$:
$$\forall j:\quad q_1\, \frac{\log p_j}{\log p_1} \ \text{is a near-integer.}$$
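The link between good rational approximations and poles can be explored numerically by solving $1 - \sum_j p_j^{s} = 0$ with Newton's method, started on the line $\Re(s) = 1$ at a height $t$ suggested by a rational approximation. This is only a sketch: Newton's method is swapped in here for convenience (it is not the argument used in the talk), the function name is hypothetical, and convergence depends on the starting point.

```python
import cmath
import math


def newton_zero(probs, s0, iters=50, tol=1e-14):
    """Newton iteration for g(s) = 1 - sum_j p_j^s, started at the complex s0."""
    logs = [math.log(p) for p in probs]
    s = s0
    for _ in range(iters):
        powers = [cmath.exp(s * l) for l in logs]        # p_j^s
        g = 1 - sum(powers)
        dg = -sum(l * pw for l, pw in zip(logs, powers))  # g'(s)
        step = g / dg
        s -= step
        if abs(step) < tol:
            break
    return s


def start_height(probs, q1):
    """Height t = 2 pi q1 / |log p_1| at which p_1^{it} = 1 exactly."""
    return 2 * math.pi * q1 / abs(math.log(probs[0]))
```

For the source $(1/3, 2/3)$, the fraction $7/19$ is a continued-fraction convergent of $\log p_2 / \log p_1$, so `newton_zero([1/3, 2/3], complex(1, start_height([1/3, 2/3], 19)))` is a natural experiment: the zero found lies slightly to the left of $\Re(s) = 1$. For the periodic source $(1/2, 1/2)$ the zeros sit exactly on the line, at $1 + 2ik\pi/\log 2$.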

Poles of $\Lambda(s)$ near $\Re(s) = 1$
$\beta = (\beta_1, \ldots, \beta_r) \in \mathbb{R}^r$; fix a norm $\|\cdot\|$ on $\mathbb{R}^r$.
$\{x\}$ = centred fractional part; $\|\{\beta\}\|$ is distance to nearest integer lattice point.
Look at “record” approximants; measure quality by $f(t)$.
Definition
• $Q$ is a Best Simultaneous Approximant Denominator (BSAD), if $\|\{Q\beta\}\| < \|\{q\beta\}\|$, for all $q < Q$.
• $f(t)$, the approximation function, is staircase and $f(t) = \dfrac{1}{\|\{Q^{-}\beta\}\|}$, if $Q^{-}, Q^{+}$ are the BSADs that frame $t$.
Thus:
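The BSAD definition translates directly into a brute-force search for record denominators. The sketch below fixes the sup norm as the norm on $\mathbb{R}^r$ (the slide leaves the norm open) and uses hypothetical function names; floating-point rounding limits it to moderate denominators.

```python
import math


def dist_to_lattice(q, beta):
    """Sup-norm distance from q * beta to the nearest integer lattice point."""
    return max(abs(q * b - round(q * b)) for b in beta)


def bsads(beta, q_max):
    """Denominators Q <= q_max with ||{Q beta}|| < ||{q beta}|| for all q < Q."""
    records, best = [], math.inf
    for q in range(1, q_max + 1):
        d = dist_to_lattice(q, beta)
        if d < best:  # strict new record
            records.append(q)
            best = d
    return records


# One-dimensional example: beta = golden ratio.
PHI = (1 + math.sqrt(5)) / 2
```

For a single real number the records are denominators of continued-fraction convergents; with the golden ratio, `bsads([PHI], 60)` returns Fibonacci numbers. For a binary source one would take `beta = [math.log(p2) / math.log(p1)]`, and for larger alphabets the vector of all ratios $\log p_j / \log p_1$.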

Basic trichotomy
For a probability vector $(p_1, \ldots, p_r)$:
Periodic sources (all ratios of logs are in $\mathbb{Q}$)
Aperiodic sources (some ratios $\notin \mathbb{Q}$):
Diophantine: approximation function $f(t)$ is polynomial; optimal exponent is known as irrationality measure;
Liouvillean: approximation function $f(t)$ is superpolynomial.
- Scalars $\pi$, $e$, $\tan(1)$, $\sqrt[3]{2}$, $\zeta(3)$, $\log 5$, ... are Diophantine. Logs of rational and algebraic numbers are Diophantine. Also numbers with bounded continued fraction quotients, ...
- Numbers with very fast-converging sums, e.g., $\sum 2^{-2^{n}}$, are Liouvillean.

Theorem. If $(p_1, \ldots, p_r)$ is Diophantine, zeros are well-separated from $\Re(s) = 1$:
All zeros are to the left of a pseudo-hyperbola;
Infinitely many zeros are to the right of a pseudo-hyperbola.
Theorem. If $(p_1, \ldots, p_r)$ is Liouvillean, zeros come closer to $\Re(s) = 1$:
All zeros are to the left of a curve $1 - 1/F^{-}(t)$;
Infinitely many zeros are to the right of $1 - 1/F^{+}(t)$.
$F^{-}(t)$, $F^{+}(t)$ are dictated by approximation functions of $(\log p_j)/(\log p_k)$.

Proofs
• Pole of $\Lambda(s)$ ⇒ “good” rational approximation to $(\log p_j)/(\log p_k)$.
  Follow sketch above and develop properties of “ladders”.
  [Figure labels: ladder; pole; $\sim 1/f(q)^2$.]
• “Good” rational approximation (BSAD, $q$) to $(\log p_j)/(\log p_k)$ ⇒ pole of $\Lambda(s)$.
  Use analytic, multivariate Implicit Function Theorem, $\Re(s) \approx 1$; $u_j \approx 0$:
$$1 - p_1^{s}\, p_1^{i u_1} - \cdots - p_r^{s}\, p_r^{i u_r} = 0.$$
  ++ Lapidus, van Frankenhuijsen

3. Inverse Mellin analysis
Make use of integration contour that avoids poles.
Estimate global contribs: pole-free region matters.
Poles are well-separated.

4. Tries and QuickSort
Applies to size of tries & almost anything that contains $\Lambda(s)$.
Diophantine ⇒ error terms are exp-of-root-of-log.
Liouvillean ⇒ error terms are $o(n)$ and very close to $O(n)$.

Theorem. Consider aperiodic Diophantine probabilities with irrationality exponent $\mu$.
$$S_n = \frac{n}{H} + n\,\Phi(n) \qquad \text{(trie size)};$$
$$S_n = \frac{1}{H}\, n \log n + C n + n\,\Phi(n) \qquad \text{(trie pathlength)};$$
$$S_n = \frac{2}{H}\, n \log^2 n + C n \log n + C' n + n\,\Phi(n) \qquad \text{(symbol-cost, Quicksort)};$$
where the error term is, for any $\theta > \mu$:
$$\Phi(n) = O\!\left(\exp\!\left(-(\log n)^{1/\theta}\right)\right).$$
Makes precise or improves on results of Clément, Fill, Flajolet, Jacquet, Janson, Szpankowski, Vallée, ...

Source models
• memoryless
  periodic: good error terms
  aperiodic: generally (very) bad error terms (us!); Diophantine versus Liouvillean
• Markov; cf Szpa+Jacquet+Tang: similar (?)
• dynamical: Vallée + Cl-F-Vallée; cf Dolgopyat, B-V.
• general: à la Vallée-Clément-Fill-F.

Numerics
(Proved for Poisson; transfers to fixed-size)
Initial oscillations often not seen numerically, for small n; but they matter asymptotically.
