
Doubly-Competitive Distribution Estimation: Yi Hao and Alon Orlitsky (PowerPoint presentation)

Doubly-Competitive Distribution Estimation. Yi Hao and Alon Orlitsky, Department of Electrical and Computer Engineering, University of California, San Diego. ICML, June 11, 2019.


  1. Doubly-Competitive Distribution Estimation. Yi Hao and Alon Orlitsky, Department of Electrical and Computer Engineering, University of California, San Diego. ICML, June 11, 2019.

  2. Distribution Estimation
     - $p$: unknown distribution over $\{1, 2, \ldots, k\}$
     - $X^n := X_1, X_2, \ldots, X_n \sim p$ independently
     - $q_{X^n}$: estimate based on $X^n$
     - Loss: Kullback-Leibler divergence, $\ell(p, q_{X^n}) := \sum_{x=1}^{k} p(x) \log \frac{p(x)}{q_{X^n}(x)}$
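The loss on slide 2 can be sketched in a few lines of Python. This is only an illustration: the function name `kl_loss`, the list-based representation of $p$ and $q$, and the choice of the natural logarithm are assumptions, since the slide does not fix a log base.

```python
import math

def kl_loss(p, q):
    """KL divergence sum_x p(x) * log(p(x) / q(x)).

    p and q are sequences of probabilities over the same k symbols.
    Assumption: natural logarithm; terms with p(x) == 0 contribute 0.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return math.inf  # q misses a symbol that p can produce
            total += px * math.log(px / qx)
    return total
```

The loss is zero exactly when the estimate matches $p$, and it becomes infinite if the estimate assigns zero probability to a symbol that $p$ can generate.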

  3. Competitive Distribution Estimation
     - All reasonable estimators are natural: they assign the same probability to symbols appearing the same number of times, e.g. $q_{abbc}(a) = q_{abbc}(c)$.
     - Goal: estimate every $p$ as well as the best natural estimator.
     - Genie estimator: knows $p$ but is natural, hence incurs a loss $\mathrm{Opt}(p, X^n) := \min_{q \text{ natural}} \ell(p, q_{X^n})$.
     - (Orlitsky & Suresh, 2015) Good-Turing variation $q^{GT}$: for every $p$, with high probability, $\ell(p, q^{GT}_{X^n}) \le \mathrm{Opt}(p, X^n) + O\left(\frac{1}{\sqrt{n}} \wedge \frac{k}{n}\right)$.
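A minimal sketch of the genie benchmark from slide 3 follows. It relies on a standard fact that the slide does not spell out: among estimators constrained to be constant on each group of equal-count symbols, the KL loss is minimized by spreading the group's true total probability evenly over the group (Gibbs' inequality applied to the group masses). The function names and the dict-based representation of $p$ are hypothetical.

```python
import math
from collections import Counter, defaultdict

def genie_natural_estimate(p, sample):
    """Best natural estimate for a known p (dict: symbol -> probability).

    Symbols are grouped by how often they appear in the sample (unseen
    symbols form the count-0 group); each group shares its true mass evenly.
    """
    counts = Counter(sample)
    groups = defaultdict(list)
    for x in p:
        groups[counts.get(x, 0)].append(x)
    q = {}
    for symbols in groups.values():
        mass = sum(p[x] for x in symbols)
        for x in symbols:
            q[x] = mass / len(symbols)
    return q

def opt_loss(p, sample):
    """Opt(p, X^n): KL loss of the genie natural estimate (natural log)."""
    q = genie_natural_estimate(p, sample)
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)
```

For the sample `"abbc"`, the symbols `a` and `c` both appear once, so the genie must give them the same probability even when $p(a) \ne p(c)$; this is the unavoidable loss that the competitive bounds are measured against.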

  4. Doubly-Competitive Distribution Estimation
     - $D_\Phi$ := number of distinct frequencies of symbols in $X^n$. Example: $X^n = a\,b\,a\,c\,d\,e \implies$ $a$ appeared twice and $b, c, d, e$ appeared once $\implies D_\Phi = 2$.
     - Single estimator $q^\star$ achieving (w.h.p.) $\ell(p, q^\star_{X^n}) \le \mathrm{Opt}(p, X^n) + O\left(\frac{D_\Phi}{n}\right)$.
     - Uniform bound: $D_\Phi \le \sqrt{2n} \wedge k \implies$ (Orlitsky & Suresh, 2015).
     - Better bounds for many distribution classes:
       - Uniform: $D_\Phi \lesssim n^{1/3}$
       - $T$-step: $D_\Phi \lesssim T \cdot n^{1/3}$
       - Log-concave with SD $\approx \sigma$: $D_\Phi \lesssim \sigma \wedge \left(\frac{n^2}{\sigma}\right)^{1/3}$
       - Enveloped power-law $\{p : p(x) \lesssim x^{-\alpha}\}$: $D_\Phi \lesssim n^{\frac{1}{\alpha+1}}$
       - Log-convex distribution families, etc.
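The quantity $D_\Phi$ on slide 4 is easy to compute from a sample. The sketch below uses a hypothetical helper name and, following the slide's example, counts only the frequencies of symbols that actually appear.

```python
from collections import Counter

def distinct_frequencies(sample):
    """D_Phi: number of distinct positive multiplicities in the sample."""
    counts = Counter(sample)          # symbol -> number of appearances
    return len(set(counts.values()))  # distinct appearance counts
```

On the slide's example, `distinct_frequencies("abacde")` gives 2, since the only multiplicities are 2 (for `a`) and 1 (for `b`, `c`, `d`, `e`).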

  5. Estimator Construction
     - $\Phi(t)$ := number of symbols appearing $t$ times.
     - Good-Turing estimator: $q^{GT}(x) := \frac{t+1}{n} \cdot \frac{\Phi(t+1)}{\Phi(t)}$.
     - Observation: for $x$ appearing $t \gtrsim \log n$ times, and $\Phi(t) \gtrsim \log^2 n$, $q^{GT}$ has sub-optimal variance in estimating $p(x)$.
     - Averaging unbiased estimators reduces the variance.
     - $D(t)$ := weighted average of $\Phi(t')$ for $|t' - t| \lesssim \sqrt{t / \log n}$.
     - $q^\star(x) := \frac{t+1}{n} \cdot \frac{D(t+1)}{D(t)}$.
     - For other $x$, use Good-Turing or empirical.
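The construction on slide 5 can be illustrated with a rough Python sketch. Several choices here are assumptions rather than the authors' method: the slide does not give the averaging weights or the window constant, so the code uses equal weights and a half-width of $\lfloor \sqrt{t / \log n} \rfloor$; it also omits normalization and the rule for switching between the smoothed, Good-Turing, and empirical estimates. All function names are hypothetical.

```python
import math
from collections import Counter

def prevalences(sample):
    """Return (counts, phi): symbol multiplicities and phi[t] = #symbols seen t times."""
    counts = Counter(sample)
    return counts, Counter(counts.values())

def good_turing(t, phi, n):
    """Unnormalized Good-Turing value (t + 1) / n * phi(t + 1) / phi(t).

    Requires phi[t] > 0, i.e. t is a multiplicity that occurs in the sample.
    """
    return (t + 1) / n * phi[t + 1] / phi[t]

def smoothed_prevalence(t, phi, n):
    """D(t): average of phi(t') over a window around t.

    Assumption: equal weights, half-width floor(sqrt(t / log n)), n >= 2.
    """
    w = int(math.sqrt(t / math.log(n)))
    window = range(max(t - w, 0), t + w + 1)
    return sum(phi[s] for s in window) / len(window)

def smoothed_estimate(t, phi, n):
    """Unnormalized smoothed value (t + 1) / n * D(t + 1) / D(t)."""
    return (t + 1) / n * smoothed_prevalence(t + 1, phi, n) / smoothed_prevalence(t, phi, n)
```

With the sample `"abacde"` ($n = 6$), `good_turing(2, phi, 6)` is zero because no symbol appears three times, whereas the windowed average borrows from neighbouring multiplicities. The sample is far too small for the regime the slide targets, so this only shows the mechanics.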

  6. Experimental Results
     - Two-step distribution (plot omitted)

  7. Thank You
