Counting Contingency Tables, Igor Pak (UCLA)




  1. Counting Contingency Tables. Igor Pak, UCLA. Combinatorics Seminar, OSU, September 17, 2020.

  2. Contingency tables

     Fix integer vectors $a = (a_1, \dots, a_m)$ and $b = (b_1, \dots, b_n)$ with $a_i, b_j > 0$, such that
     $$\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j = N.$$
     A contingency table with margins $(a, b)$ is an $m \times n$ integer matrix $X = (x_{ij})$ such that
     $$\sum_{j=1}^{n} x_{ij} = a_i, \qquad \sum_{i=1}^{m} x_{ij} = b_j, \qquad x_{ij} \ge 0 \quad \forall\, i, j.$$
     We denote by $\mathcal{T}(a, b)$ the set of all such matrices, and $\mathrm{T}(a, b) := |\mathcal{T}(a, b)|$.

     Main problem: Compute $\mathrm{T}(a, b)$.
     That means: formula, algorithm, asymptotics, bounds, etc.
     More precisely: Do your best!
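To make the definition concrete, here is a minimal brute-force sketch in Python that counts tables row by row with memoization on the remaining column sums. The function name `count_tables` is an illustrative choice, not something from the talk, and the method is exponential in general, so it is only meant for very small margins.

```python
from functools import lru_cache


def count_tables(a, b):
    """Count nonnegative integer matrices with row sums a and column sums b."""
    if sum(a) != sum(b):
        return 0

    def rows(total, caps):
        # Yield every tuple x with sum(x) == total and 0 <= x[j] <= caps[j].
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for x in range(min(total, caps[0]) + 1):
            for rest in rows(total - x, caps[1:]):
                yield (x,) + rest

    @lru_cache(maxsize=None)
    def go(i, remaining):
        # remaining[j] is the part of column sum b[j] not yet used by rows 0..i-1.
        if i == len(a):
            return 1 if not any(remaining) else 0
        return sum(
            go(i + 1, tuple(r - x for r, x in zip(remaining, row)))
            for row in rows(a[i], remaining)
        )

    return go(0, tuple(b))
```

For example, `count_tables((1, 1, 1), (1, 1, 1))` counts the $3 \times 3$ permutation matrices.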

  3. Examples:

     • $a = b = (1, 1, 1)$ → $\mathrm{T}(a, b) = 6$
     • $a = b = (100, 100, 100)$ → $\mathrm{T}(a, b) = 13268976 \approx 1.3 \times 10^{7}$
     • $m = n = 10$, $a = b = (20, \dots, 20)$ → $\mathrm{T}(a, b) \approx 1.1 \times 10^{59}$ [Canfield–McKay, 2010]
     • $m = n = 30$, $a = b = (3, \dots, 3)$ → $\mathrm{T}(a, b) \approx 2.2 \times 10^{92}$
     • $m = n = 9$, $a = b = (10^5, \dots, 10^5)$ → $\mathrm{T}(a, b) \approx 6.1 \times 10^{279}$ [Beck–Pixton, 2003]
     • $m = n = 4$, $a = (220, 215, 93, 64)$, $b = (108, 286, 71, 127)$ → $\mathrm{T}(a, b) = 1225914276768514 \approx 1.2 \times 10^{15}$ [Des Jardins, 1994]
     • $a = (13070380, 18156451, 13365203, 20567424)$, $b = (12268303, 20733257, 17743591, 14414307)$ → $\mathrm{T}(a, b) \approx 4.3 \times 10^{61}$ [De Loera, 2009]
     • $m = n = 15$, $a = b = (10^5, \dots, 10^5)$ → $\mathrm{T}(a, b) \approx 1.7 \times 10^{819}$ [good estimate]
     • $m = n = 100$, $a = b = (10^3, \dots, 10^3)$ → $\mathrm{T}(a, b) \approx 6.3 \times 10^{33470}$ [good estimate]
     • $m = n = 100$, nonuniform margins, average 10 → ??? [can be done via SHM in under 200h CPU time]
     • $m = n = 1000$, nonuniform margins, average 100 → ??? [currently cannot be done in our lifetime]
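The two $3 \times 3$ values above can be checked against the classical closed form for $3 \times 3$ magic squares with line sum $r$, namely $\binom{r+4}{4} + \binom{r+3}{4} + \binom{r+2}{4}$. This formula is not on the slide; it is brought in here only as a sanity check, and the function name `magic_3x3` is illustrative.

```python
from math import comb


def magic_3x3(r):
    """Number of 3 x 3 nonnegative integer matrices with all line sums r.

    Uses the classical closed form (an outside fact, not from the slides).
    """
    return comb(r + 4, 4) + comb(r + 3, 4) + comb(r + 2, 4)


if __name__ == "__main__":
    print(magic_3x3(1), magic_3x3(100))
```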

  4. More Examples:

     Permutations: $m = n$, $a = b = (1, \dots, 1)$ → $\mathrm{T}(a, b) = n!$

     Magic squares: $m = n$, $a = b = (k, \dots, k)$ [when $k$ is fixed, $\mathrm{T}(a, b)$ is P-recursive]

     $k = 2$ → $\mathrm{T}(a, b) = c(n)$, where
     $$c(n) = n^2\, c(n-1) - \tfrac{1}{2}\, n (n-1)^2\, c(n-2), \qquad \text{so} \qquad c(n) \sim \frac{\sqrt{e}\,(n!)^2}{\sqrt{\pi n}}.$$

     $k = 3$ → $\mathrm{T}(a, b) = n!\, v(n)$, where
     $$\begin{aligned}
     576\, n \cdot v(n) ={}& (2880 n^2 - 5760 n + 3456)\, v(n-1) \\
     &+ (324 n^5 - 3564 n^4 + 14148 n^3 - 26028 n^2 + 21312 n - 6192)\, v(n-2) \\
     &+ (81 n^6 - 1377 n^5 + 7209 n^4 - 13203 n^3 - 3402 n^2 + 32076 n - 21384)\, v(n-3) \\
     &+ (-81 n^7 + 1944 n^6 - 20232 n^5 + 115578 n^4 - 383283 n^3 + 724230 n^2 - 708372 n + 270216)\, v(n-4) \\
     &+ (-72 n^6 + 1440 n^5 - 10890 n^4 + 40500 n^3 - 78678 n^2 + 75780 n - 28080)\, v(n-5) \\
     &+ (81 n^9 - 3321 n^8 + 59004 n^7 - 594054 n^6 + 3718687 n^5 - 14927199 n^4 + 38152096 n^3 - 59311746 n^2 + 50236612 n - 17330160)\, v(n-6) \\
     &+ (72 n^8 - 2520 n^7 + 37347 n^6 - 304479 n^5 + 1484133 n^4 - 4394565 n^3 + 7642248 n^2 - 7039116 n + 2576880)\, v(n-7) \\
     &+ (-198 n^9 + 8712 n^8 - 165175 n^7 + 1764196 n^6 - 11643772 n^5 + 48965728 n^4 - 130257475 n^3 + 209370724 n^2 - 182126340 n + 64083600)\, v(n-8) \\
     &+ (36 n^{10} - 1944 n^9 + 45884 n^8 - 621504 n^7 + 5330892 n^6 - 30123576 n^5 + 112954596 n^4 - 275612976 n^3 + 415021552 n^2 - 343920960 n + 116928000)\, v(n-9) \\
     &+ (-9 n^{11} + 585 n^{10} - 16800 n^9 + 280800 n^8 - 3027357 n^7 + 22034565 n^6 - 110039130 n^5 + 375129450 n^4 - 849926784 n^3 + 1208298600 n^2 - 958439520 n + 315705600)\, v(n-10) \\
     &+ (-7 n^{10} + 385 n^9 - 9240 n^8 + 127050 n^7 - 1104411 n^6 + 6314385 n^5 - 23918510 n^4 + 58866500 n^3 - 89275032 n^2 + 74400480 n - 25401600)\, v(n-11) \\
     &+ (n^{11} - 66 n^{10} + 1925 n^9 - 32670 n^8 + 357423 n^7 - 2637558 n^6 + 13339535 n^5 - 45995730 n^4 + 105258076 n^3 - 150917976 n^2 + 120543840 n - 39916800)\, v(n-12),
     \end{aligned}$$
     so
     $$v(n) \sim e^{2} \left( \frac{3 n^3}{4 e^3} \right)^{n} \sqrt{\frac{3 \pi n}{2}}.$$
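The $k = 2$ recurrence is short enough to run directly. The sketch below assumes the initial values $c(0) = c(1) = 1$, which the slide does not state, and compares $c(n)$ with the asymptotic $\sqrt{e}\,(n!)^2 / \sqrt{\pi n}$; the function names are illustrative.

```python
from math import e, factorial, pi, sqrt


def c_values(n_max):
    """Return [c(0), ..., c(n_max)] from the k = 2 recurrence.

    Assumes c(0) = c(1) = 1 (initial values are not given on the slide).
    """
    c = [1, 1]
    for n in range(2, n_max + 1):
        # n * (n - 1) is even, so the division by 2 is exact.
        c.append(n * n * c[n - 1] - n * (n - 1) ** 2 * c[n - 2] // 2)
    return c[: n_max + 1]


def asymptotic_ratio(n):
    """c(n) divided by sqrt(e) * (n!)^2 / sqrt(pi * n); should approach 1."""
    approx = sqrt(e) * factorial(n) ** 2 / sqrt(pi * n)
    return c_values(n)[n] / approx
```

The first values produced this way can be compared against a direct count of $n \times n$ tables with all margins equal to 2 for small $n$.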

  5. Complexity aspects: bad news all around

     Theorem [Narayanan, 2006] Computing $\mathrm{T}(a, b)$ is #P-complete.
     Theorem [P.–Panova, 2020+, former folklore conjecture] Computing $\mathrm{T}(a, b)$ is strongly #P-complete (i.e. for the input $a_i, b_j$ in unary).
     Corollary [P.–Panova, 2020+] Computing:
     ◦ Kostka numbers $K_{\lambda\mu}$ and Littlewood–Richardson coefficients $c^{\lambda}_{\mu\nu}$ is strongly #P-complete
     ◦ Schubert coefficients is #P-complete
     ◦ Kronecker coefficients $g(\lambda, \mu, \nu)$ and reduced Kronecker coefficients $\overline{g}(\lambda, \mu, \nu)$ is #P-hard
     Note: The last part is known from [Ikenmeyer–Mulmuley–Walter, 2017] and [P.–Panova, 2020], resp.
     Moral: Asymptotic formulas and approximate counting are the best one can hope for.

  6. Connections and Applications

     • Random networks: contingency tables ↔ bipartite graphs with fixed degrees
       Note: graphs with fixed degrees ↔ symmetric binary (0-1) CTs with 0 diagonal; numerous papers on all aspects of these, see e.g. [Wormald, 2018 ICM survey]
     • Statistics

     Key observation: Random sampling ↔ approximate counting
     Self-reduction:
     $$P(x_{11} \ge t) = \frac{\mathrm{T}(a_1 - t, a_2, \dots;\ b_1 - t, b_2, \dots)}{\mathrm{T}(a_1, a_2, \dots;\ b_1, b_2, \dots)}$$
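The self-reduction identity can be checked by exhaustive enumeration on small margins: subtracting $t$ from $x_{11}$ maps tables with $x_{11} \ge t$ bijectively onto tables with the reduced margins. The following is a minimal sketch with illustrative function names; it enumerates every table, so it is only usable for tiny examples.

```python
from fractions import Fraction


def tables(a, b):
    """Yield all nonnegative integer matrices (tuples of rows) with margins (a, b)."""

    def rows(total, caps):
        if not caps:
            if total == 0:
                yield ()
            return
        for x in range(min(total, caps[0]) + 1):
            for rest in rows(total - x, caps[1:]):
                yield (x,) + rest

    def go(i, remaining):
        if i == len(a):
            if not any(remaining):
                yield ()
            return
        for row in rows(a[i], remaining):
            left = tuple(r - x for r, x in zip(remaining, row))
            for rest in go(i + 1, left):
                yield (row,) + rest

    yield from go(0, tuple(b))


def count(a, b):
    return sum(1 for _ in tables(a, b))


def prob_x11_at_least(a, b, t):
    """Left-hand side: P(x_11 >= t) for a uniformly random table."""
    all_tables = list(tables(a, b))
    return Fraction(sum(1 for X in all_tables if X[0][0] >= t), len(all_tables))


def self_reduction_ratio(a, b, t):
    """Right-hand side: ratio of counts with a_1 and b_1 both reduced by t."""
    reduced = count((a[0] - t,) + tuple(a[1:]), (b[0] - t,) + tuple(b[1:]))
    return Fraction(reduced, count(a, b))
```

In an actual sampler the ratio on the right would be estimated approximately rather than computed exactly, which is the link between sampling and approximate counting noted above.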

  7. Descendants of Queen Victoria (1819–1901)

     Question: Is there a dependence between Birthday and Deathday of the 82 (dead) descendants?

     Testing correlation for $X = (x_{ij})$ (after Diaconis–Efron, 1985):
     • Generate a large number $N$ of random samples and compute their $\chi^2$,
     • Output the fraction $a/N$, where $a$ = number of samples with $\chi^2 \le \chi^2(X)$.
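The two steps of the test can be sketched as follows, assuming the samples are tables with the same margins as $X$ and that $\chi^2$ is the usual Pearson statistic against the expected values $a_i b_j / N$ (the slide does not spell out the statistic). How the samples are generated is left to the caller; the function names are illustrative.

```python
from fractions import Fraction


def chi2(X):
    """Pearson chi-square of table X against expected counts a_i * b_j / N."""
    a = [sum(row) for row in X]
    b = [sum(col) for col in zip(*X)]
    total = sum(a)
    stat = Fraction(0)
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            expected = Fraction(a[i] * b[j], total)  # margins assumed positive
            stat += (x - expected) ** 2 / expected
    return stat


def fraction_at_most(samples, X):
    """Fraction of sampled tables whose chi-square is at most that of X."""
    threshold = chi2(X)
    hits = sum(1 for Y in samples if chi2(Y) <= threshold)
    return Fraction(hits, len(samples))
```

Exact rational arithmetic is used here only to keep the comparison free of rounding issues; floats would be the natural choice for large samples.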

  8. Birthday–Deathday example analysis:

     Figure 1. Plot of $\chi^2$ from [Diaconis–Sturmfels] and [Dittmer–Pak]

     Setup: $\chi^2(X) \approx 115.56$, so p-value = percentage of tables with $\chi^2 \le 115.56$.
     Hypothesis: There is NO dependence between Birthday and Deathday.
     [Diaconis–Sturmfels, 1998]: From $10^6$ trials of the Diaconis–Gangolli MC, they get $p \approx 37.75\%$ → Accept!
     [Dittmer–P., 2019+]: From $5 \times 10^4$ trials using our new SHM MC, we get $p \approx 0.10\%$ → Reject!

     First Moral: It’s important to get good uniform samples from $\mathcal{T}(a, b)$. Otherwise, you might actually start to believe that there is NO dependence.
     Second Moral: Dependence, really??? Ah, well, the model was faulty...

  9. Exact and approximate counting results

     Below: $m \le n$, $a_1 \ge \dots \ge a_m$, $b_1 \ge \dots \ge b_n$.
     ◦ Exact counting in poly-time for $m, n = O(1)$ [Barvinok’93]
     ◦ Exact counting in poly-time for $a_1, b_1 = O(1)$ via dynamic programming.
     ◦ Quasi-poly time approx counting for $a_1/a_m,\ b_1/b_n < 1.6$ and $m = \Theta(n)$ [Barvinok et al, 2010].
     ◦ Poly-time approx counting for $m = O(1)$ [Cryan, Dyer 2003]
     ◦ Poly-time approx counting for $a_m = \Omega(n^{3/2} m \log m)$ and $b_n = \Omega(m^{3/2} n \log n)$ [Dyer–Kannan–Mount, 1997], [Morris, 2002]
     ◦ Poly-time approx counting for $a_1, b_1 = \Omega(n^{1/4 - \varepsilon})$, $\varepsilon > 0$ and $m = \Theta(n)$ [Dittmer–P., 2019+]
     ◦ Poly-time approx counting for all $a_i, b_j = \Theta(n^{1 - \varepsilon})$, $\varepsilon > 0$ and $m = \Theta(n)$ [Dittmer–P., 2019+]
     Note: The four poly-time approximate counting results are all MCMC-based FPRAS.

  10. Diaconis–Gangolli Markov chain (1995)

      STEP: choose a random $2 \times 2$ submatrix, and make either of the following changes:
      $$\begin{pmatrix} +1 & -1 \\ -1 & +1 \end{pmatrix} \qquad \text{or} \qquad \begin{pmatrix} -1 & +1 \\ +1 & -1 \end{pmatrix}$$
      (stay put if this is impossible).
      Note: Use hit-and-run for large $a_1, b_1$.
      Note: Early theoretical results in [Diaconis–Saloff-Coste, 1995], [Chung–Graham–Yau, 1996]

      Split–Hyper–Merge (SHM) Markov chain [Dittmer–P., 2019+]

      Idea: Use Burnside processes [Jerrum, 1993] ← probabilistic version of the Burnside Lemma.
      Lemma: $\mathcal{T}(a, b)$ is in bijection with the set of orbits of the group
      $$\Sigma := \mathrm{Sym}(a_1) \times \dots \times \mathrm{Sym}(a_m) \times \mathrm{Sym}(b_1) \times \dots \times \mathrm{Sym}(b_n)$$
      acting on $S_N$, the set of $N \times N$ permutation matrices.
      Conjecture: For $a_1 b_1 \le \mathrm{poly}(mn)$, both DG and SHM Markov chains mix in polynomial time.
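One step of the Diaconis–Gangolli chain can be sketched as below. This is a minimal illustration of the STEP rule only, assuming a table stored as a list of lists with at least two rows and two columns; it does not include the hit-and-run variant or the SHM chain, and the name `dg_step` is illustrative.

```python
import random


def dg_step(X, rng):
    """Apply one Diaconis-Gangolli move to table X in place.

    Picks two distinct rows and two distinct columns at random, then tries to
    add +1/-1/-1/+1 or its negation to that 2 x 2 submatrix. The move is
    skipped (the chain stays put) if an entry would become negative.
    """
    m, n = len(X), len(X[0])
    i1, i2 = rng.sample(range(m), 2)
    j1, j2 = rng.sample(range(n), 2)
    s = rng.choice((1, -1))
    # Entries that decrease by 1 under the chosen sign.
    decreasing = [(i1, j2), (i2, j1)] if s == 1 else [(i1, j1), (i2, j2)]
    if any(X[i][j] == 0 for i, j in decreasing):
        return  # stay put
    X[i1][j1] += s
    X[i2][j2] += s
    X[i1][j2] -= s
    X[i2][j1] -= s
```

Each move changes two entries in every affected row and column by opposite amounts, so all row and column sums are preserved.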

  11. Why contingency tables are orbits:

      [Figure: $5 \times 5$ permutation matrices, the action of $\Sigma$, and the table $X = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$.]

      Here $X \in \mathcal{T}(3, 2;\ 3, 2)$ corresponds to orbit representative $M \in S_5$ under the action of $\Sigma = S_3 \times S_2 \times S_3 \times S_2$.

      Testing SHM chain on the Birthday–Deathday example
      [Figure: plot of $\chi^2$.]
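The orbit correspondence can be illustrated by mapping a permutation to its table of block sums: split the rows of the permutation matrix into consecutive blocks of sizes $a_i$ and the columns into blocks of sizes $b_j$, then count the ones in each block. The sketch below uses a hypothetical permutation, not the matrix $M$ from the slide, and an illustrative function name.

```python
from bisect import bisect_right
from itertools import accumulate, permutations


def block_table(sigma, a, b):
    """Map a permutation (sigma[k] = column of the 1 in row k) to its block-sum table."""
    row_ends = list(accumulate(a))  # row block i covers indices below row_ends[i]
    col_ends = list(accumulate(b))
    X = [[0] * len(b) for _ in a]
    for k, image in enumerate(sigma):
        i = bisect_right(row_ends, k)
        j = bisect_right(col_ends, image)
        X[i][j] += 1
    return X


if __name__ == "__main__":
    # Hypothetical example permutation in S_5 with blocks (3, 2) x (3, 2).
    print(block_table([2, 0, 3, 1, 4], (3, 2), (3, 2)))
    # Distinct tables reached over all of S_5.
    images = {
        tuple(map(tuple, block_table(p, (3, 2), (3, 2))))
        for p in permutations(range(5))
    }
    print(len(images))
```

Permuting rows within a row block or columns within a column block leaves the block sums unchanged, which is why the table depends only on the $\Sigma$-orbit.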

  12. Independence heuristic [Good, 1950]: $\mathrm{T}(a, b) \approx \mathrm{G}(a, b)$, where
      $$\mathrm{G}(a, b) := \binom{N + mn - 1}{mn - 1}^{-1} \prod_{i=1}^{m} \binom{a_i + n - 1}{n - 1} \prod_{j=1}^{n} \binom{b_j + m - 1}{m - 1}.$$

      Good’s reasoning [Good, 1976]: Let $\mathcal{S}(N, m, n)$ be the set of $m \times n$ tables with total sum $N$, so
      $$|\mathcal{S}(N, m, n)| = \binom{N + mn - 1}{mn - 1}.$$
      Observe:
      $$P(X \text{ has row sums } a) = \frac{1}{|\mathcal{S}(N, m, n)|} \prod_{i=1}^{m} \binom{a_i + n - 1}{n - 1},$$
      $$P(X \text{ has column sums } b) = \frac{1}{|\mathcal{S}(N, m, n)|} \prod_{j=1}^{n} \binom{b_j + m - 1}{m - 1}.$$
      If these events are asymptotically independent:
      $$\frac{\mathrm{T}(a, b)}{|\mathcal{S}(N, m, n)|} = P(X \text{ has row sums } a, \text{ column sums } b) \approx \frac{1}{|\mathcal{S}(N, m, n)|} \prod_{i=1}^{m} \binom{a_i + n - 1}{n - 1} \times \frac{1}{|\mathcal{S}(N, m, n)|} \prod_{j=1}^{n} \binom{b_j + m - 1}{m - 1}.$$
      “the conjecture appears to be confirmed” [...] “leaving aside finer points of rigor”.

  13. Does the independence heuristic work?

      For the Birthday–Deathday example with $N = 592$:
      $\mathrm{T}(a, b) = 1.226 \times 10^{15}$ vs. $\mathrm{G}(a, b) = 1.211 \times 10^{15}$
      For the large $4 \times 4$ case with $N = 65159458$ [De Loera]:
      $\mathrm{T}(a, b) = 4.3 \times 10^{61}$ vs. $\mathrm{G}(a, b) = 3.7 \times 10^{61}$

      Theorem [Canfield–McKay, 2010] For $m = n$, $a = b = (k, \dots, k)$, $k = \omega(1)$, $k = O(\log n)$:
      $$\mathrm{T}(a, b) \sim \sqrt{e} \cdot \mathrm{G}(a, b) \quad \text{as } n \to \infty.$$
      Theorem [Greenhill–McKay, 2008] For $m = n$, $a_1 b_1 = o(N^{2/3})$:
      $$\mathrm{T}(a, b) \sim \sqrt{e} \cdot \mathrm{G}(a, b) \quad \text{as } n \to \infty.$$
      Theorem [Barvinok, 2009] For $m = n$, $a = b = (Bn, \dots, Bn, n, \dots, n)$, with $\theta n$ sums $Bn$:
      $$\lim_{n \to \infty} \frac{1}{n^2} \log \mathrm{T}(a, b) > \lim_{n \to \infty} \frac{1}{n^2} \log \mathrm{G}(a, b) \quad \text{for all } B > 1.$$
