Kolmogorov complexity as a language


CSR-2011. Alexander Shen, LIF CNRS, Marseille (on leave).
◮ A powerful tool?
◮ Just a way to reformulate arguments?
◮ three languages: combinatorial, probabilistic (Shannon entropy), algorithmic (Kolmogorov complexity)




Foundations of probability theory
◮ Random object or random process?
◮ “well shuffled deck of cards”: any meaning? [xkcd cartoon]
◮ randomness = incompressibility (maximal complexity)
◮ ω = ω₁ω₂… is random iff KM(ω₁…ωₙ) ≥ n − O(1)
◮ Classical probability theory: a random sequence satisfies the Strong Law of Large Numbers with probability 1
◮ Algorithmic version: every (algorithmically) random sequence satisfies the SLLN
◮ algorithmic ⇒ classical: Martin-Löf random sequences form a set of measure 1
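A quick, informal illustration of the last few bullets (a sketch only: zlib's compressed size is used here as a crude stand-in for Kolmogorov complexity, which is not computable):

```python
import random
import zlib

# Sketch: a typical (random) bit string has frequency of ones close to 1/2 (SLLN)
# and is nearly incompressible; zlib is only a rough stand-in for complexity.
n = 100_000
bits = [random.getrandbits(1) for _ in range(n)]

freq = sum(bits) / n                      # SLLN: should be close to 0.5
packed = bytes(sum(b << i for i, b in enumerate(bits[j:j + 8]))
               for j in range(0, n, 8))   # pack 8 bits per byte
ratio = len(zlib.compress(packed, 9)) / len(packed)

print(f"frequency of ones: {freq:.4f}, compression ratio: {ratio:.3f}")
# frequency near 0.5, ratio near 1.0: no real compression of an incompressible string
```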







Sampling random strings (S. Aaronson)
◮ A device that (being switched on) produces an N-bit string and stops
◮ “The device produces a random string”: what does it mean?
◮ classical: the output distribution is close to the uniform one
◮ effective: with high probability the output string is incompressible
◮ not equivalent if no assumptions are made about the device
◮ but they are related under some assumptions



Example: matrices without uniform minors
◮ a k × k minor of an n × n Boolean matrix: select k rows and k columns
◮ a minor is uniform if it is all-0 or all-1
◮ claim: there is an n × n bit matrix without k × k uniform minors for k = 3 log n










Counting argument and complexity reformulation
◮ ≤ n^k × n^k positions of the minor [k = 3 log n]
◮ 2 types of uniform minors (0/1)
◮ 2^(n^2 − k^2) possibilities for the rest
◮ n^(2k) × 2 × 2^(n^2 − k^2) = 2^(6 log^2 n + 1 + (n^2 − 9 log^2 n)) < 2^(n^2), so some matrix contains no uniform minor
◮ complexity version: log n bits to specify a column or row, so 2k log n bits in total
◮ one additional bit to specify the type of minor (0/1)
◮ n^2 − k^2 bits to specify the rest of the matrix
◮ 2k log n + 1 + (n^2 − k^2) = 6 log^2 n + 1 + (n^2 − 9 log^2 n) < n^2, so a matrix of complexity ≥ n^2 has no uniform k × k minor
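The arithmetic in the last bullet is easy to check numerically; a minimal sketch (assuming binary logarithms and k = 3 log₂ n):

```python
from math import log2

# Describing a matrix that HAS a uniform k x k minor takes
#   2*k*log2(n) bits (rows and columns of the minor)
# + 1 bit          (all-0 or all-1)
# + n^2 - k^2 bits (the remaining entries),
# which is strictly less than n^2 bits when k = 3*log2(n).
for n in (2**6, 2**8, 2**10, 2**12):
    k = 3 * log2(n)
    description = 2 * k * log2(n) + 1 + (n**2 - k**2)
    assert description < n**2
    print(n, n**2 - description)   # savings: 3*(log2 n)^2 - 1 bits
```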




One-tape Turing machines
◮ copying an n-bit string on a 1-tape TM requires Ω(n²) time
◮ complexity version: if initially the tape was empty to the right of the border, then after n steps the complexity of the contents u(t) of the zone that is d cells away from the border is O(n/d): K(u(t)) ≤ O(n/d)
◮ proof: put border guards in each cell of the security zone near the border; each guard writes down the contents of the head of the TM whenever it passes; each of these records is enough to reconstruct u(t), so its length should be Ω(K(u(t))); the sum of the lengths does not exceed the running time






Everywhere complex sequences
◮ a random sequence has n-bit prefixes of complexity close to n
◮ but some factors (substrings) have small complexity
◮ Levin: there exist everywhere complex sequences: every n-bit substring has complexity at least 0.99n − O(1)
◮ combinatorial equivalent: let F be a set of strings that has at most 2^(0.99n) strings of length n; then there is a sequence ω such that all sufficiently long substrings of ω are not in F
◮ the combinatorial and complexity proofs are not just translations of each other (Lovász local lemma; Rumyantsev, Miller, Muchnik)
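A tiny informal illustration of the second bullet (not from the slides): a typical random string contains a run of about log₂ n zeros, i.e. a substring of very small complexity.

```python
import random

# Even a typical (incompressible) string has simple substrings:
# the longest run of zeros in n random bits is usually about log2(n) long.
n = 1_000_000
s = ''.join(random.choice('01') for _ in range(n))
longest_zero_run = max(len(block) for block in s.split('1'))
print(longest_zero_run)   # typically around 20 for n = 10^6
```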







Gilbert–Varshamov complexity bound
◮ coding theory: how many n-bit strings x₁, …, x_k can one find if the Hamming distance between every two is at least d?
◮ lower bound (Gilbert–Varshamov)
◮ then < d/2 changed bits are harmless
◮ but bit insertions or deletions could be harmful
◮ general requirement: C(x_i | x_j) ≥ d
◮ generalization of the GV bound: a d-separated family of size Ω(2^(n−d))
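For the classical lower bound, the standard greedy construction can be run on toy parameters; a hedged sketch (small n only, exponential time):

```python
from itertools import product
from math import comb

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Greedy Gilbert–Varshamov construction: scan all n-bit strings and keep one
# whenever it is at distance >= d from everything kept so far.
n, d = 10, 3
code = []
for x in product((0, 1), repeat=n):
    if all(hamming(x, c) >= d for c in code):
        code.append(x)

# Every string is within d-1 of some codeword (otherwise it would have been kept),
# so the code has size at least 2^n divided by the volume of a radius-(d-1) ball.
ball = sum(comb(n, i) for i in range(d))
assert len(code) >= 2**n / ball
print(len(code), 2**n / ball)
```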






Inequalities for complexities and combinatorial interpretation
◮ C(x, y) ≤ C(x) + C(y|x) + O(log)
◮ C(x, y) ≥ C(x) + C(y|x) − O(log)
◮ C(x, y) < k + l ⇒ C(x) < k + O(log) or C(y|x) < l + O(log)
◮ combinatorial counterpart: every set A of size < 2^(k+l) can be split into two parts A = A₁ ∪ A₂ such that w(A₁) ≤ 2^k and h(A₂) ≤ 2^l (w = number of distinct first coordinates, h = maximal size of a section)
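A minimal sketch of the combinatorial splitting (the helper name `split` is only illustrative): every point whose section is "fat" goes to A₁, everything else to A₂.

```python
import random
from collections import defaultdict

# Split A (|A| < 2^(k+l)) into A1 ∪ A2 with:
#   w(A1) <= 2^k : A1 uses at most 2^k distinct first coordinates
#   h(A2) <= 2^l : every section {y : (x,y) in A2} has at most 2^l elements
def split(A, k, l):
    sections = defaultdict(set)
    for x, y in A:
        sections[x].add(y)
    fat = {x for x, ys in sections.items() if len(ys) > 2**l}
    A1 = {(x, y) for x, y in A if x in fat}
    return A1, A - A1

k, l = 5, 6
A = set()
while len(A) < 2**(k + l) - 1:
    A.add((random.randrange(40), random.randrange(5000)))
A1, A2 = split(A, k, l)

assert len({x for x, _ in A1}) <= 2**k        # width of A1
counts = defaultdict(int)
for x, _ in A2:
    counts[x] += 1
assert all(c <= 2**l for c in counts.values())  # height of A2
```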




One more inequality
◮ 2 C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z)
◮ combinatorial counterpart: V² ≤ S₁ · S₂ · S₃ (the square of the size of a finite 3-dimensional set is at most the product of the sizes of its three 2-dimensional projections)
◮ also holds for Shannon entropies; a special case of the Shearer lemma
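The combinatorial form is easy to test on random data; a small sketch:

```python
import random

# V^2 <= S1 * S2 * S3 for a finite set of 3-d points and its three projections.
V = {(random.randrange(8), random.randrange(8), random.randrange(8))
     for _ in range(150)}
S1 = {(y, z) for x, y, z in V}
S2 = {(x, z) for x, y, z in V}
S3 = {(x, y) for x, y, z in V}
assert len(V) ** 2 <= len(S1) * len(S2) * len(S3)
print(len(V) ** 2, len(S1) * len(S2) * len(S3))
```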




Common information and graph minors
◮ mutual information: I(a : b) = C(a) + C(b) − C(a, b)
◮ common information:
◮ combinatorial version, in terms of graph minors: can the graph be covered by 2^δ minors of size 2^(α−δ) × 2^(β−δ)?







Almost uniform sets
◮ nonuniformity = (maximal section) / (average section)
◮ Theorem: every set of N elements can be represented as a union of polylog(N) sets whose nonuniformity is polylog(N)
◮ there is a multidimensional version
◮ how to construct the parts using Kolmogorov complexity: take strings with given complexity bounds
◮ the argument is so simple that it is not clear what the combinatorial translation is
◮ but a combinatorial argument exists (and gives an even stronger result)
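A one-dimensional toy version of the idea (a sketch under the assumption that "section" means a row of a subset of X × Y; this is not the multidimensional theorem itself): bucket whole rows by the rounded logarithm of their size, so each bucket has nonuniformity at most 2 and there are only O(log N) buckets.

```python
import random
from collections import defaultdict

A = {(random.randrange(200), random.randrange(5000)) for _ in range(30000)}

rows = defaultdict(set)
for x, y in A:
    rows[x].add(y)

# Group whole rows by the bit length of their size: sizes inside one bucket
# differ by at most a factor of 2, hence nonuniformity <= 2 per bucket.
buckets = defaultdict(list)
for x, ys in rows.items():
    buckets[len(ys).bit_length()].append(len(ys))

for sizes in buckets.values():
    assert max(sizes) / (sum(sizes) / len(sizes)) <= 2
print(len(buckets), "parts for", len(A), "elements")
```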







Shannon coding theorem
◮ ξ is a random variable with k values and probabilities p₁, …, p_k
◮ ξ^N: N independent trials of ξ
◮ Shannon’s informal question: how many bits are needed to encode a “typical” value of ξ^N?
◮ Shannon’s answer: NH(ξ), where H(ξ) = p₁ log(1/p₁) + … + p_k log(1/p_k)
◮ the formal statement is a bit complicated
◮ complexity version: with high probability the value of ξ^N has complexity close to NH(ξ)
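A rough empirical illustration (not the theorem: zlib only provides an upper bound on the complexity of one particular sample):

```python
import random
import zlib
from math import log2

# N i.i.d. biased bits, entropy H per bit: Shannon (and the complexity version)
# says a typical outcome needs about N*H bits; zlib on the packed bits gives a
# concrete upper bound that is far below N and not far above N*H.
p, N = 0.1, 100_000
H = p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))        # about 0.469 bits per bit
bits = [int(random.random() < p) for _ in range(N)]
packed = bytes(sum(b << i for i, b in enumerate(bits[j:j + 8]))
               for j in range(0, N, 8))

compressed_bits = 8 * len(zlib.compress(packed, 9))
print(f"N*H = {N * H:.0f} bits, zlib = {compressed_bits} bits, trivial = {N} bits")
assert N * H < compressed_bits < N
```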







Complexity, entropy and group size
◮ 2 C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z) + O(log)
◮ the same for entropy: 2 H(ξ, η, τ) ≤ H(ξ, η) + H(ξ, τ) + H(η, τ)
◮ …and even for the sizes of subgroups U, V, W of some finite group G: 2 log(|G| / |U ∩ V ∩ W|) ≤ log(|G| / |U ∩ V|) + log(|G| / |U ∩ W|) + log(|G| / |V ∩ W|)
◮ in all three cases the valid inequalities are the same (Romashchenko, Chan, Yeung)
◮ some of them are quite strange: I(a : b) ≤ I(a : b | c) + I(a : b | d) + I(c : d) + I(a : b | e) + I(a : e | b) + I(b : e | a)
◮ related to Romashchenko’s theorem: if the last three terms are zeros, one can extract the common information from a, b, e
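The entropy form of the first inequality can be checked numerically for any joint distribution; a short sketch with a random one:

```python
import random
from math import log2

def H(dist):
    # Shannon entropy of a distribution given as {outcome: probability}.
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Random joint distribution of three variables on {0,1,2}^3.
raw = {(x, y, z): random.random()
       for x in range(3) for y in range(3) for z in range(3)}
total = sum(raw.values())
pxyz = {k: v / total for k, v in raw.items()}

def marginal(keep):
    m = {}
    for outcome, p in pxyz.items():
        key = tuple(outcome[i] for i in keep)
        m[key] = m.get(key, 0.0) + p
    return m

lhs = 2 * H(pxyz)
rhs = H(marginal((0, 1))) + H(marginal((1, 2))) + H(marginal((0, 2)))
assert lhs <= rhs + 1e-9
print(lhs, rhs)
```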

Muchnik and Slepian–Wolf
◮ a, b: two strings
◮ we look for a program p that maps a to b
◮ by definition C(p) is at least C(b|a), but it could be higher
◮ there exists a p mapping a to b that is simple relative to b, e.g., “map everything to b”
◮ Muchnik’s theorem: the two conditions can be combined: there exists p mapping a to b such that C(p) ≤ C(b|a) + O(log) and C(p|b) = O(log)
◮ information theory analog: Slepian–Wolf
◮ a similar technique was developed by Fortnow and Laplante (randomness extractors)
◮ (Romashchenko, Musatov): how to use explicit extractors and derandomization to get space-bounded versions


