Kolmogorov complexity as a language


CSR-2011. Alexander Shen, LIF CNRS, Marseille (on leave).
◮ A powerful tool?
◮ Just a way to reformulate arguments?
◮ three languages: combinatorial, probabilistic (Shannon entropy), algorithmic (Kolmogorov complexity)




Foundations of probability theory
◮ Random object or random process?
◮ “well shuffled deck of cards”: any meaning? [xkcd cartoon]
◮ randomness = incompressibility (maximal complexity)
◮ ω = ω₁ω₂… is random iff KM(ω₁…ωₙ) ≥ n − O(1)
◮ Classical probability theory: a random sequence satisfies the Strong Law of Large Numbers with probability 1
◮ Algorithmic version: every (algorithmically) random sequence satisfies the SLLN
◮ algorithmic ⇒ classical: Martin-Löf random sequences form a set of measure 1
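A quick, informal illustration of the last few bullets (a sketch only: zlib's compressed size is used here as a crude stand-in for Kolmogorov complexity, which is not computable):

```python
import random
import zlib

# Sketch: a typical (random) bit string has frequency of ones close to 1/2 (SLLN)
# and is nearly incompressible; zlib is only a rough stand-in for complexity.
n = 100_000
bits = [random.getrandbits(1) for _ in range(n)]

freq = sum(bits) / n                      # SLLN: should be close to 0.5
packed = bytes(sum(b << i for i, b in enumerate(bits[j:j + 8]))
               for j in range(0, n, 8))   # pack 8 bits per byte
ratio = len(zlib.compress(packed, 9)) / len(packed)

print(f"frequency of ones: {freq:.4f}, compression ratio: {ratio:.3f}")
# frequency near 0.5, ratio near 1.0: no real compression of an incompressible string
```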







Sampling random strings (S. Aaronson)
◮ A device that (being switched on) produces an N-bit string and stops
◮ “The device produces a random string”: what does it mean?
◮ classical: the output distribution is close to the uniform one
◮ effective: with high probability the output string is incompressible
◮ not equivalent if no assumptions are made about the device
◮ but they are related under some assumptions



Example: matrices without uniform minors
◮ a k × k minor of an n × n Boolean matrix: select k rows and k columns
◮ a minor is uniform if it is all-0 or all-1
◮ claim: there is an n × n bit matrix without k × k uniform minors for k = 3 log n










Counting argument and complexity reformulation
◮ ≤ n^k × n^k positions of the minor [k = 3 log n]
◮ 2 types of uniform minors (0/1)
◮ 2^(n^2 − k^2) possibilities for the rest
◮ n^(2k) × 2 × 2^(n^2 − k^2) = 2^(6 log^2 n + 1 + (n^2 − 9 log^2 n)) < 2^(n^2), so some matrix contains no uniform minor
◮ complexity version: log n bits to specify a column or row, so 2k log n bits in total
◮ one additional bit to specify the type of minor (0/1)
◮ n^2 − k^2 bits to specify the rest of the matrix
◮ 2k log n + 1 + (n^2 − k^2) = 6 log^2 n + 1 + (n^2 − 9 log^2 n) < n^2, so a matrix of complexity ≥ n^2 has no uniform k × k minor
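The arithmetic in the last bullet is easy to check numerically; a minimal sketch (assuming binary logarithms and k = 3 log₂ n):

```python
from math import log2

# Describing a matrix that HAS a uniform k x k minor takes
#   2*k*log2(n) bits (rows and columns of the minor)
# + 1 bit          (all-0 or all-1)
# + n^2 - k^2 bits (the remaining entries),
# which is strictly less than n^2 bits when k = 3*log2(n).
for n in (2**6, 2**8, 2**10, 2**12):
    k = 3 * log2(n)
    description = 2 * k * log2(n) + 1 + (n**2 - k**2)
    assert description < n**2
    print(n, n**2 - description)   # savings: 3*(log2 n)^2 - 1 bits
```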




One-tape Turing machines
◮ copying an n-bit string on a 1-tape TM requires Ω(n²) time
◮ complexity version: if initially the tape was empty to the right of the border, then after n steps the complexity of the contents u(t) of the zone that is d cells away from the border is O(n/d): K(u(t)) ≤ O(n/d)
◮ proof: put border guards in each cell of the security zone near the border; each guard writes down the contents of the head of the TM whenever it passes; each of these records is enough to reconstruct u(t), so its length should be Ω(K(u(t))); the sum of the lengths does not exceed the running time






Everywhere complex sequences
◮ a random sequence has n-bit prefixes of complexity close to n
◮ but some factors (substrings) have small complexity
◮ Levin: there exist everywhere complex sequences: every n-bit substring has complexity at least 0.99n − O(1)
◮ combinatorial equivalent: let F be a set of strings that has at most 2^(0.99n) strings of length n; then there is a sequence ω such that all sufficiently long substrings of ω are not in F
◮ the combinatorial and complexity proofs are not just translations of each other (Lovász local lemma; Rumyantsev, Miller, Muchnik)
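A tiny informal illustration of the second bullet (not from the slides): a typical random string contains a run of about log₂ n zeros, i.e. a substring of very small complexity.

```python
import random

# Even a typical (incompressible) string has simple substrings:
# the longest run of zeros in n random bits is usually about log2(n) long.
n = 1_000_000
s = ''.join(random.choice('01') for _ in range(n))
longest_zero_run = max(len(block) for block in s.split('1'))
print(longest_zero_run)   # typically around 20 for n = 10^6
```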







Gilbert–Varshamov complexity bound
◮ coding theory: how many n-bit strings x₁, …, x_k can one find if the Hamming distance between every two is at least d?
◮ lower bound (Gilbert–Varshamov)
◮ then < d/2 changed bits are harmless
◮ but bit insertions or deletions could be harmful
◮ general requirement: C(x_i | x_j) ≥ d
◮ generalization of the GV bound: a d-separated family of size Ω(2^(n−d))
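For the classical lower bound, the standard greedy construction can be run on toy parameters; a hedged sketch (small n only, exponential time):

```python
from itertools import product
from math import comb

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Greedy Gilbert–Varshamov construction: scan all n-bit strings and keep one
# whenever it is at distance >= d from everything kept so far.
n, d = 10, 3
code = []
for x in product((0, 1), repeat=n):
    if all(hamming(x, c) >= d for c in code):
        code.append(x)

# Every string is within d-1 of some codeword (otherwise it would have been kept),
# so the code has size at least 2^n divided by the volume of a radius-(d-1) ball.
ball = sum(comb(n, i) for i in range(d))
assert len(code) >= 2**n / ball
print(len(code), 2**n / ball)
```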






Inequalities for complexities and combinatorial interpretation
◮ C(x, y) ≤ C(x) + C(y|x) + O(log)
◮ C(x, y) ≥ C(x) + C(y|x) − O(log)
◮ C(x, y) < k + l ⇒ C(x) < k + O(log) or C(y|x) < l + O(log)
◮ combinatorial counterpart: every set A of size < 2^(k+l) can be split into two parts A = A₁ ∪ A₂ such that w(A₁) ≤ 2^k and h(A₂) ≤ 2^l (w = number of distinct first coordinates, h = maximal size of a section)
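A minimal sketch of the combinatorial splitting (the helper name `split` is only illustrative): every point whose section is "fat" goes to A₁, everything else to A₂.

```python
import random
from collections import defaultdict

# Split A (|A| < 2^(k+l)) into A1 ∪ A2 with:
#   w(A1) <= 2^k : A1 uses at most 2^k distinct first coordinates
#   h(A2) <= 2^l : every section {y : (x,y) in A2} has at most 2^l elements
def split(A, k, l):
    sections = defaultdict(set)
    for x, y in A:
        sections[x].add(y)
    fat = {x for x, ys in sections.items() if len(ys) > 2**l}
    A1 = {(x, y) for x, y in A if x in fat}
    return A1, A - A1

k, l = 5, 6
A = set()
while len(A) < 2**(k + l) - 1:
    A.add((random.randrange(40), random.randrange(5000)))
A1, A2 = split(A, k, l)

assert len({x for x, _ in A1}) <= 2**k        # width of A1
counts = defaultdict(int)
for x, _ in A2:
    counts[x] += 1
assert all(c <= 2**l for c in counts.values())  # height of A2
```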




One more inequality
◮ 2 C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z)
◮ combinatorial counterpart: V² ≤ S₁ · S₂ · S₃ (the square of the size of a finite 3-dimensional set is at most the product of the sizes of its three 2-dimensional projections)
◮ also holds for Shannon entropies; a special case of the Shearer lemma
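The combinatorial form is easy to test on random data; a small sketch:

```python
import random

# V^2 <= S1 * S2 * S3 for a finite set of 3-d points and its three projections.
V = {(random.randrange(8), random.randrange(8), random.randrange(8))
     for _ in range(150)}
S1 = {(y, z) for x, y, z in V}
S2 = {(x, z) for x, y, z in V}
S3 = {(x, y) for x, y, z in V}
assert len(V) ** 2 <= len(S1) * len(S2) * len(S3)
print(len(V) ** 2, len(S1) * len(S2) * len(S3))
```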




Common information and graph minors
◮ mutual information: I(a : b) = C(a) + C(b) − C(a, b)
◮ common information:
◮ combinatorial version, in terms of graph minors: can the graph be covered by 2^δ minors of size 2^(α−δ) × 2^(β−δ)?







Almost uniform sets
◮ nonuniformity = (maximal section) / (average section)
◮ Theorem: every set of N elements can be represented as a union of polylog(N) sets whose nonuniformity is polylog(N)
◮ there is a multidimensional version
◮ how to construct the parts using Kolmogorov complexity: take strings with given complexity bounds
◮ the argument is so simple that it is not clear what the combinatorial translation is
◮ but a combinatorial argument exists (and gives an even stronger result)
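A one-dimensional toy version of the idea (a sketch under the assumption that "section" means a row of a subset of X × Y; this is not the multidimensional theorem itself): bucket whole rows by the rounded logarithm of their size, so each bucket has nonuniformity at most 2 and there are only O(log N) buckets.

```python
import random
from collections import defaultdict

A = {(random.randrange(200), random.randrange(5000)) for _ in range(30000)}

rows = defaultdict(set)
for x, y in A:
    rows[x].add(y)

# Group whole rows by the bit length of their size: sizes inside one bucket
# differ by at most a factor of 2, hence nonuniformity <= 2 per bucket.
buckets = defaultdict(list)
for x, ys in rows.items():
    buckets[len(ys).bit_length()].append(len(ys))

for sizes in buckets.values():
    assert max(sizes) / (sum(sizes) / len(sizes)) <= 2
print(len(buckets), "parts for", len(A), "elements")
```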







Shannon coding theorem
◮ ξ is a random variable with k values and probabilities p₁, …, p_k
◮ ξ^N: N independent trials of ξ
◮ Shannon’s informal question: how many bits are needed to encode a “typical” value of ξ^N?
◮ Shannon’s answer: NH(ξ), where H(ξ) = p₁ log(1/p₁) + … + p_k log(1/p_k)
◮ the formal statement is a bit complicated
◮ complexity version: with high probability the value of ξ^N has complexity close to NH(ξ)
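A rough empirical illustration (not the theorem: zlib only provides an upper bound on the complexity of one particular sample):

```python
import random
import zlib
from math import log2

# N i.i.d. biased bits, entropy H per bit: Shannon (and the complexity version)
# says a typical outcome needs about N*H bits; zlib on the packed bits gives a
# concrete upper bound that is far below N and not far above N*H.
p, N = 0.1, 100_000
H = p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))        # about 0.469 bits per bit
bits = [int(random.random() < p) for _ in range(N)]
packed = bytes(sum(b << i for i, b in enumerate(bits[j:j + 8]))
               for j in range(0, N, 8))

compressed_bits = 8 * len(zlib.compress(packed, 9))
print(f"N*H = {N * H:.0f} bits, zlib = {compressed_bits} bits, trivial = {N} bits")
assert N * H < compressed_bits < N
```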







Complexity, entropy and group size
◮ 2 C(x, y, z) ≤ C(x, y) + C(y, z) + C(x, z) + O(log)
◮ the same for entropy: 2 H(ξ, η, τ) ≤ H(ξ, η) + H(ξ, τ) + H(η, τ)
◮ …and even for the sizes of subgroups U, V, W of some finite group G: 2 log(|G| / |U ∩ V ∩ W|) ≤ log(|G| / |U ∩ V|) + log(|G| / |U ∩ W|) + log(|G| / |V ∩ W|)
◮ in all three cases the valid inequalities are the same (Romashchenko, Chan, Yeung)
◮ some of them are quite strange: I(a : b) ≤ I(a : b | c) + I(a : b | d) + I(c : d) + I(a : b | e) + I(a : e | b) + I(b : e | a)
◮ related to Romashchenko’s theorem: if the last three terms are zeros, one can extract the common information from a, b, e
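The entropy form of the first inequality can be checked numerically for any joint distribution; a short sketch with a random one:

```python
import random
from math import log2

def H(dist):
    # Shannon entropy of a distribution given as {outcome: probability}.
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Random joint distribution of three variables on {0,1,2}^3.
raw = {(x, y, z): random.random()
       for x in range(3) for y in range(3) for z in range(3)}
total = sum(raw.values())
pxyz = {k: v / total for k, v in raw.items()}

def marginal(keep):
    m = {}
    for outcome, p in pxyz.items():
        key = tuple(outcome[i] for i in keep)
        m[key] = m.get(key, 0.0) + p
    return m

lhs = 2 * H(pxyz)
rhs = H(marginal((0, 1))) + H(marginal((1, 2))) + H(marginal((0, 2)))
assert lhs <= rhs + 1e-9
print(lhs, rhs)
```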

Muchnik and Slepian–Wolf
◮ a, b: two strings
◮ we look for a program p that maps a to b
◮ by definition C(p) is at least C(b|a), but it could be higher
◮ there exists a p mapping a to b that is simple relative to b, e.g., “map everything to b”
◮ Muchnik’s theorem: the two conditions can be combined: there exists p mapping a to b such that C(p) ≤ C(b|a) + O(log) and C(p|b) = O(log)
◮ information theory analog: Slepian–Wolf
◮ a similar technique was developed by Fortnow and Laplante (randomness extractors)
◮ (Romashchenko, Musatov): how to use explicit extractors and derandomization to get space-bounded versions


