A safe & computable approximation to Kolmogorov complexity
Peter Bloem, Francisco Mota, Steven de Rooij, Luís Antunes & Pieter Adriaans
preliminaries: Kolmogorov complexity
$U(\bar{\imath}p) = T_i(p)$
$K(x) = \min_p \{ |p| : U(p) = x \}$
- $U$ is a formalisation of the notion of a description
- $K$ is invariant to the choice of $U$ (up to a constant)
motivation
“Kolmogorov is not computable, it’s only of theoretical use”
No, approximations are usually correct
preliminaries: Probabilities & codes
- $L(x)$: (prefix) code-length function
- $p(x)$: probability (semi)measure
- $-\log p(x) = L(x)$
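As a small numerical illustration of the correspondence $-\log p(x) = L(x)$, the sketch below turns a toy distribution into ideal code lengths and checks the Kraft sum. The distribution `p` and the function names are made up for illustration and do not come from the slides.

```python
import math

def code_lengths(p):
    """Ideal (real-valued) code length -log2 p(x) for every outcome with p(x) > 0."""
    return {x: -math.log2(px) for x, px in p.items() if px > 0}

def kraft_sum(lengths):
    """Sum of 2^-L(x); a value of at most 1 means a prefix code with these lengths exists."""
    return sum(2.0 ** -length for length in lengths.values())

# Hypothetical toy distribution with dyadic probabilities.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
L = code_lengths(p)
```

Here `L` assigns 1, 2, 3 and 3 bits to `a`, `b`, `c` and `d`, and `kraft_sum(L)` equals 1, so the lengths correspond to a complete prefix code.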
step 1: computable probabilities
From TMs to probabilities:
- $T(p) = x$
- $p_T(x) = \sum_{p : T(p) = x} 2^{-|p|}$
- $m(x) = p_U(x)$
- equivalent to the lower semicomputable semimeasures
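The step from a machine $T$ to the semimeasure $p_T$ can be sketched on a finite toy example. The table `TOY_T` below is a hypothetical stand-in for a prefix machine (a real Turing machine has infinitely many programs and cannot be tabulated like this); the program `111` is treated as non-halting, which is why the total mass stays below 1.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy machine: a prefix-free table from halting programs to outputs.
# The program "111" is assumed never to halt, so it is absent from the table.
TOY_T = {"0": "a", "10": "b", "110": "a"}

def induced_semimeasure(machine):
    """p_T(x) = sum of 2^-|p| over all programs p with T(p) = x."""
    p = defaultdict(Fraction)
    for program, output in machine.items():
        p[output] += Fraction(1, 2 ** len(program))
    return dict(p)

p_T = induced_semimeasure(TOY_T)
total_mass = sum(p_T.values())
```

With this table, `p_T` gives `a` mass 5/8 and `b` mass 1/4, and `total_mass` is 7/8: a semimeasure rather than a measure.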
step 2: model classes
A model class $C$ is an effectively enumerable subset of all Turing machines.
$U^C(\bar{\imath}p) = T_i(p)$
$K^C(x) = \min_p \{ |p| : U^C(p) = x \}$
$m^C(x) = \sum_{p : U^C(p) = x} 2^{-|p|}$
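To make the three definitions concrete, here is a minimal sketch for a hypothetical model class containing just two finite toy machines. The unary index encoding (`i` ones followed by a zero) is an assumption chosen for simplicity; the slides only require that the index be self-delimiting.

```python
from fractions import Fraction
from math import log2

# Hypothetical model class: two toy machines, each a prefix-free
# table from programs to outputs (not taken from the slides).
MACHINES = [
    {"0": "a", "1": "b"},              # T_0
    {"0": "b", "10": "c", "11": "a"},  # T_1
]

def build_universal(machines):
    """Table for U^C: machine index i is encoded as i ones followed by a zero."""
    table = {}
    for i, machine in enumerate(machines):
        for program, output in machine.items():
            table["1" * i + "0" + program] = output
    return table

def K_C(table, x):
    """Length of the shortest program for x (raises ValueError if x is never produced)."""
    return min(len(p) for p, out in table.items() if out == x)

def m_C(table, x):
    """Total weight 2^-|p| of all programs that produce x."""
    weights = (Fraction(1, 2 ** len(p)) for p, out in table.items() if out == x)
    return sum(weights, Fraction(0))

U_C = build_universal(MACHINES)
```

For this toy class `K_C(U_C, "a")` is 2 while `m_C(U_C, "a")` is 5/16, so `-log2(m_C)` is about 1.68 bits: the sum over all programs for `a` yields a slightly shorter code length than the single shortest program.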
step 3: safe approximation
- $L(x)$: approximating code-length function
- $L(x)$ is safe against $p$ when $p\big(L(x) - K(x) \ge k\big) \le c\,b^{-k}$ for some $c$ and $b > 1$
Is $K^C$ safe against $p \in C$?
no.
[Figure: $K^C(x)$ versus $-\log m^C(x)$]
Is $-\log m^C$ safe against $p \in C$?
yes.
$-\log m^C$ is safe against $m^C$
$$
\begin{aligned}
m^C\big(-\log m^C(x) - K(x) \ge k\big) &= m^C\big(m^C(x) \le 2^{-k} 2^{-K(x)}\big) \\
&= \sum_{x : m^C(x) \le 2^{-k} 2^{-K(x)}} m^C(x) \\
&\le \sum_x 2^{-k} 2^{-K(x)} \\
&= 2^{-k} \sum_x 2^{-K(x)} \le 2^{-k}
\end{aligned}
$$
$-\log m^C$ is safe against members of $C$
$$
m^C(\cdot) = \sum_{q \in C} c_q\, p_q(\cdot) \ge c_q\, p_q(\cdot)
$$
$$
c_q\, p_q\big(-\log m^C(x) - K(x) \ge k\big) \le m^C\big(-\log m^C(x) - K(x) \ge k\big) \le 2^{-k}
$$
Hence $p_q\big(-\log m^C(x) - K(x) \ge k\big) \le c_q^{-1}\, 2^{-k}$, which is the safety condition with $c = 1/c_q$ and $b = 2$.
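Both tail bounds can be checked numerically on a toy mixture. Since $K$ is not computable, the sketch uses a stand-in `K_REF`: any table of lengths with $\sum_x 2^{-K(x)} \le 1$, which is the only property of $K$ the argument uses. The components, weights and lengths are all hypothetical.

```python
from fractions import Fraction

# Hypothetical components p_q over four outcomes, with mixture weights c_q.
COMPONENTS = {
    "q1": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "q2": {"a": Fraction(1, 8), "b": Fraction(1, 8),
           "c": Fraction(1, 4), "d": Fraction(1, 2)},
}
WEIGHTS = {"q1": Fraction(1, 2), "q2": Fraction(1, 4)}

# Stand-in for K(x): lengths satisfying the Kraft inequality (here the sum is exactly 1).
K_REF = {"a": 3, "b": 3, "c": 1, "d": 2}

def mixture(components, weights):
    """m(x) = sum over q of c_q * p_q(x)."""
    m = {}
    for q, p_q in components.items():
        for x, px in p_q.items():
            m[x] = m.get(x, Fraction(0)) + weights[q] * px
    return m

def tail(p, m, k_ref, k):
    """p-probability of the event -log2 m(x) - K_ref(x) >= k, tested exactly as m(x) <= 2^-(k + K_ref(x))."""
    hits = (px for x, px in p.items() if m[x] <= Fraction(1, 2 ** (k + k_ref[x])))
    return sum(hits, Fraction(0))

def check_bounds(components, weights, k_ref, max_k=6):
    """True if the mixture obeys 2^-k and every component obeys 2^-k / c_q, for k < max_k."""
    m = mixture(components, weights)
    for k in range(max_k):
        bound = Fraction(1, 2 ** k)
        if tail(m, m, k_ref, k) > bound:
            return False
        if any(tail(p_q, m, k_ref, k) > bound / weights[q] for q, p_q in components.items()):
            return False
    return True
```

On this example `check_bounds(COMPONENTS, WEIGHTS, K_REF)` returns `True`; for instance, the mixture's tail probability at `k = 1` is 3/16, well under the bound of 1/2.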
can we compute $m^C$?
- We can if it’s upper and lower semicomputable
- lower: dovetail all programs for $U^C$
- upper: dovetail until $(1-s)/s_x \le 2^c - 1$
- If $C$ is complete, this algorithm is computable
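The dovetailing idea can be sketched on a finite toy class. The reading of the stopping rule used here is an assumption: `s` is taken to be the total weight of programs that have halted so far and `s_x` the weight of those that output `x`, so that `s_x <= m(x) <= s_x + (1 - s)` and the rule guarantees the two ends differ by a factor of at most `2**c`. The program table, with made-up halting times, is hypothetical.

```python
from fractions import Fraction

# Hypothetical toy class: (program, steps until it halts, output).
# The programs form a complete prefix-free set, so their weights sum to 1.
TOY_PROGRAMS = [
    ("0", 3, "a"),
    ("10", 1, "b"),
    ("110", 5, "a"),
    ("111", 2, "c"),
]

def approximate_m(x, c, programs=TOY_PROGRAMS):
    """Dovetail until (1 - s) / s_x <= 2^c - 1.

    Returns (s_x, s_x + 1 - s), a lower and an upper bound on m(x),
    or None if no program for x ever halts in this toy table.
    """
    s = Fraction(0)    # weight of all programs halted so far
    s_x = Fraction(0)  # weight of halted programs that output x
    last_step = max(steps for _, steps, _ in programs)
    for t in range(1, last_step + 1):
        # Simulated dovetailing: at step t, every program with halting time t finishes.
        for program, steps, output in programs:
            if steps == t:
                weight = Fraction(1, 2 ** len(program))
                s += weight
                if output == x:
                    s_x += weight
        if s_x > 0 and (1 - s) / s_x <= 2 ** c - 1:
            return s_x, s_x + (1 - s)
    return None
```

With `c = 1`, `approximate_m("a", 1)` stops at step 3 with bounds 1/2 and 5/8, before the program `110` has halted; the true value for this table is 5/8. On a real model class the loop would have no fixed `last_step`, which is where the completeness condition on $C$ matters.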
[Diagram: relations between $K(x)$, $-\log m(x)$, $K^C(x)$ and $\kappa^C(x) = -\log m^C(x)$; edges are labelled "dominates", "bounds", "2-safe" and "unsafe".]
What does this buy us?
- bridge between the practical and the platonic
- Bayesian ↔ MDL ↔ Algorithmic
- corollary: $K^t$
- Additional results: ID, NID
Questions?