
Project Proposal: Prediction by Compression

Lasse Blaauwbroek, Czech Institute for Informatics, Robotics and Cybernetics, Czech Technical University in Prague. AITP 2018, March 30, 2018.


  1. Project Proposal: Prediction by Compression
     Lasse Blaauwbroek
     Czech Institute for Informatics, Robotics and Cybernetics, Czech Technical University in Prague
     AITP 2018, March 30, 2018

  5. Compressor $C$ such that $C(s)$ is the length of the compression of $s$
     $s$ and $t$ share all information $\implies C(st) \approx C(s) + b$
     $s$ and $t$ share no information $\implies C(st) \approx C(s) + C(t)$
     $\mathrm{NCD}_C(s, t) = \dfrac{C(st) - \min(C(s), C(t))}{\max(C(s), C(t))}$
     Under reasonable conditions for $C$, $\mathrm{NCD}_C$ approximates a metric
     [Cilibrasi and Vitanyi 2003], [Li et al. 2004]
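
The NCD formula on this slide can be tried out with any off-the-shelf compressor. Plugging in the two limiting cases gives a value near 0 when $s$ and $t$ share all information (assuming $b$ is small relative to $C(s)$) and a value near 1 when they share none. The sketch below is an illustration only: it assumes Python's `zlib` as a stand-in for $C$, which is not the compressor the talk settles on, and the helper names `compressed_length` and `ncd` are made up for this example.

```python
import zlib


def compressed_length(data: bytes) -> int:
    """C(s): length in bytes of the zlib compression of `data` (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(s: bytes, t: bytes) -> float:
    """Normalized compression distance between s and t."""
    cs, ct = compressed_length(s), compressed_length(t)
    cst = compressed_length(s + t)  # C(st): compress the concatenation
    return (cst - min(cs, ct)) / max(cs, ct)


if __name__ == "__main__":
    a = b"forall x y. x + y = y + x"
    b = b"forall x y. x * y = y * x"
    c = b"the quick brown fox jumps over the lazy dog"
    print(ncd(a, b), ncd(a, c))
```

Real compressors add headers and work with finite windows, so the values are approximate and can fall slightly outside the interval from 0 to 1.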

  8. Let $P$ be the set of valid programs for programming language $L$
     Kolmogorov complexity $K$: $K(s) = \min_{p \in P \,\wedge\, L(p) = s} |p|$
     $\mathrm{NCD}_K(s, t) = \dfrac{K(st) - \min(K(s), K(t))}{\max(K(s), K(t))}$
     $\mathrm{NCD}_K$ is the distance metric: $\forall d, s, t.\ \mathrm{computable}(d) \Rightarrow \mathrm{NCD}_K(s, t) \le d(s, t)$
     [Cilibrasi and Vitanyi 2003], [Li et al. 2004]
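
$K$ cannot be computed, so $\mathrm{NCD}_K$ is the ideal that a compressor-based distance approximates. The compressed length under any real compressor still bounds $K(s)$ from above, up to an additive constant for the decompressor. As a rough illustration, the sketch below takes the shortest output among three standard-library compressors as the tightest such bound available. The choice of compressors and the names `COMPRESSORS`, `compressed_lengths` and `tightest_bound` are assumptions made for this example.

```python
import bz2
import lzma
import zlib

# Assumed set of stand-in compressors; any lossless compressor would do.
COMPRESSORS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lzma.compress,
}


def compressed_lengths(data: bytes) -> dict:
    """Compressed length of `data` in bytes under each compressor."""
    return {name: len(compress(data)) for name, compress in COMPRESSORS.items()}


def tightest_bound(data: bytes):
    """Name and length of the shortest compression: an upper-bound proxy for K."""
    lengths = compressed_lengths(data)
    name = min(lengths, key=lengths.get)
    return name, lengths[name]


if __name__ == "__main__":
    print(tightest_bound(b"a" * 1000))
```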

  9. No domain-specific knowledge necessary!

  14. Problem: Mathematical statements are short
      Compression: Prediction by Partial Matching
      Compress entire proof states
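
The slide names Prediction by Partial Matching (PPM) without giving details. To show the general idea only, the sketch below estimates the ideal code length of a byte string under a toy adaptive context model: it tries the longest context first and pays an escape cost each time it falls back to a shorter one. It assumes the "method C" escape estimate and no symbol exclusion. It is not the compressor used in the project, and the name `ppm_code_length` is hypothetical.

```python
import math
from collections import defaultdict


def ppm_code_length(data: bytes, max_order: int = 2) -> float:
    """Ideal code length of `data` in bits under a toy PPM-style model."""
    # counts[context][symbol]: how often `symbol` has followed `context` so far.
    counts = defaultdict(lambda: defaultdict(int))
    bits = 0.0
    for i, symbol in enumerate(data):
        # Start from the longest available context and escape to shorter ones.
        for order in range(min(max_order, i), -1, -1):
            seen = counts[data[i - order:i]]
            total = sum(seen.values())
            if total == 0:
                continue  # context never seen: encoder and decoder both skip it
            distinct = len(seen)
            if symbol in seen:
                # Symbol probability under method C: count / (total + distinct).
                bits -= math.log2(seen[symbol] / (total + distinct))
                break
            # Escape probability under method C: distinct / (total + distinct).
            bits -= math.log2(distinct / (total + distinct))
        else:
            # No context predicted the symbol: uniform code over 256 byte values.
            bits += 8.0
        # Update the statistics of every context order for this position.
        for order in range(min(max_order, i) + 1):
            counts[data[i - order:i]][symbol] += 1
    return bits


if __name__ == "__main__":
    state = "a ∧ b ¬a ∨ c ¬c ∨ ¬b".encode("utf-8")
    print(ppm_code_length(state), 8 * len(state))
```

Because the model has no header and starts predicting from the first repeated context, its code length drops below 8 bits per byte even on short inputs.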

  19. [Figure: a proof state built from the formulas a ∧ b, ¬a ∨ c, ¬c ∨ ¬b with the literals ¬b, ¬c, c and a, b, ¬a, c; the state is written out as the string “a ∧ b ¬a ∨ c ¬b a b ¬a c” and compared (⇔) with the entries of a database]
      [Kaliszyk, Urban and Vyskočil 2015]
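
Comparing a serialized proof state with a database, followed by the k nearest neighbor evaluation on the next slides, can be sketched as below. This is a minimal illustration under several assumptions: `zlib` stands in for the compressor, each database entry is a pair of a state string and a label, and the prediction is a majority vote among the `k` closest entries. The slides do not say what is stored with each entry, so the labels and the names `predict` and `database` are hypothetical.

```python
import zlib
from collections import Counter
from typing import Sequence, Tuple


def compressed_length(text: str) -> int:
    """Length in bytes of the zlib compression of `text` (assumed compressor)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(s: str, t: str) -> float:
    """Normalized compression distance between two serialized proof states."""
    cs, ct = compressed_length(s), compressed_length(t)
    return (compressed_length(s + t) - min(cs, ct)) / max(cs, ct)


def predict(state: str, database: Sequence[Tuple[str, str]], k: int = 3) -> str:
    """Majority label among the k database entries closest to `state`."""
    # One NCD evaluation per entry: a linear scan over the whole database.
    nearest = sorted(database, key=lambda entry: ncd(state, entry[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical entries: (serialized proof state, label of the step taken).
    database = [
        ("a ∧ b ¬a ∨ c ¬b a b ¬a c", "step-1"),
        ("a ∧ b ¬a ∨ c ¬c a b ¬a", "step-1"),
        ("p ∨ q ¬p ¬q r", "step-2"),
    ]
    print(predict("a ∧ b ¬a ∨ c ¬b a b", database, k=1))
```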

  20. [Plot: percentage (y-axis, 0 to 1) against k nearest neighbor (x-axis, 2 to 10); curves: choice available, best case, compression, random]

  21. [Plot: percentage (y-axis, 0.6 to 0.8) against k nearest neighbor (x-axis, 2 to 10); curves: best case, compression, compression randomized, compression reversed, feature based comparison]

  23. About 30-40 compressions per second
      No vector space: $n$ compressions per prediction
      Idea: Impose structure through an $n$-dimensional lattice
      $S_n = \{ X \subseteq S \mid |X| = n \}$
      $\mathrm{out}(s) = \operatorname{argmax}_{X \in S_n} \dfrac{\sum_{t, u \in X} \mathrm{NCD}(t, u)}{\sum_{t \in X} \mathrm{NCD}(s, t)}$
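
One possible reading of the formula on this slide is a ratio: the summed pairwise distances inside a candidate set $X$, divided by the summed distances from $s$ to the members of $X$. Under that assumption, a brute-force evaluation looks like the sketch below. The function name `lattice_output` is hypothetical, the distance is passed in as a parameter so that any NCD implementation can be used, and pairs are counted once each, which leaves the argmax unchanged if the distance is symmetric and zero on identical inputs.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def lattice_output(
    s: T, points: Sequence[T], n: int, dist: Callable[[T, T], float]
) -> Tuple[T, ...]:
    """Return the n-element subset X of `points` that maximizes the ratio."""

    def score(subset: Tuple[T, ...]) -> float:
        # Numerator: how spread out the members of X are among themselves.
        spread = sum(dist(t, u) for t, u in combinations(subset, 2))
        # Denominator: how far the members of X are from s.
        distance_to_s = sum(dist(s, t) for t in subset)
        return spread / distance_to_s if distance_to_s > 0 else float("inf")

    # Exhaustive search over S_n, the set of all n-element subsets.
    return max(combinations(points, n), key=score)


if __name__ == "__main__":
    # Toy example with numbers and absolute difference instead of NCD.
    print(lattice_output(-1, [0, 4, 10], 2, lambda a, b: abs(a - b)))
```

The search visits every $n$-element subset of $S$, so it is only practical for small inputs.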

  27. Pros
      ⊲ No domain-specific knowledge required
      ⊲ Predictions are competitive
      ⊲ Robust against different representations of proof states
      Cons
      ⊲ Relatively slow
      ⊲ No vector space
      Ideas
      ⊲ Adapt the PPM compressor for tree structures
      ⊲ Impose an $n$-dimensional lattice on the data
      ?
