  1. Computationally efficient probabilistic inference with noisy threshold models based on a CP tensor decomposition
     Jirka Vomlel and Petr Tichavský
     Institute of Information Theory and Automation (ÚTIA), Academy of Sciences of the Czech Republic

  6. Contents
     • Motivation
     • Noisy threshold models
     • CP-decomposition of conditional probability tables
     • Experiments
     • Conclusions

  12. Quick Medical Reference - Decision Theoretic (QMR-DT), Miller et al. (1986) and Shwe et al. (1991)
     • 570 diseases in the first level
     • 4075 observations in the second level
     • all variables are binary
     • conditional probability tables are noisy-or models
     [Figure: two-level network with disease nodes $X_1, \ldots, X_6$ and observation nodes $Y_1, Y_2$]
     Definition (The inference task): Given a subset of observations (e.g. $Y_1$ and $Y_2$), compute the probabilities of diseases, e.g. $P(X_i \mid Y_1 = y_1, Y_2 = y_2)$ for $i = 1, \ldots, 6$.

  15. Noisy threshold - a generalization of noisy-or
     [Figure: parents $X_1, \ldots, X_k$, auxiliary nodes $X'_1, \ldots, X'_k$, and child $Y$]
     Y takes value 1 if at least ℓ out of k parents take value 1:
     $$P(Y = 1 \mid X'_1 = x'_1, \ldots, X'_k = x'_k) = \begin{cases} 1 & \text{if } x'_1 + \ldots + x'_k \geq \ell \\ 0 & \text{otherwise.} \end{cases}$$
     Noise: for $i = 1, \ldots, k$
     $$P(X'_i = 1 \mid X_i = x_i) = \begin{cases} 0 & \text{if } x_i = 0 \\ \pi_i & \text{otherwise.} \end{cases}$$
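The two definitions above combine by summing over the auxiliary configurations $x'$. The following is a minimal brute-force sketch of that sum, not the authors' inference method; the function name `noisy_threshold` and the numeric values of `pi` are hypothetical and chosen only for illustration.

```python
from itertools import product


def noisy_threshold(x, pi, ell):
    """Return P(Y = 1 | X = x) by enumerating all auxiliary configurations x'.

    x   : tuple of 0/1 parent values (x_1, ..., x_k)
    pi  : tuple of noise parameters (pi_1, ..., pi_k)
    ell : threshold; Y = 1 iff at least ell auxiliary nodes equal 1
    """
    k = len(x)
    total = 0.0
    for x_aux in product((0, 1), repeat=k):
        prob = 1.0
        for x_i, xa_i, pi_i in zip(x, x_aux, pi):
            # P(X'_i = 1 | X_i = x_i) is 0 if x_i = 0 and pi_i otherwise.
            p_one = pi_i if x_i == 1 else 0.0
            prob *= p_one if xa_i == 1 else 1.0 - p_one
        if sum(x_aux) >= ell:  # deterministic threshold on the auxiliary nodes
            total += prob
    return total


# Hypothetical parameters. With ell = 1 the model reduces to noisy-or,
# so the value below equals 1 - 0.2 * 0.4 = 0.92 up to rounding.
p_or = noisy_threshold((1, 1, 0), (0.8, 0.6, 0.5), ell=1)
p_two = noisy_threshold((1, 1, 0), (0.8, 0.6, 0.5), ell=2)
```

The enumeration costs $2^k$ steps per query, which is exactly the blow-up that the decomposition discussed later in the talk is meant to avoid.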

  20. An example for $k = 4$, $\ell = 1$, and $\pi_i = 1$, $i = 1, \ldots, k$ - i.e., for the deterministic OR function
     $$\begin{aligned}
     P(Y = 1 \mid X_1 = x_1, \ldots, X_4 = x_4)
     &= \begin{pmatrix}
          \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} & \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \\
          \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} & \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
        \end{pmatrix} \\
     &= \begin{pmatrix}
          \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} & \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \\
          \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} & \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
        \end{pmatrix}
        -
        \begin{pmatrix}
          \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \\
          \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
        \end{pmatrix} \\
     &= (1, 1) \otimes (1, 1) \otimes (1, 1) \otimes (1, 1) - (1, 0) \otimes (1, 0) \otimes (1, 0) \otimes (1, 0) \\
     &= (1, 1)^{\otimes k} - (1, 0)^{\otimes k}
     \end{aligned}$$
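Read entry by entry, the last line of this example is the familiar closed form of OR. The short derivation below is a sketch under one assumption: the first component of each vector is taken to correspond to $x_i = 0$ and the second to $x_i = 1$.

```latex
\begin{aligned}
\big[(1,1)^{\otimes k}\big]_{x_1 \ldots x_k} &= \prod_{i=1}^{k} 1 = 1, \\
\big[(1,0)^{\otimes k}\big]_{x_1 \ldots x_k} &= \prod_{i=1}^{k} (1 - x_i), \\
P(Y = 1 \mid X_1 = x_1, \ldots, X_k = x_k) &= 1 - \prod_{i=1}^{k} (1 - x_i).
\end{aligned}
```

The product in the last line equals 1 only when every $x_i = 0$, so the tensor is 0 at the all-zero index and 1 everywhere else, matching the first matrix in the example.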

  23. Compilation of the threshold model for $\ell = 1$ - the standard approach
     Lauritzen and Spiegelhalter (1988), Jensen et al. (1990), Shafer and Shenoy (1990)
     [Figure: the model with parents $X_1, \ldots, X_4$ and child $Y$, and its compiled form over the same variables]
     The total table size is $2^5 = 32$.

  26. Compilation of the threshold model for $\ell = 1$ - after the suggested decomposition
     Díez and Galán (2002), Vomlel (2002), Savický and Vomlel (2007)
     [Figure: the model with parents $X_1, \ldots, X_4$ and child $Y$, and its compiled form with an auxiliary variable $B$]
     The total table size is $5 \cdot 2^2 = 20$.
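The two table sizes quoted for $k = 4$ can be reproduced with a few lines of arithmetic. This is a sketch only: the function names are hypothetical, and the parameter `r` (number of states of the auxiliary variable $B$) generalizes the slide's figure under the assumption that each of the $k + 1$ variables shares one $2 \times r$ table with $B$.

```python
def full_table_size(k):
    """Size of a single table over Y and its k binary parents."""
    return 2 ** (k + 1)


def decomposed_table_size(k, r=2):
    """Total size of k + 1 tables, each over one binary variable and B.

    Assumption: B has r states, so each table has 2 * r entries.
    """
    return (k + 1) * 2 * r


# Compare both approaches for a few parent counts.
sizes = {k: (full_table_size(k), decomposed_table_size(k)) for k in (4, 10, 20)}
```

For $k = 4$ this gives the pair (32, 20) stated on the slides; the first number grows exponentially in $k$ while the second grows linearly.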

  31. Decomposition of $T(\ell, k)$ into a sum of tensor products
     • $P(Y = 1 \mid X = x)$ can be viewed as a tensor $T(\ell, k)$.
     • All dimensions of $T(\ell, k)$ are equal to 2.
     • $T(\ell, k)$ is symmetric.
     Definition (Symmetric rank): The symmetric rank (srank) is the minimum number $r$ such that
     $$T(\ell, k) = \sum_{i=1}^{r} b_i \cdot a_i^{\otimes k}$$
     where for $i = 1, \ldots, r$:
     • $b_i \in \mathbb{R}$ and
     • $a_i$ are real-valued vectors of length 2.
     • This decomposition is called the Canonical Polyadic (CP), CANDECOMP-PARAFAC (CP), or tensor rank-one decomposition.
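To make the definition concrete, the sketch below rebuilds a tensor from coefficients $b_i$ and vectors $a_i$ and compares it with the threshold tensor. The helper names `symmetric_cp` and `threshold_tensor` are hypothetical, tensors are stored as plain dictionaries keyed by index tuples, and the OR coefficients are the ones from the $k = 4$, $\ell = 1$ example earlier in the talk.

```python
from itertools import product


def symmetric_cp(b, a, k):
    """Evaluate sum_i b[i] * a[i]^{(x)k} entry by entry.

    b : list of real coefficients
    a : list of length-2 vectors (index 0 for x = 0, index 1 for x = 1)
    """
    tensor = {}
    for idx in product((0, 1), repeat=k):
        total = 0
        for b_i, a_i in zip(b, a):
            term = b_i
            for j in idx:
                term *= a_i[j]  # one factor per tensor mode
            total += term
        tensor[idx] = total
    return tensor


def threshold_tensor(ell, k):
    """Deterministic T(ell, k): entry is 1 iff at least ell indices equal 1."""
    return {idx: 1 if sum(idx) >= ell else 0
            for idx in product((0, 1), repeat=k)}


# The two-term decomposition of OR: (1,1)^{(x)k} - (1,0)^{(x)k}.
k = 4
or_cp = symmetric_cp([1, -1], [(1, 1), (1, 0)], k)
is_or = or_cp == threshold_tensor(1, k)
```

Storing only the $r$ coefficients and $r$ length-2 vectors is what keeps the representation small compared with the $2^k$ entries of the full tensor.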

  38. Theoretical results
     Results in the proceedings:
     • $\mathrm{srank}(T(0, k)) = 1$.
     • $\mathrm{srank}(T(k, k)) = 1$.
     • $\mathrm{srank}(T(1, k)) = 2$.
     • $\mathrm{srank}(T(k - 1, k)) = k$.
     • $\mathrm{srank}(T(\ell, k)) \leq k$ for $\ell = 3, \ldots, k - 2$.
     • An algorithm for CP-decomposition to $k$ factors.
     • For the noisy threshold the above values represent upper bounds.
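Two of these results can be illustrated by writing the decompositions out, and the remark about the noisy case follows from a short calculation. The derivation below is a sketch, not taken from the proceedings; it assumes the index convention that the first vector component corresponds to the value 0, and the symbols $\tilde{a}_{j,i}$ are introduced here only for illustration.

```latex
\begin{aligned}
T(0, k) &= (1, 1)^{\otimes k}, \qquad T(k, k) = (0, 1)^{\otimes k}, \\[4pt]
P(Y = 1 \mid X = x)
  &= \sum_{x'} T(\ell, k)_{x'} \prod_{i=1}^{k} P(X'_i = x'_i \mid X_i = x_i) \\
  &= \sum_{j=1}^{r} b_j \prod_{i=1}^{k} \tilde{a}_{j,i}(x_i),
  \qquad
  \tilde{a}_{j,i} = \big(a_{j,0},\; (1 - \pi_i)\, a_{j,0} + \pi_i\, a_{j,1}\big).
\end{aligned}
```

Each rank-one term of the deterministic tensor maps to one rank-one term of the noisy tensor, so the number of terms does not grow. When the $\pi_i$ differ, the vectors $\tilde{a}_{j,i}$ vary across modes, so the terms are rank-one but no longer symmetric.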
