Rank of tensors of ℓ-out-of-k functions: an application in probabilistic inference
Jiří Vomlel
Institute of Information Theory and Automation (ÚTIA), Academy of Sciences of the Czech Republic
Contents
• The computer game of Minesweeper
• Probabilistic reasoning given evidence (using a simple example)
• Improving the computational efficiency
• Rank-one decomposition of probability tables representing addition
• Results of experiments
The game of Minesweeper
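Bayesian network for the game of Minesweeper
[Figure: game grid with an observed cell ℓ and unknown cells "?", and the corresponding network with node Y and parents X_1, X_2, X_3.]

$$P(Y = \ell \mid X_1 = x_1, X_2 = x_2, X_3 = x_3) = \begin{cases} 1 & \text{if } \ell = x_1 + x_2 + x_3 \\ 0 & \text{otherwise.} \end{cases}$$

$$P(X_i) = \frac{r}{s \cdot t - o}$$

where r is the number of mines, o is the number of observations, and s, t are the dimensions of the game grid.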
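The deterministic table and the prior above can be written down directly. The following is a minimal sketch; the function names and the board parameters (a 9 × 9 grid with 10 mines and one observation) are illustrative assumptions, not values from the talk.

```python
from itertools import product

def cpt_addition(k):
    """Return P(Y = l | x_1..x_k) as a dict keyed by (l, xs)."""
    table = {}
    for xs in product((0, 1), repeat=k):
        for l in range(k + 1):
            # Deterministic addition: all mass sits on l = x_1 + ... + x_k.
            table[(l, xs)] = 1.0 if l == sum(xs) else 0.0
    return table

def mine_prior(r, s, t, o):
    """Prior r / (s*t - o): r mines, s-by-t grid, o observations."""
    return r / (s * t - o)

cpt = cpt_addition(3)
prior = mine_prior(r=10, s=9, t=9, o=1)  # hypothetical board, not from the talk
```

For every parent configuration the entries over ℓ sum to one, so each row of the table is a valid conditional distribution.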
Bayes rule for updating probabilities
• Assume we observe Y = ℓ.
• We compute by Bayes rule

$$P(X_1 = x_1, X_2 = x_2, X_3 = x_3 \mid Y = \ell) = \frac{P(Y = \ell \mid X_1 = x_1, X_2 = x_2, X_3 = x_3) \cdot \prod_{i=1}^{3} P(X_i = x_i)}{P(Y = \ell)} \propto P(Y = \ell \mid X_1 = x_1, X_2 = x_2, X_3 = x_3)$$

• This is a probability table over 3 binary variables X_1, X_2, X_3:

$$P(Y = \ell \mid X_1 = x_1, X_2 = x_2, X_3 = x_3) = \begin{cases} 1 & \text{if } x_1 + x_2 + x_3 = \ell \\ 0 & \text{otherwise} \end{cases} = \psi(X_1 = x_1, X_2 = x_2, X_3 = x_3).$$
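The update can be checked by brute force on the three-variable example. This is a sketch under the assumption that every X_i has the same prior probability p of being a mine; the value p = 0.125 and the function name are illustrative only.

```python
from itertools import product

def posterior(l, p, k=3):
    """P(x_1..x_k | Y = l) for independent priors P(X_i = 1) = p."""
    joint = {}
    for xs in product((0, 1), repeat=k):
        likelihood = 1.0 if sum(xs) == l else 0.0
        prior = 1.0
        for x in xs:
            prior *= p if x == 1 else 1.0 - p
        joint[xs] = likelihood * prior
    z = sum(joint.values())  # normalising constant, i.e. P(Y = l)
    return {xs: v / z for xs, v in joint.items()}

post = posterior(l=1, p=0.125)
# Marginal posterior probability that X_1 holds the mine.
marg_x1 = sum(v for xs, v in post.items() if xs[0] == 1)
```

On the configurations with x_1 + x_2 + x_3 = ℓ the product of priors is the same constant, which is why the posterior is proportional to the likelihood table alone.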
Tensors of ℓ-out-of-k functions
We can visualize the probability table ψ as a tensor (for ℓ = 1), shown here as its two 2 × 2 slices:

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

In this talk all tensors are functions from $\{0, 1\}^k$ to real numbers.
We are interested in tensors of ℓ-out-of-k functions $f_\ell(x_1, \dots, x_k)$, where:
• ℓ is the observed state of Y and
• k is the number of binary variables that are parents of Y.

$$f_\ell(x_1, \dots, x_k) = \begin{cases} 1 & \text{if } \ell = \sum_{i=1}^{k} x_i \\ 0 & \text{otherwise.} \end{cases}$$

In our example ℓ = 1 and k = 3.
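To make the tensor picture concrete, the sketch below builds the table of an ℓ-out-of-k function and lists the two slices of the ℓ = 1, k = 3 example. Slicing along the third variable is an arbitrary choice made here for display; the names are illustrative.

```python
from itertools import product
from math import comb

def l_out_of_k(l, k):
    """Table of f_l over {0,1}^k: 1 where exactly l inputs equal 1."""
    return {xs: int(sum(xs) == l) for xs in product((0, 1), repeat=k)}

f = l_out_of_k(1, 3)

# slices[x3][x1][x2]: one 2-by-2 matrix for each value of the third variable.
slices = [
    [[f[(x1, x2, x3)] for x2 in (0, 1)] for x1 in (0, 1)]
    for x3 in (0, 1)
]

nonzeros = sum(f.values())  # number of entries equal to 1
```

The number of ones in the table is the binomial coefficient C(k, ℓ), here C(3, 1) = 3.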
Combining information
[Figure: game grid with two observed cells, 0 and 1, and unknown cells "?"; the corresponding network has nodes Y_1, Y_2 and X_1, ..., X_6.]

$$\xi(X_1, \dots, X_6) = \psi(X_1, \dots, X_3) \cdot \varphi(X_1, X_2, X_4, \dots, X_6)$$

Total table size is $2^3 + 2^5 = 8 + 32 = 40$.
A more efficient way of combining information
[Figure: network over X_1, ..., X_6.]

$$\xi(X_1, \dots, X_6) = \psi_1(X_1) \cdot \ldots \cdot \psi_3(X_3) \cdot \varphi_1(X_1, X_2, X_4, \dots, X_6)$$

Total table size is $3 \cdot 2 + 2^5 = 6 + 32 = 38$.
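The factorisation of ψ into one-variable tables can be checked directly. The sketch assumes that ψ belongs to the cell showing 0, so that it is the 0-out-of-3 table; this reading of the figure is an assumption, and the factor `psi_i` is a hypothetical choice.

```python
from itertools import product

def psi_zero(xs):
    """0-out-of-k table: 1 only when none of the neighbours is a mine."""
    return 1.0 if sum(xs) == 0 else 0.0

def psi_i(x):
    """Hypothetical one-variable factor: indicator of x = 0."""
    return 1.0 if x == 0 else 0.0

# Compare the full table with the product of the three unary factors.
factorises = all(
    psi_zero(xs) == psi_i(xs[0]) * psi_i(xs[1]) * psi_i(xs[2])
    for xs in product((0, 1), repeat=3)
)
```

Three tables of two entries each replace one table of eight entries, which accounts for the 3 · 2 term in the size count.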
An even more efficient way of combining information
[Figure: network over X_1, ..., X_6 with an auxiliary node B_2.]

$$\xi(X_1, \dots, X_6) = \psi_1(X_1) \cdot \ldots \cdot \psi_3(X_3) \cdot \sum_{B_2} \varphi_1(B_2, X_1) \cdot \varphi_2(B_2, X_2) \cdot \varphi_4(B_2, X_4) \cdots \varphi_6(B_2, X_6)$$

Since $B_2$ is binary, the total table size is $3 \cdot 2 + 5 \cdot 2^2 = 6 + 20 = 26$.
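The three size counts follow from counting entries of tables over binary variables. A small sketch of the arithmetic, with an illustrative helper name:

```python
def table_size(num_vars, states=2):
    """Number of entries in a table over num_vars variables."""
    return states ** num_vars

# psi over 3 variables and phi over 5 variables, both stored in full.
full = table_size(3) + table_size(5)

# psi replaced by three one-variable tables, phi still stored in full.
psi_factored = 3 * table_size(1) + table_size(5)

# phi replaced by five two-variable tables, one per pair (B_2, X_i).
both_factored = 3 * table_size(1) + 5 * table_size(2)
```

The saving grows quickly with the number of parents, because the full table is exponential in k while the factored form is linear in k for a fixed number of states of the auxiliary variable.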
Tensor rank
We have just seen that

$$\varphi_1(X_1, X_2, X_4, \dots, X_6) = \sum_{B_2} \varphi_1(B_2, X_1) \cdot \varphi_2(B_2, X_2) \cdot \varphi_4(B_2, X_4) \cdots \varphi_6(B_2, X_6).$$

But there is no way we can write

$$\varphi_1(X_1, X_2, X_4, \dots, X_6) = \varphi_1(X_1) \cdot \varphi_2(X_2) \cdot \varphi_4(X_4) \cdots \varphi_6(X_6).$$

What is the minimal number of states of a variable B so that it holds that

$$\psi(X_1, \dots, X_k) = \sum_{B} \prod_{i=1}^{k} \psi_i(B, X_i)\,?$$

This number is called the rank of tensor ψ.
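A sum over a two-state variable B of products of per-variable factors can be evaluated as below. The talk does not say which factors it uses; the sketch relies on one standard limiting construction for the 1-out-of-k table, which is only approximate for ε > 0 (the error is of order ε). The vectors `u`, `v` and the value of `eps` are assumptions made for illustration.

```python
from itertools import product

def rank_two_approx(k, eps):
    """Sum of two rank-one tensors approximating the 1-out-of-k table.

    State b = 0 contributes (+1/eps) * prod_i u[x_i] with u = (1, eps);
    state b = 1 contributes (-1/eps) * prod_i v[x_i] with v = (1, 0).
    """
    u, v = (1.0, eps), (1.0, 0.0)
    table = {}
    for xs in product((0, 1), repeat=k):
        term_u = term_v = 1.0
        for x in xs:
            term_u *= u[x]
            term_v *= v[x]
        table[xs] = (term_u - term_v) / eps
    return table

approx = rank_two_approx(k=5, eps=1e-4)
max_error = max(
    abs(value - (1.0 if sum(xs) == 1 else 0.0))
    for xs, value in approx.items()
)
```

The weights ±1/ε can be absorbed into any one of the factors, which gives the form of a sum over B of a product of two-variable tables over (B, X_i).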
Symmetric rank of tensors of ℓ-out-of-k functions
• Generally, finding the rank of a tensor is NP-hard.
• However, tensors of ℓ-out-of-k functions define a restricted class of tensors.
• These tensors are all symmetric. A tensor ψ is symmetric if

$$\psi(X_1 = x_1, \dots, X_k = x_k) = a_{x_1 + \dots + x_k}$$

where $a = (a_0, \dots, a_k)$ is a vector of real numbers.
• The symmetric rank of tensor ψ is the minimum number of symmetric tensors of rank one that sum up to ψ.
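The definition of a symmetric tensor through the vector a can be illustrated as follows. This is a minimal sketch with an illustrative function name; it builds the table from a and checks that permuting the arguments never changes the value.

```python
from itertools import permutations, product

def symmetric_tensor(a):
    """Table psi(x) = a[x_1 + ... + x_k] for k = len(a) - 1 binary variables."""
    k = len(a) - 1
    return {xs: a[sum(xs)] for xs in product((0, 1), repeat=k)}

# The 1-out-of-3 function corresponds to a = (0, 1, 0, 0).
psi = symmetric_tensor((0, 1, 0, 0))

# Every reordering of the arguments must give the same entry.
is_symmetric = all(
    psi[xs] == psi[perm]
    for xs in psi
    for perm in permutations(xs)
)
```

In this representation an ℓ-out-of-k function is the vector a with a single 1 at position ℓ, so a table with 2^k entries is described by k + 1 numbers.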