Linear Dimensionality Reduction
Linear Latent Variable Model
◮ Represent data, Y, with a lower-dimensional set of latent variables, X.
◮ Assume a linear relationship of the form
    y_{i,:} = W x_{i,:} + \epsilon_{i,:},
  where
    \epsilon_{i,:} \sim \mathcal{N}(0, \sigma^2 I).
Linear Latent Variable Model
Probabilistic PCA
[Graphical model over X, W, σ² and Y.]
◮ Define linear-Gaussian relationship between latent variables and data:
    p(Y | X, W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | W x_{i,:}, \sigma^2 I)
◮ Standard latent variable approach:
  ◮ Define Gaussian prior over latent space, X:
      p(X) = \prod_{i=1}^{n} \mathcal{N}(x_{i,:} | 0, I)
  ◮ Integrate out latent variables:
      p(Y | W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | 0, W W^\top + \sigma^2 I)
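As a concrete illustration of this construction, the following sketch (not from the original slides; all sizes and names are illustrative) samples from the linear-Gaussian generative model and evaluates the marginal likelihood with the latent variables integrated out, assuming numpy and scipy are available.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, p, q, sigma2 = 200, 5, 2, 0.1      # data points, data dim, latent dim, noise variance

# Generative model: y_{i,:} = W x_{i,:} + eps_{i,:}
X = rng.standard_normal((n, q))       # x_{i,:} ~ N(0, I)
W = rng.standard_normal((p, q))       # linear mapping
Y = X @ W.T + np.sqrt(sigma2) * rng.standard_normal((n, p))

# After integrating out X: y_{i,:} ~ N(0, W W^T + sigma^2 I)
C = W @ W.T + sigma2 * np.eye(p)
log_marginal = multivariate_normal(mean=np.zeros(p), cov=C).logpdf(Y).sum()
print("marginal log-likelihood:", log_marginal)
```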
Computation of the Marginal Likelihood

    y_{i,:} = W x_{i,:} + \epsilon_{i,:},   x_{i,:} \sim \mathcal{N}(0, I),   \epsilon_{i,:} \sim \mathcal{N}(0, \sigma^2 I)

    W x_{i,:} \sim \mathcal{N}(0, W W^\top)

    W x_{i,:} + \epsilon_{i,:} \sim \mathcal{N}(0, W W^\top + \sigma^2 I)
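A quick Monte Carlo check of the covariance identity above (purely illustrative): the sample covariance of W x + ε should approach W W^⊤ + σ²I.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, sigma2, n_samples = 4, 2, 0.25, 200_000

W = rng.standard_normal((p, q))
x = rng.standard_normal((n_samples, q))                      # x ~ N(0, I)
eps = np.sqrt(sigma2) * rng.standard_normal((n_samples, p))  # eps ~ N(0, sigma^2 I)
y = x @ W.T + eps                                            # y = W x + eps

print(np.round(np.cov(y, rowvar=False), 2))                  # empirical covariance
print(np.round(W @ W.T + sigma2 * np.eye(p), 2))             # W W^T + sigma^2 I
```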
Linear Latent Variable Model II
Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999)

    p(Y | W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | 0, C),   C = W W^\top + \sigma^2 I

    \log p(Y | W) = -\frac{n}{2} \log |C| - \frac{1}{2} \mathrm{tr}\left( C^{-1} Y^\top Y \right) + \text{const.}

If U_q are the first q principal eigenvectors of n^{-1} Y^\top Y and the corresponding eigenvalues are \Lambda_q,

    W = U_q L R^\top,   L = \left( \Lambda_q - \sigma^2 I \right)^{\frac{1}{2}},

where R is an arbitrary rotation matrix.
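A minimal sketch of this maximum-likelihood solution, assuming centred data and fixing the arbitrary rotation R to the identity (function and variable names are illustrative):

```python
import numpy as np

def ppca_ml_W(Y, q, sigma2):
    """Maximum-likelihood W for probabilistic PCA, with R = I."""
    n = Y.shape[0]
    S = Y.T @ Y / n                                   # n^{-1} Y^T Y
    eigvals, eigvecs = np.linalg.eigh(S)              # ascending order
    idx = np.argsort(eigvals)[::-1][:q]               # first q principal eigenpairs
    U_q, Lambda_q = eigvecs[:, idx], eigvals[idx]
    L = np.sqrt(np.maximum(Lambda_q - sigma2, 0.0))   # (Lambda_q - sigma^2 I)^{1/2}
    return U_q * L                                    # W = U_q L R^T with R = I

# Toy example: 5-D data generated from a 2-D linear map
rng = np.random.default_rng(2)
Y = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 5)) \
    + 0.1 * rng.standard_normal((500, 5))
Y -= Y.mean(axis=0)
print(ppca_ml_W(Y, q=2, sigma2=0.01).shape)           # (5, 2)
```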
Linear Latent Variable Model III
Dual Probabilistic PCA
[Graphical model over W, X, σ² and Y.]
◮ Define linear-Gaussian relationship between latent variables and data:
    p(Y | X, W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | W x_{i,:}, \sigma^2 I)
◮ Novel latent variable approach:
  ◮ Define Gaussian prior over parameters, W:
      p(W) = \prod_{i=1}^{p} \mathcal{N}(w_{i,:} | 0, I)
  ◮ Integrate out parameters:
      p(Y | X) = \prod_{j=1}^{p} \mathcal{N}(y_{:,j} | 0, X X^\top + \sigma^2 I)
Computation of the Marginal Likelihood

    y_{:,j} = X w_{:,j} + \epsilon_{:,j},   w_{:,j} \sim \mathcal{N}(0, I),   \epsilon_{:,j} \sim \mathcal{N}(0, \sigma^2 I)

    X w_{:,j} \sim \mathcal{N}(0, X X^\top)

    X w_{:,j} + \epsilon_{:,j} \sim \mathcal{N}(0, X X^\top + \sigma^2 I)
Linear Latent Variable Model IV
Dual Probabilistic PCA Max. Likelihood Soln (Lawrence, 2004, 2005)

    p(Y | X) = \prod_{j=1}^{p} \mathcal{N}(y_{:,j} | 0, K),   K = X X^\top + \sigma^2 I

    \log p(Y | X) = -\frac{p}{2} \log |K| - \frac{1}{2} \mathrm{tr}\left( K^{-1} Y Y^\top \right) + \text{const.}

If U'_q are the first q principal eigenvectors of p^{-1} Y Y^\top and the corresponding eigenvalues are \Lambda_q,

    X = U'_q L R^\top,   L = \left( \Lambda_q - \sigma^2 I \right)^{\frac{1}{2}},

where R is an arbitrary rotation matrix.
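The dual solution admits the same kind of sketch, now eigendecomposing p^{-1} Y Y^⊤ to recover the latent positions (again with R = I; names are illustrative):

```python
import numpy as np

def dual_ppca_ml_X(Y, q, sigma2):
    """Maximum-likelihood latent positions X for dual probabilistic PCA, with R = I."""
    p = Y.shape[1]
    S = Y @ Y.T / p                                   # p^{-1} Y Y^T  (n x n)
    eigvals, eigvecs = np.linalg.eigh(S)
    idx = np.argsort(eigvals)[::-1][:q]
    Uq_prime, Lambda_q = eigvecs[:, idx], eigvals[idx]
    L = np.sqrt(np.maximum(Lambda_q - sigma2, 0.0))
    return Uq_prime * L                               # X = U'_q L R^T with R = I

rng = np.random.default_rng(3)
Y = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 8)) \
    + 0.1 * rng.standard_normal((100, 8))
Y -= Y.mean(axis=0)
print(dual_ppca_ml_X(Y, q=2, sigma2=0.01).shape)      # (100, 2)
```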
Linear Latent Variable Model IV
Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999), repeated for comparison

    p(Y | W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | 0, C),   C = W W^\top + \sigma^2 I

    \log p(Y | W) = -\frac{n}{2} \log |C| - \frac{1}{2} \mathrm{tr}\left( C^{-1} Y^\top Y \right) + \text{const.}

If U_q are the first q principal eigenvectors of n^{-1} Y^\top Y and the corresponding eigenvalues are \Lambda_q,

    W = U_q L R^\top,   L = \left( \Lambda_q - \sigma^2 I \right)^{\frac{1}{2}},

where R is an arbitrary rotation matrix.
Equivalence of Formulations
The eigenvalue problems are equivalent
◮ Solution for Probabilistic PCA (solves for the mapping):
    Y^\top Y U_q = U_q \Lambda_q,   W = U_q L R^\top
◮ Solution for Dual Probabilistic PCA (solves for the latent positions):
    Y Y^\top U'_q = U'_q \Lambda_q,   X = U'_q L R^\top
◮ Equivalence follows from
    U_q = Y^\top U'_q \Lambda_q^{-\frac{1}{2}}.
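A numerical check of this equivalence, using the unnormalised matrices Y^⊤Y and YY^⊤ exactly as on the slide (illustrative; eigenvector signs are matched before comparing):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, q = 50, 7, 3
Y = rng.standard_normal((n, p))

# Primal eigenproblem: Y^T Y U_q = U_q Lambda_q
lam, U = np.linalg.eigh(Y.T @ Y)
idx = np.argsort(lam)[::-1][:q]
U_q, Lambda_q = U[:, idx], lam[idx]

# Dual eigenproblem: Y Y^T U'_q = U'_q Lambda_q (same non-zero eigenvalues)
lam_d, U_dual = np.linalg.eigh(Y @ Y.T)
idx_d = np.argsort(lam_d)[::-1][:q]
Uq_prime = U_dual[:, idx_d]

# Equivalence: U_q = Y^T U'_q Lambda_q^{-1/2} (up to a sign per eigenvector)
U_q_from_dual = Y.T @ Uq_prime / np.sqrt(Lambda_q)
signs = np.sign(np.sum(U_q * U_q_from_dual, axis=0))
print(np.allclose(U_q, U_q_from_dual * signs))        # True
```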
Gaussian Processes: Extremely Short Overview
[Figure slides: Gaussian process function plots, with inputs on 0–10 and function values roughly between −6 and 6.]
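As a hedged illustration of a Gaussian process prior (this is not the original figure code), the sketch below draws a few sample functions using an exponentiated quadratic covariance over the same input range.

```python
import numpy as np
import matplotlib.pyplot as plt

def eq_cov(x1, x2, alpha=1.0, lengthscale=1.0):
    """Exponentiated quadratic covariance between two sets of 1-D inputs."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return alpha * np.exp(-0.5 * sq_dist / lengthscale ** 2)

x = np.linspace(0.0, 10.0, 200)
K = eq_cov(x, x, alpha=4.0, lengthscale=1.0) + 1e-8 * np.eye(x.size)  # jitter for stability

rng = np.random.default_rng(5)
samples = rng.multivariate_normal(np.zeros(x.size), K, size=3)        # three GP draws

plt.plot(x, samples.T)
plt.xlabel("x")
plt.ylabel("f(x)")
plt.show()
```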
Non-Linear Latent Variable Model
Dual Probabilistic PCA
◮ Define linear-Gaussian relationship between latent variables and data:
    p(Y | X, W) = \prod_{i=1}^{n} \mathcal{N}(y_{i,:} | W x_{i,:}, \sigma^2 I)
◮ Novel latent variable approach:
  ◮ Define Gaussian prior over parameters, W:
      p(W) = \prod_{i=1}^{p} \mathcal{N}(w_{i,:} | 0, I)
  ◮ Integrate out parameters:
      p(Y | X) = \prod_{j=1}^{p} \mathcal{N}(y_{:,j} | 0, X X^\top + \sigma^2 I)
◮ Inspection of the marginal likelihood shows ...
  ◮ The covariance matrix is a covariance function:
      p(Y | X) = \prod_{j=1}^{p} \mathcal{N}(y_{:,j} | 0, K),   K = X X^\top + \sigma^2 I
  ◮ We recognise it as the 'linear kernel': this is a product of Gaussian processes with linear kernels.
  ◮ Replace the linear kernel with a non-linear kernel for a non-linear model (K = ?).
  ◮ We call this the Gaussian Process Latent Variable Model (GP-LVM).
Non-linear Latent Variable Models
Exponentiated Quadratic (EQ) Covariance
◮ The EQ covariance has the form k_{i,j} = k(x_{i,:}, x_{j,:}), where
    k(x_{i,:}, x_{j,:}) = \alpha \exp\left( -\frac{\| x_{i,:} - x_{j,:} \|_2^2}{2 \ell^2} \right).
◮ It is no longer possible to optimise with respect to X via an eigenvalue problem.
◮ Instead, find gradients with respect to X, \alpha, \ell and \sigma^2 and optimise using conjugate gradients (see the sketch below).
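A compact GP-LVM sketch along these lines: it builds the EQ kernel, forms −log p(Y | X) up to a constant, and optimises X and the log-hyperparameters with scipy's conjugate-gradient routine. For brevity it relies on scipy's finite-difference gradients rather than the analytic gradients used in practice; all names and settings are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def eq_kernel(X, alpha, lengthscale):
    """Exponentiated quadratic kernel matrix for latent points X (n x q)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return alpha * np.exp(-0.5 * sq / lengthscale ** 2)

def neg_log_lik(params, Y, q):
    """-log p(Y | X) for a GP-LVM with an EQ kernel, up to an additive constant."""
    n, p = Y.shape
    X = params[: n * q].reshape(n, q)
    log_alpha, log_ell, log_sigma2 = params[n * q:]
    K = eq_kernel(X, np.exp(log_alpha), np.exp(log_ell)) + np.exp(log_sigma2) * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * p * logdet + 0.5 * np.trace(np.linalg.solve(K, Y @ Y.T))

# Toy data, with a PCA-style initialisation of the latent points
rng = np.random.default_rng(6)
Y = rng.standard_normal((30, 4))
Y -= Y.mean(axis=0)
q = 2
X0 = np.linalg.svd(Y, full_matrices=False)[0][:, :q]
params0 = np.concatenate([X0.ravel(), np.zeros(3)])   # log alpha = log ell = log sigma^2 = 0

res = minimize(neg_log_lik, params0, args=(Y, q), method="CG", options={"maxiter": 50})
X_opt = res.x[: Y.shape[0] * q].reshape(-1, q)
print("optimised latent shape:", X_opt.shape)
```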
Outline
Probabilistic Linear Dimensionality Reduction
Non-Linear Probabilistic Dimensionality Reduction
Examples
Conclusions
Applications
Style-Based Inverse Kinematics
◮ Facilitating animation through modeling human motion (Grochow et al., 2004)
Tracking
◮ Tracking using human motion models (Urtasun et al., 2005, 2006)
Assisted Animation
◮ Generalizing drawings for animation (Baxter and Anjyo, 2006)
Shape Models
◮ Inferring shape (e.g. pose from silhouette) (Ek et al., 2008a,b; Prisacariu and Reid, 2011a,b)
Example: Latent Doodle Space (Baxter and Anjyo, 2006)
Example: Latent Doodle Space (Baxter and Anjyo, 2006)
Generalization with Much Less Data than Dimensions
◮ The powerful uncertainty handling of GPs leads to surprising properties.
◮ Non-linear models can be used where there are fewer data points than dimensions without overfitting.
Prior for Supervised Learning (Urtasun and Darrell, 2007)
◮ We introduce a prior that is based on the Fisher criterion,
    p(X) \propto \exp\left( -\frac{1}{\sigma_d^2} \mathrm{tr}\left( S_w^{-1} S_b \right) \right),
  with S_b the between-class matrix and S_w the within-class matrix,
    S_b = \sum_{i=1}^{L} \frac{n_i}{n} (M_i - M_0)(M_i - M_0)^\top,
    S_w = \sum_{i=1}^{L} \frac{n_i}{n} \frac{1}{n_i} \sum_{k=1}^{n_i} \left( x_k^{(i)} - M_i \right)\left( x_k^{(i)} - M_i \right)^\top,
  where X^{(i)} = [x_1^{(i)}, \dots, x_{n_i}^{(i)}] are the n_i training points of class i, M_i is the mean of the elements of class i, and M_0 is the mean of all the training points of all classes. (A computational sketch follows below.)
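A sketch of evaluating this prior term for a candidate latent configuration, following the formula as written on the slide (class labels are assumed integer-coded; everything here is illustrative):

```python
import numpy as np

def fisher_log_prior(X, labels, sigma_d2=1.0):
    """log p(X), up to a constant, for the prior exp(-tr(S_w^{-1} S_b) / sigma_d^2)."""
    n, q = X.shape
    M0 = X.mean(axis=0)
    S_b = np.zeros((q, q))
    S_w = np.zeros((q, q))
    for c in np.unique(labels):
        Xc = X[labels == c]
        n_c = Xc.shape[0]
        Mc = Xc.mean(axis=0)
        d = (Mc - M0)[:, None]
        S_b += (n_c / n) * (d @ d.T)           # between-class scatter
        D = Xc - Mc
        S_w += (D.T @ D) / n                   # within-class scatter, (n_c/n)(1/n_c) sum_k ...
    return -np.trace(np.linalg.solve(S_w, S_b)) / sigma_d2

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(3.0, 1.0, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)
print(fisher_log_prior(X, labels))
```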
GaussianFace (Lu and Tang, 2014)
◮ First system to surpass human performance on the cropped Labeled Faces in the Wild (LFW) data. http://tinyurl.com/nkt9a38
◮ Lots of feature engineering, followed by a Discriminative GP-LVM.
[Figure 4: ROC curves on LFW. GaussianFace-FE + GaussianFace-BC (98.52%) beats human, cropped (97.53%), DeepFace-ensemble (97.35%), TL Joint Bayesian (96.33%), High-dimensional LBP (95.17%), Fisher Vector Faces (93.03%) and ConvNet-RBM (92.52%).]
[Figure 5: examples of matched and mismatched LFW pairs incorrectly classified by the GaussianFace model.]
Continuous Character Control (Levine et al., 2012)
◮ Graph diffusion prior for enforcing connectivity between motions:
    \log p(X) = \sum_{i,j} w_{ij} \log K^d_{ij}
  with the graph diffusion kernel K^d obtained from
    K^d = \exp(\beta H),   H = -T^{-1/2} L T^{-1/2}
  where L is the graph Laplacian, T is a diagonal matrix with T_{ii} = \sum_j w(x_i, x_j),
    L_{ij} = \begin{cases} \sum_k w(x_i, x_k) & \text{if } i = j \\ -w(x_i, x_j) & \text{otherwise,} \end{cases}
  and w(x_i, x_j) = \| x_i - x_j \|^{-p} measures similarity.
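A sketch of assembling this prior for a set of latent points, using scipy's matrix exponential for the diffusion kernel (all settings are illustrative; the similarities are computed from pairwise distances as in the formula above):

```python
import numpy as np
from scipy.linalg import expm

def diffusion_log_prior(X, beta=1.0, p_exp=2):
    """log p(X) = sum_ij w_ij log K^d_ij with K^d = expm(beta * H), H = -T^{-1/2} L T^{-1/2}."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.zeros((n, n))
    off_diag = ~np.eye(n, dtype=bool)
    W[off_diag] = dist[off_diag] ** (-p_exp)          # w(x_i, x_j) = ||x_i - x_j||^{-p}

    T_diag = W.sum(axis=1)                            # T_ii = sum_j w(x_i, x_j)
    L = np.diag(T_diag) - W                           # graph Laplacian
    T_inv_sqrt = np.diag(1.0 / np.sqrt(T_diag))
    H = -T_inv_sqrt @ L @ T_inv_sqrt
    K_d = expm(beta * H)                              # graph diffusion kernel
    return np.sum(W * np.log(np.maximum(K_d, 1e-12))) # guard against tiny round-off

rng = np.random.default_rng(8)
X = rng.standard_normal((20, 2))
print(diffusion_log_prior(X))
```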
Character Control: Results
GPLVM for Character Animation
◮ Learn a GPLVM from a small mocap sequence.
◮ Pose synthesis is performed by solving an optimization problem (see the sketch after this list):
    \arg\min_{x, y} \; -\log p(y | x) \quad \text{such that} \quad C(y) = 0
◮ These handle constraints may come from a user in an interactive session, or from a mocap system.
◮ Smooth the latent space by adding noise in order to reduce the number of local minima.
◮ Optimization proceeds in an annealed fashion over successively less smoothed versions of the latent space.
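A hedged sketch of that synthesis step, replacing the hard constraint C(y) = 0 with a quadratic penalty and using a toy stand-in for the trained model's −log p(y | x) (in the real system this term comes from the learned GP-LVM; every function below is hypothetical):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_p(y, x):
    # Toy stand-in: in practice this is the GP-LVM predictive negative log-density.
    return 0.5 * np.sum((y - np.array([x[0], x[0] ** 2, x[1]])) ** 2)

def constraint(y):
    # Toy handle constraint, e.g. "first pose coordinate must equal 1.0".
    return np.array([y[0] - 1.0])

def objective(z, penalty=100.0):
    x, y = z[:2], z[2:]                       # 2-D latent point, 3-D pose
    return neg_log_p(y, x) + penalty * np.sum(constraint(y) ** 2)

res = minimize(objective, np.zeros(5), method="BFGS")
x_opt, y_opt = res.x[:2], res.x[2:]
print("constraint residual:", constraint(y_opt))
```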
Application: Replay same motion (Grochow et al., 2004)
Application: Keyframing joint trajectories (Grochow et al., 2004)
Application: Deal with missing data in mocap (Grochow et al., 2004)
Application: Style Interpolation (Grochow et al., 2004)
Shape Priors in Level Set Segmentation
◮ Represent contours with elliptic Fourier descriptors (see the sketch below).
◮ Learn a GPLVM on the parameters of those descriptors.
◮ We can now generate closed contours from the latent space.
◮ Segmentation is done by non-linear minimization of an image-driven energy which is a function of the latent space.
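A small sketch of the contour representation: reconstructing a closed contour from elliptic Fourier coefficients. In the system described above those coefficients would be produced from the GP-LVM latent space; here they are random and purely illustrative.

```python
import numpy as np

def contour_from_efd(coeffs, a0=0.0, c0=0.0, n_points=200):
    """Closed contour from elliptic Fourier coefficients.

    coeffs has shape (n_harmonics, 4), holding (a_k, b_k, c_k, d_k) per harmonic.
    """
    t = np.linspace(0.0, 1.0, n_points)
    x = np.full_like(t, a0)
    y = np.full_like(t, c0)
    for k, (a, b, c, d) in enumerate(coeffs, start=1):
        x += a * np.cos(2 * np.pi * k * t) + b * np.sin(2 * np.pi * k * t)
        y += c * np.cos(2 * np.pi * k * t) + d * np.sin(2 * np.pi * k * t)
    return np.stack([x, y], axis=1)           # closed: the curve returns to its start at t = 1

rng = np.random.default_rng(9)
coeffs = rng.normal(scale=[1.0, 0.3, 0.3, 1.0], size=(4, 4))   # 4 random harmonics
contour = contour_from_efd(coeffs)
print(contour.shape)                           # (200, 2)
```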
GPLVM on Contours [V. Prisacariu and I. Reid, ICCV 2011]
Segmentation Results [V. Prisacariu and I. Reid, ICCV 2011]