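PCA Wrap-Up: Projection Perspective II

$B = [b_1, \dots, b_M]$ orthonormal again.

Want to minimize the reconstruction error:

$J_M = \frac{1}{N} \sum_{i=1}^{N} \| x_i - \tilde{x}_i \|^2, \qquad \tilde{x}_i = \sum_{m=1}^{M} z_{mi} b_m$

What are the optimal coordinates $z_i$ for $x_i$? Differentiate w.r.t. a single coordinate $z_{ji}$ using the chain rule:

$\frac{\partial J_M}{\partial z_{ji}} = \frac{\partial J_M}{\partial \tilde{x}_i} \frac{\partial \tilde{x}_i}{\partial z_{ji}}, \qquad \frac{\partial J_M}{\partial \tilde{x}_i} = -\frac{2}{N} (x_i - \tilde{x}_i)^\top, \qquad \frac{\partial \tilde{x}_i}{\partial z_{ji}} = b_j$

So

$\frac{\partial J_M}{\partial z_{ji}} = -\frac{2}{N} \Big( x_i - \sum_{m=1}^{M} z_{mi} b_m \Big)^\top b_j = -\frac{2}{N} \big( x_i^\top b_j - z_{ji} \big)$

because $b_m^\top b_j = 0$ if $m \neq j$ and $b_j^\top b_j = 1$.

Set to 0: $z_{ji} = b_j^\top x_i$.

A similar argument can be made for the choice of basis $B$, yielding again the eigenvectors with the $M$ largest eigenvalues (see reading).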
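
The result above can be checked numerically. The following is a minimal sketch, assuming NumPy, synthetic data, and rows of `X` as data points (so $z_i = B^\top x_i$ becomes `X @ B`); the helper names `optimal_coords` and `reconstruction_error` are hypothetical and chosen only for illustration.

```python
import numpy as np

def optimal_coords(X, B):
    """Coordinates z_i = B^T x_i for every row x_i of X (B has orthonormal columns)."""
    return X @ B

def reconstruction_error(X, B, Z):
    """J_M = (1/N) * sum_i ||x_i - B z_i||^2."""
    return np.mean(np.sum((X - Z @ B.T) ** 2, axis=1))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # synthetic data
X = X - X.mean(axis=0)                                   # center the data

M = 2
S = X.T @ X / len(X)                  # data covariance
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
B_pca = eigvecs[:, -M:]               # eigenvectors of the M largest eigenvalues

# A random orthonormal basis of the same size, for comparison.
B_rand, _ = np.linalg.qr(rng.normal(size=(5, M)))

Z_pca = optimal_coords(X, B_pca)
err_pca = reconstruction_error(X, B_pca, Z_pca)
err_rand = reconstruction_error(X, B_rand, optimal_coords(X, B_rand))

# Moving the coordinates away from B^T x_i increases the error.
err_perturbed = reconstruction_error(X, B_pca, Z_pca + 0.1)
```

With the optimal coordinates and the top-$M$ eigenvectors, the error equals the sum of the discarded eigenvalues, which is why the largest eigenvectors are the ones to keep.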
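
SNE: Stochastic Neighbor Embedding

Map $X$ (very high dim) to $Y$ (very low dim).

Define a conditional probability that encodes similarity:

$p_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)}$

This is in the high-dim space.

Similarly, in the map $Y$:

$q_{j|i} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq i} \exp\left(-\|y_i - y_k\|^2\right)}$

Ideally $q_{j|i} = p_{j|i}$.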
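
As an illustration of the conditional probabilities, here is a minimal NumPy sketch. It assumes one fixed bandwidth `sigma` shared by all points, which is a simplification made here and not something the notes specify; the function name `conditional_probs` is hypothetical.

```python
import numpy as np

def conditional_probs(X, sigma=1.0):
    """Row i holds p_{j|i}: zero on the diagonal, each row sums to 1.

    Assumption: a single sigma for every point (the notes use sigma_i).
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    logits = -sq_dists / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # exclude k = i from the sum
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

# Three points on a line: 0 and 1 are close, 2 is far away.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
P = conditional_probs(X)

# The map-space q_{j|i} has no bandwidth; sigma = sqrt(1/2) gives exp(-||.||^2).
Q = conditional_probs(X[:, :1], sigma=np.sqrt(0.5))
```

Note that `P[1, 2]` and `P[2, 1]` differ: the conditionals are not symmetric, because each row is normalized over a different neighborhood.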
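
Formalize this with the KL divergence:

$KL(q \| p) = \sum_x q(x) \log \frac{q(x)}{p(x)}$

How different is the distribution $q$ from $p$?

Properties:
1. $KL(q \| p) \geq 0$
2. $KL(q \| p) = 0$ iff $q = p$
3. $KL(p \| q) \neq KL(q \| p)$ in general

Cost function:

$C = \sum_i KL(P_i \| Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}$

$P_i$: conditional distribution over all other points $j$ given $i$, in the data space $D$. $Q_i$: the same, but in the map $M$.

To find the map: place the points $y$ to minimize $C$, using gradient descent.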
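
The three properties can be checked on small discrete distributions. This is a minimal sketch assuming NumPy; `kl_divergence` and `sne_cost` are hypothetical names, and the distributions `p` and `r` are made-up examples.

```python
import numpy as np

def kl_divergence(a, b):
    """KL(a || b) = sum_k a_k log(a_k / b_k), with 0 log 0 treated as 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

def sne_cost(P, Q):
    """C = sum_i KL(P_i || Q_i); row i of P and Q is the conditional for point i."""
    return sum(kl_divergence(P[i], Q[i]) for i in range(len(P)))

p = [0.7, 0.2, 0.1]
r = [0.5, 0.4, 0.1]

kl_pp = kl_divergence(p, p)  # zero: identical distributions
kl_pr = kl_divergence(p, r)  # positive
kl_rp = kl_divergence(r, p)  # positive, but a different value than kl_pr
```

Because the divergence is not symmetric, the order of the arguments in the cost matters: $KL(P_i \| Q_i)$ weights each term by $p_{j|i}$.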
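
Symmetric SNE

Instead of the conditionals $p_{i|j}$, $q_{i|j}$, define joint distributions $p_{ij}$, $q_{ij}$:

$q_{ij} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq l} \exp\left(-\|y_k - y_l\|^2\right)}$

with $p_{ij} = p_{ji}$ and $q_{ij} = q_{ji}$.

In the high-dim space, outliers pose a challenge: all distances from an outlier $x_i$ are large, so relative to the shared denominator $p_{ij}$ is small for every $j$, and the position of $y_i$ becomes unimportant to the cost.

Instead, set

$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}$

which ensures $\sum_j p_{ij} > \frac{1}{2n}$ for every point.

Yields a nicer gradient:

$\frac{\partial C}{\partial y_i} = 4 \sum_j (p_{ij} - q_{ij})(y_i - y_j)$

This is symmetric SNE.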
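
The symmetric SNE gradient can be compared against finite differences of the cost. The sketch below assumes NumPy and uses a random row-stochastic matrix as a stand-in for the conditionals $p_{j|i}$; all function names are hypothetical.

```python
import numpy as np

def symmetrize(P_cond):
    """p_ij = (p_{j|i} + p_{i|j}) / (2n); the result sums to 1 over all pairs."""
    n = P_cond.shape[0]
    return (P_cond + P_cond.T) / (2.0 * n)

def joint_q(Y):
    """q_ij = exp(-||y_i - y_j||^2) / sum_{k != l} exp(-||y_k - y_l||^2)."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq)
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def cost(P, Y):
    """C = KL(P || Q) over all pairs i != j."""
    Q = joint_q(Y)
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

def gradient(P, Y):
    """dC/dy_i = 4 * sum_j (p_ij - q_ij) (y_i - y_j)."""
    Q = joint_q(Y)
    diff = Y[:, None, :] - Y[None, :, :]
    return 4.0 * np.sum((P - Q)[:, :, None] * diff, axis=1)

rng = np.random.default_rng(0)
n = 6
P_cond = rng.random((n, n))            # stand-in for p_{j|i} (assumption)
np.fill_diagonal(P_cond, 0.0)
P_cond /= P_cond.sum(axis=1, keepdims=True)

P = symmetrize(P_cond)
Y = rng.normal(scale=0.5, size=(n, 2))  # random initial map
G = gradient(P, Y)

# One plain gradient-descent step on the map points.
Y_next = Y - 0.1 * G
```

A full optimizer would repeat the last step many times; only a single step is shown here.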

The crowding problem

Not enough space in lower dims.

In t-SNE we model the joint probability distribution $q_{ij}$ using a Student-t distribution, which has heavier tails.

A moderate distance in $X$ mapping to a big distance in $Y$ is OK: the model does not force moderate distances in $X$ to yield small distances in $Y$.
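
To see what "heavier tails" buys, the sketch below compares a Gaussian kernel with a Student-t kernel. It assumes the one-degree-of-freedom form $(1 + d^2)^{-1}$ from the standard t-SNE formulation, which these notes do not spell out; the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(dist):
    """Unnormalized Gaussian similarity exp(-d^2)."""
    return np.exp(-dist ** 2)

def student_t_kernel(dist):
    """Unnormalized Student-t similarity 1 / (1 + d^2) (assumed: one degree of freedom)."""
    return 1.0 / (1.0 + dist ** 2)

def joint_q_student_t(Y):
    """q_ij proportional to 1 / (1 + ||y_i - y_j||^2), normalized over all pairs i != j."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = 1.0 / (1.0 + sq)
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

# How much more similar is a near pair than a far pair under each kernel?
near, far = 1.0, 4.0
gaussian_ratio = gaussian_kernel(near) / gaussian_kernel(far)
student_ratio = student_t_kernel(near) / student_t_kernel(far)

Y = np.random.default_rng(0).normal(size=(6, 2))
Q = joint_q_student_t(Y)
```

The Gaussian ratio is several orders of magnitude larger than the Student-t ratio, so under the Student-t kernel a pair placed far apart in the map still keeps non-negligible similarity.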

Autoencoders

Design a network that consumes $x$ and then reconstructs $\hat{x}$.

Encode: 784 -> 200 -> 20 (bottleneck)
Decode: 20 -> 200 -> 784

Simplest version:

$\min_{W, V} \sum_i L(x_i, \hat{x}_i), \qquad L(x, \hat{x}) = \|x - \hat{x}\|^2$

$z = W x, \qquad \hat{x} = V z$

with $W$: $m \times d$, $x$: $d \times 1$, $z$: $m \times 1$, $V$: $d \times m$.

Note: this is just linear dimensionality reduction and should look familiar.

More next time.
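
The simplest version can be trained with plain gradient descent. The following is a minimal NumPy sketch on synthetic data with rows as data points (so $z = Wx$ becomes `X @ W.T`); the sizes, learning rate, step count, and initialization are assumptions picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 2, 200                       # input dim, bottleneck dim, sample count
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))
X = X - X.mean(axis=0)                    # center
X = X / X.std()                           # rescale so a fixed step size is safe

W = 0.1 * rng.normal(size=(m, d))         # encoder: z = W x
V = 0.1 * rng.normal(size=(d, m))         # decoder: x_hat = V z

def loss(W, V):
    """Mean squared reconstruction error (1/n) * sum_i ||x_i - V W x_i||^2."""
    X_hat = X @ W.T @ V.T
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))

initial_loss = loss(W, V)
lr = 0.01
for _ in range(2000):
    Z = X @ W.T                           # codes, shape (n, m)
    R = Z @ V.T - X                       # residuals x_hat - x, shape (n, d)
    grad_V = 2.0 / n * R.T @ Z            # dL/dV, shape (d, m)
    grad_W = 2.0 / n * (R @ V).T @ X      # dL/dW, shape (m, d)
    V -= lr * grad_V
    W -= lr * grad_W
final_loss = loss(W, V)

# Reference: PCA error with m components = sum of the d - m smallest eigenvalues.
eigvals = np.linalg.eigvalsh(X.T @ X / n)
pca_loss = eigvals[:-m].sum()
```

No linear encoder and decoder pair with an $m$-dimensional bottleneck can beat `pca_loss`, which is the sense in which this model "should look familiar".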