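Lecture 4. Random Projection & High Dimensionality (Sep 11, 2017)

Recall: data $X = [x_1, \dots, x_n] \in \mathbb{R}^{p \times n}$; centered data $X_c = XH$, where $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ and $\mathbf{1} = [1, \dots, 1]^T \in \mathbb{R}^n$.

$X_c \approx U_k \Sigma_k V_k^T$ is the best rank-$k$ approximation of $X_c$, where $U_k \in \mathbb{R}^{p \times k}$ and $V_k \in \mathbb{R}^{n \times k}$ are matrices with orthonormal columns and $\Sigma_k = \mathrm{diag}(\sigma_1, \dots, \sigma_k)$ with $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k \ge 0$.

PCA with $k$ projections is given by $U_k^T X_c = \Sigma_k V_k^T \in \mathbb{R}^{k \times n}$; each column gives the new coordinates of one data point.
⇔ Eigenvalue decomposition of the covariance matrix: $\frac{1}{n} X_c X_c^T \approx U_k \Lambda_k U_k^T$, with $\Lambda_k = \frac{1}{n}\Sigma_k^2$.

MDS with a $k$-dimensional representation is given by the data $(\Sigma_k, V_k)$: the coordinates are $\Sigma_k V_k^T \in \mathbb{R}^{k \times n}$.
⇔ Eigenvalue decomposition of the kernel matrix: $K = X_c^T X_c = H X^T X H$ is positive semi-definite, and $K \approx V_k \Sigma_k^2 V_k^T$.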
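The PCA/MDS equivalence recalled here can be checked numerically. The following is a minimal sketch on synthetic data; the sizes, the random data, and the helper name `same_up_to_sign` are assumptions made for illustration and are not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 5, 40, 2                       # toy sizes (assumed)
# Synthetic data with clearly separated variances per coordinate (assumed)
X = rng.normal(size=(p, n)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])[:, None]

H = np.eye(n) - np.ones((n, n)) / n      # centering matrix H = I - (1/n) 1 1^T
Xc = X @ H                               # centered data X_c = X H

# PCA: SVD of X_c, coordinates U_k^T X_c = Sigma_k V_k^T
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_coords = U[:, :k].T @ Xc             # k x n

# MDS: top-k eigenpairs of the kernel matrix K = X_c^T X_c
K = Xc.T @ Xc
evals, evecs = np.linalg.eigh(K)         # eigenvalues in ascending order
idx = np.argsort(evals)[::-1][:k]        # indices of the k largest
mds_coords = np.sqrt(evals[idx])[:, None] * evecs[:, idx].T   # k x n

def same_up_to_sign(A, B, tol=1e-6):
    """Rows of A and B agree up to a sign flip per row."""
    return all(min(np.linalg.norm(a - b), np.linalg.norm(a + b)) < tol
               for a, b in zip(A, B))

print(same_up_to_sign(pca_coords, mds_coords))
```

Eigenvectors are only determined up to sign, which is why the comparison allows a sign flip per coordinate.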
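What about big data of high dimensionality, $p, n \gg 1$?

Big data ($n \gg 1$): in $\frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\mu)(x_i - \hat\mu)^T$, a good approximation is obtained by down-sampling to $n' \ll n$ points, i.e. restricting the sum to a subsample.

High dimensionality ($p \gg 1$): $p$ may be too big to compute $K = X_c^T X_c \in \mathbb{R}^{n \times n}$. Is $K$ easy to approximate? Random projection!

Let $R = \frac{1}{\sqrt{d}}[A_{ij}] \in \mathbb{R}^{d \times p}$ with $d \ll p$, where e.g. $A_{ij} \sim N(0, 1)$. Then $R X_c$ is $d \times n$, and
$K_R = (R X_c)^T (R X_c) = X_c^T R^T R X_c$
is a good approximation of $K$.

Other choices of $A_{ij}$:
• $A_{ij} = +1$ with probability $1/2$, $-1$ with probability $1/2$.
• Sparse, with many zeros: $A_{ij} = +1$ with probability $1/6$, $0$ with probability $2/3$, $-1$ with probability $1/6$ (scaled by $\sqrt{3}$ so that each entry has unit variance).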
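To make the approximation $K_R \approx K$ concrete, here is a small sketch that tries the three choices of $A_{ij}$ on synthetic low-rank data. The function name `random_projection`, the sizes, and the data model are assumptions for illustration; the $1/\sqrt{d}$ scaling is chosen so that $E[R^T R] = I_p$.

```python
import numpy as np

def random_projection(d, p, kind, rng):
    """Return a d x p matrix R = A / sqrt(d) with unit-variance entries A_ij."""
    if kind == "gaussian":
        A = rng.normal(size=(d, p))
    elif kind == "sign":
        A = rng.choice([-1.0, 1.0], size=(d, p))
    elif kind == "sparse":
        # +1, 0, -1 with probabilities 1/6, 2/3, 1/6, scaled by sqrt(3)
        A = np.sqrt(3.0) * rng.choice([1.0, 0.0, -1.0], size=(d, p),
                                      p=[1 / 6, 2 / 3, 1 / 6])
    else:
        raise ValueError(f"unknown kind: {kind}")
    return A / np.sqrt(d)

rng = np.random.default_rng(0)
p, n, d = 3000, 30, 600                  # toy sizes with d << p (assumed)
# Synthetic data: rank-3 signal plus small noise (assumed model)
X = rng.normal(size=(p, 3)) @ rng.normal(size=(3, n)) + 0.1 * rng.normal(size=(p, n))
Xc = X - X.mean(axis=1, keepdims=True)   # same as X H
K = Xc.T @ Xc                            # exact n x n kernel matrix

rel_err = {}
for kind in ("gaussian", "sign", "sparse"):
    R = random_projection(d, p, kind, rng)
    Y = R @ Xc                           # d x n projected data
    K_R = Y.T @ Y                        # X_c^T R^T R X_c
    rel_err[kind] = np.linalg.norm(K_R - K) / np.linalg.norm(K)
    print(f"{kind:8s} relative Frobenius error: {rel_err[kind]:.3f}")
```

The sparse variant leaves about two thirds of the entries of $R$ equal to zero, so the product $R X_c$ needs correspondingly fewer multiplications.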
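Example (Human Genome Diversity Project, HGDP)
http://www.cephb.fr/en/hgdp.panel.php

$n = 1064$ persons, $p = 644{,}258$ SNPs, $X \in \mathbb{R}^{p \times n}$ with
$X_{ij} = 0$ for "AA", $1$ for "AC", $2$ for "CC", and $9$ for missing.

Removing the 21 persons with missing values leaves $X$ of size $644{,}258 \times 1043$.

$R \in \mathbb{R}^{d \times p}$: randomly select $d$ rows (SNPs) of $X$. Then
$R X H = R X_c \approx \tilde U_k \tilde\Sigma_k \tilde V_k^T$, with factors of size $d \times k$, $k \times k$, $k \times n$.

Choices tried: $d = 5\mathrm{k}$, $d = 100\mathrm{k}$, and $d = p$ (all SNPs). In all cases the results are good! Here $k = 2$, and the PCA coordinates are $\tilde\Sigma_k \tilde V_k^T \in \mathbb{R}^{k \times n}$.

Why does it work?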
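The HGDP data are not reproduced here, but the row-subsampling experiment can be imitated on synthetic genotypes. The sketch below is an illustration under assumed settings (three populations, independent SNPs, made-up allele frequencies and sizes), not the lecture's actual computation; it compares the $k = 2$ PCA subspace from all SNPs with the one from a random subset of $d$ SNPs.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_pops, n_per_pop, k, d = 5000, 3, 20, 2, 500   # toy sizes (assumed)
n = n_pops * n_per_pop

# Synthetic genotypes in {0, 1, 2}: each population has its own allele frequencies
freqs = rng.uniform(0.1, 0.9, size=(p, n_pops))
labels = np.repeat(np.arange(n_pops), n_per_pop)
X = rng.binomial(2, freqs[:, labels]).astype(float)   # p x n

def pca_top_k(M, k):
    """Center the columns' mean out of M (rows = SNPs) and return (coords, V_k^T)."""
    Mc = M - M.mean(axis=1, keepdims=True)             # same as M H
    _, s, Vt = np.linalg.svd(Mc, full_matrices=False)
    return s[:k, None] * Vt[:k], Vt[:k]                # Sigma_k V_k^T, V_k^T

coords_full, Vt_full = pca_top_k(X, k)

rows = rng.choice(p, size=d, replace=False)            # randomly select d SNPs
coords_sub, Vt_sub = pca_top_k(X[rows], k)

# Cosines of the principal angles between the two k-dimensional subspaces;
# values near 1 mean the subsample recovers the same 2-D embedding up to rotation.
cosines = np.linalg.svd(Vt_full @ Vt_sub.T, compute_uv=False)
alignment = cosines.min()
print("smallest principal-angle cosine:", alignment)
```

Comparing subspaces through principal angles avoids spurious mismatches when the two leading singular values are close and the individual axes rotate.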
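Johnson-Lindenstrauss Lemma

Let $x_i \in \mathbb{R}^p$, $i = 1, \dots, n$, and $d_{ij} := \|x_i - x_j\|$. Look for a transform $f: x_i \mapsto y_i \in \mathbb{R}^d$, with $d = O(c \log n)$ for a constant $c$ depending on $\varepsilon$, such that with high probability
$1 - \varepsilon \le \frac{\|y_i - y_j\|}{\|x_i - x_j\|} \le 1 + \varepsilon$ for all $i, j$.

This is a uniform $\varepsilon$-isometry: the relative metric distortion is uniformly bounded by $\varepsilon$. Such an $f$ is a random projection!

Johnson and Lindenstrauss (1980's): Lipschitz extension.
Sanjoy Dasgupta and Anupam Gupta; Dimitris Achlioptas.
Computer science: data compression, nearest neighbor search.

Theorem. Given $\varepsilon \in (0, 1)$, $n$, and $\alpha > 0$, let
$k \ge k(\alpha, \varepsilon) = (4 + 2\alpha)\left(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\right)^{-1} \log n$.
Then for any $n$ points $x_1, \dots, x_n \in \mathbb{R}^D$ there exists a map $f: \mathbb{R}^D \to \mathbb{R}^k$ such that for all $x_i, x_j$,
$(1 - \varepsilon)\|x_i - x_j\|^2 \le \|f(x_i) - f(x_j)\|^2 \le (1 + \varepsilon)\|x_i - x_j\|^2$.   (*)

• (*) holds with probability at least $1 - \frac{1}{n^\alpha}$.
• $f$ can be found in randomized polynomial time (random projections), e.g. $f(x) = Rx$ for $x \in \mathbb{R}^D$, where $R = [r_1, \dots, r_k]^T$ and each $r_i \in \mathbb{R}^D$ lies on the $(D-1)$-dimensional unit sphere, e.g. $r_i = a_i / \|a_i\|$ with $a_i \sim N(0, I_D)$ (up to the normalizing factor $\sqrt{D/k}$).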
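The bound $k(\alpha, \varepsilon)$ and the inequality (*) can be tried out numerically. This is a minimal sketch, assuming synthetic Gaussian points and made-up values of $n$, $D$, $\varepsilon$, $\alpha$; the function name `jl_dimension` is hypothetical, and the factor $\sqrt{D/k}$ is the normalization assumed here so that squared lengths are preserved in expectation when the rows $r_i$ have unit norm.

```python
import numpy as np

def jl_dimension(n, eps, alpha):
    """Smallest integer k with k >= (4 + 2*alpha) / (eps^2/2 - eps^3/3) * log(n)."""
    return int(np.ceil((4 + 2 * alpha) / (eps**2 / 2 - eps**3 / 3) * np.log(n)))

rng = np.random.default_rng(0)
n, D, eps, alpha = 50, 1000, 0.5, 1.0    # toy parameters (assumed)
k = jl_dimension(n, eps, alpha)

X = rng.normal(size=(D, n))              # n synthetic points in R^D, one per column

# Rows r_i = a_i / ||a_i|| with a_i ~ N(0, I_D): uniform on the unit sphere
A = rng.normal(size=(k, D))
R = A / np.linalg.norm(A, axis=1, keepdims=True)
Y = np.sqrt(D / k) * (R @ X)             # f(x) = sqrt(D/k) R x, k x n

# Ratio of squared distances after/before projection over all pairs i < j
i, j = np.triu_indices(n, 1)
dist2_X = np.sum((X[:, i] - X[:, j]) ** 2, axis=0)
dist2_Y = np.sum((Y[:, i] - Y[:, j]) ** 2, axis=0)
ratio = dist2_Y / dist2_X

print("k =", k)
print("min ratio:", ratio.min(), "max ratio:", ratio.max())
print("(*) holds for all pairs:", bool(np.all((ratio >= 1 - eps) & (ratio <= 1 + eps))))
```

Note that $k$ depends only on $n$, $\varepsilon$, and $\alpha$, not on the ambient dimension $D$, which is what makes the projection useful when $D$ is very large.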