Free Component Analysis
Raj Rao, Dept. of Electrical Engg. & Computer Science
www.eecs.umich.edu/~rajnrao
Joint work with Hao Wu
Funding by ONR, DARPA, ARO
Free Component Analysis (FCA): An experiment in pictures
I1 = Hedgehog
I2 = Panda
Mixed Image 1
Mixed Image 2
Mixing Images
This talk: Unmixing Images from Mixed Images (Toy)

Mixing model: M1 = 0.5 I1 + 0.5 I2, M2 = 0.5 I1 − 0.5 I2
Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2, U2 = 1·M1 − 1·M2

Goal of FCA:
• Unmix images with no prior knowledge of
  – the mixing model
  – the image structure, e.g. "what does a typical hedgehog look like?"
• Questions answered in this talk: Can this be done? When? How well? Theory?
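The toy model and its perfect unmixing fit in a few lines of code. Below is a minimal NumPy sketch; the random matrices are placeholders for the two images (hypothetical inputs, not the talk's actual data):

    import numpy as np

    I1 = np.random.rand(256, 256)   # stand-in for image 1 (the hedgehog)
    I2 = np.random.rand(256, 256)   # stand-in for image 2 (the panda)

    M1 = 0.5 * I1 + 0.5 * I2        # mixed image 1
    M2 = 0.5 * I1 - 0.5 * I2        # mixed image 2

    U1 = 1 * M1 + 1 * M2            # = I1: perfect unmixing
    U2 = 1 * M1 - 1 * M2            # = I2: perfect unmixing

    assert np.allclose(U1, I1) and np.allclose(U2, I2)

Of course this "algorithm" assumes the mixing weights are known; FCA's job is to find them blindly.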
Algorithm(s) for FCA

Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2
Strategy: Cast as the optimization problem
  (w1, w2) = arg max f(w1 M1 + w2 M2) subject to w1^2 + w2^2 = 2
Choice of f: ⇐ Important!
An algorithm for FCA

Strategy:
  (w1, w2) = arg max |f(w1 M1 + w2 M2)| subject to w1^2 + w2^2 = 2
Choice of f: ⇐ where the magic happens!
• For an n × m matrix X with σ_i(X) = i-th singular value of X:
  f(X) = (1/n) Σ_i σ_i^4(X) − (1 + n/m) [ (1/n) Σ_i σ_i^2(X) ]^2
• f(X) is the "free" fourth (rectangular) cumulant of X
• Insight: w1 = 1 and w2 = 1 ⇒ success
• Display w1 M1 + w2 M2 ... what do we get?
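Read off the singular-value formula above, the objective is a few lines of NumPy (a sketch, not the talk's reference code):

    import numpy as np

    def free_kurtosis(X):
        # Free fourth (rectangular) cumulant of an n x m matrix X.
        n, m = X.shape
        sv = np.linalg.svd(X, compute_uv=False)   # singular values sigma_i(X)
        m2 = np.sum(sv**2) / n                    # (1/n) sum_i sigma_i^2
        m4 = np.sum(sv**4) / n                    # (1/n) sum_i sigma_i^4
        return m4 - (1 + n / m) * m2**2

Maximizing |free_kurtosis(w1*M1 + w2*M2)| over the constraint circle is then a one-dimensional search.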
Mixed Images vs Original Images
Mixed vs Unmixed Images
Unmixed Image 1 vs Image 1
Zoom into Unmixed Image 1
• Near-perfect unmixing: Did we get lucky?
This talk: Unmixing Images from Mixed Images (Toy)

Mixing model: M1 = 0.5 I1 + 0.5 I2, M2 = 0.5 I1 − 0.5 I2
Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2, U2 = 1·M1 − 1·M2

Goal of FCA:
• Unmix images with no prior knowledge of
  – the mixing model
  – the image structure, e.g. "what does a typical hedgehog look like?"
• Questions answered in this talk: Can this be done? When? How well? Theory?
I1
I2
Mixed Images
Unmixed Images: FCA
Unmixed Images: ICA
Free Component Analysis (FCA): A great but not-perfect unmixing example
I1 = NYC
I2 = Berlin
Mixed Image 1
Mixed Image 2
Mixed vs FCA Unmixed
Unmixed Image 1
Unmixed Image 1: Zoom In
• Great but not near-perfect unmixing
Free Component Analysis (FCA): A not-great-at-all example
I1
I2
Mixed vs FCA Unmixed
• Quiz: How to make FCA great again? (Q: Why does this not work? More in a bit...)
Theory: FCA – Setup

Mixing model: Assume s non-commutative random variables are mixed as
  [y1, ..., ys]^T = A [x1, ..., xs]^T
Covariance matrix: Σ_ij = Σ_ji = φ(x_i x_j^*)
Random matrix connection:
  φ(x_i x_j^*) "=" lim_n E[(1/n) Tr(X_i X_j^*)] ≈ (1/n) Tr(X_i X_j^*)
Theory: FCA – Recovery guarantee

Mixing model: [y1, ..., ys]^T = A [x1, ..., xs]^T
FCA: Find s directions that maximize the absolute value of the free kurtosis + a bit more

Theorem [N. and Wu, '17]: FCA with the free kurtosis objective function perfectly unmixes the signals when
• x1, ..., xs are freely independent
• A is invertible and Σ = I
• the free kurtoses are ≠ 0
• at most one of the signals is an "i.i.d. Gaussian"-like random matrix
Proof: FCA – Recovery guarantee

Orthogonal mixing model: Q = [q1, ..., qs] is an orthogonal matrix. Let
  [y1, ..., ys]^T = Q [x1, ..., xs]^T
Algorithm:
  w_opt = arg max over ||w||_2 = 1 of |κ4(w^T y)|
Claim: Assume |κ4(x1)| > |κ4(x2)| > ... > |κ4(xs)|. Then w_opt = ± q1.
Sketch of proof

Step 1: Change of variables
  κ4(w^T y) = κ4(w^T Q x) = κ4(w̃^T x), where w̃^T = w^T Q
• ||w̃||_2 = 1 as well
  w̃_opt = arg max over ||w̃||_2 = 1 of |κ4(w̃^T x)|
Claim: We are done if we can show that w̃_opt = ± e1
Sketch of Proof – continued

Equivalent optimization problem:
  w̃_opt = arg max over ||w̃||_2 = 1 of |κ4(w̃^T x)|
Expanding the term on the right-hand side:
  κ4(w̃^T x) = κ4(w̃1 x1 + ... + w̃s xs)
Properties of (free) cumulants: if x1 and x2 are (freely) independent
• Additivity: κ_i(x1 + x2) = κ_i(x1) + κ_i(x2)
• Homogeneity: κ_i(c x) = c^i κ_i(x)
• The first cumulant is the mean, the second is the variance, and so on
Sketch of Proof – continued

Properties of (free) cumulants: if x1 and x2 are (freely) independent
• Additivity: κ_i(x1 + x2) = κ_i(x1) + κ_i(x2)
• Homogeneity: κ_i(c x) = c^i κ_i(x)
Expanding the term on the right-hand side:
  κ4(w̃^T x) = κ4(w̃1 x1 + ... + w̃s xs)
            = κ4(w̃1 x1) + ... + κ4(w̃s xs)        by additivity
            = w̃1^4 κ4(x1) + ... + w̃s^4 κ4(xs)    by homogeneity
Sketch of Proof – continued

Expanding the term on the right-hand side:
  κ4(w̃^T x) = Σ_i w̃_i^4 κ4(x_i)
Bounding the absolute kurtosis:
  |κ4(w̃^T x)| ≤ max_i |κ4(x_i)| Σ_i w̃_i^4
             ≤ |κ4(x1)| Σ_i w̃_i^2     since Σ_i w̃_i^4 ≤ Σ_i w̃_i^2 on the sphere
             ≤ |κ4(x1)| · 1            since Σ_i w̃_i^2 = 1
Sketch of Proof – final piece

Upper bound:
  |κ4(w̃^T x)| ≤ |κ4(x1)| ⇒ w̃_opt = ± e1
Change of variables: Since w_opt = Q w̃_opt, we have
  w_opt = arg max over ||w||_2 = 1 of |κ4(w^T y)| = ± q1
Linear unmixing transformation:
  ⇒ w_opt^T y = ± e1^T x = ± x1
• Same problem + spherical + orthogonality constraints ⇒ x2, ..., xs
Sketch of proof: FCA – Recovery guarantee

Key inequality: uses the fact that cumulants of free random variables add + the spherical constraint:
  |κ4(q^T x)| ≤ max(|κ4(x1)|, |κ4(x2)|)
Extremal recovery property:
  arg max over ||q|| = 1 of |κ4(q^T x)| ⇒  q1 = 1, q2 = 0        if |κ4(x1)| > |κ4(x2)|
                                           q1 = 0, q2 = 1        if |κ4(x1)| < |κ4(x2)|
                                           either of the above   if |κ4(x1)| = |κ4(x2)|
• A similar property holds for the other higher-order free cumulants
FCA algorithm: Whitening step

Input: Z = [Z1, ..., Zs]^T ∈ R^{sN×M} where Zi ∈ R^{N×M}.
1. Compute Z̄ = [(1/M) Z1 1_M 1_M^T, ..., (1/M) Zs 1_M 1_M^T]^T.
2. Compute Z̃ = Z − Z̄.
3. Compute the s × s covariance matrix C where, for i, j = 1, ..., s:
   C_ij = (1/N) Tr(Z̃_i Z̃_j^T).
4. Compute the eigenvalue decomposition C = U Λ^2 U^T.
5. Compute Y = ((Λ^{-1} U^T) ⊗ I_N) Z̃.
6. Return Y, Λ and U.
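As a concrete reading of these steps, here is a minimal NumPy sketch (assuming the mixtures arrive as a Python list of N × M arrays; an illustration, not the authors' reference implementation):

    import numpy as np

    def fca_whiten(Zs):
        s = len(Zs)
        N, M = Zs[0].shape
        # Steps 1-2: subtract each matrix's row means, i.e. Z_i (1/M) 1 1^T
        Zc = [Z - Z.mean(axis=1, keepdims=True) for Z in Zs]
        # Step 3: s x s covariance C_ij = (1/N) Tr(Zc_i Zc_j^T)
        C = np.array([[np.trace(Zi @ Zj.T) / N for Zj in Zc] for Zi in Zc])
        # Step 4: eigendecomposition C = U Lam^2 U^T
        lam2, U = np.linalg.eigh(C)
        Lam = np.sqrt(lam2)            # assumes C is nonsingular
        # Step 5: the Kronecker product with I_N just recombines the
        # centered blocks with the scalar weights in Lam^{-1} U^T
        W = np.diag(1.0 / Lam) @ U.T
        Ys = [sum(W[i, j] * Zc[j] for j in range(s)) for i in range(s)]
        return Ys, Lam, U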
FCA algorithm: Optimal orthogonal matrix finding step

Input: Z = [Z1, ..., Zs]^T ∈ R^{sN×M} where Zi ∈ R^{N×M}.
1. Compute Y, Λ, U by whitening Z.
2. Compute
   Ŵ = arg max over W ∈ O(s) of Σ_i |κ̂4((W^T Y)_i)|, where W = W ⊗ I_N
   and
   κ̂4(X) = (1/N) Tr((X X^T)^2) − (1 + N/M) [(1/N) Tr(X X^T)]^2.
3. Compute Â = U Λ Ŵ and X̂ = (Â^{-1} ⊗ I_N) Z, so that Z = (Â ⊗ I_N) X̂.
4. Return Â and X̂.
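For s = 2, the search over O(s) reduces to a one-parameter family of rotations, so a brute-force angle scan already illustrates the idea (a sketch under that assumption; the general algorithm optimizes over O(s) properly and also undoes the whitening via Â):

    import numpy as np

    def fca_rotation_2d(Y1, Y2, free_kurtosis, n_grid=2000):
        # Scan Givens rotations, keeping the angle that maximizes the
        # summed absolute free kurtosis of the two rotated components.
        best_theta, best_val = 0.0, -np.inf
        for theta in np.linspace(0.0, np.pi, n_grid):
            c, s = np.cos(theta), np.sin(theta)
            val = (abs(free_kurtosis(c * Y1 + s * Y2)) +
                   abs(free_kurtosis(-s * Y1 + c * Y2)))
            if val > best_val:
                best_theta, best_val = theta, val
        c, s = np.cos(best_theta), np.sin(best_theta)
        W = np.array([[c, s], [-s, c]])              # the estimated rotation
        return W, (c * Y1 + s * Y2, -s * Y1 + c * Y2)

The returned pair are the unmixed components up to sign and scale; free_kurtosis is the function sketched earlier.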
Asymptotic Freeness

Let (X, φ) be a non-commutative probability space. For each i ∈ I, let X_i ⊂ X be a unital subalgebra. The subalgebras (X_i)_{i∈I} are called freely independent (or simply free) if, for all k ≥ 1,
  φ(x1 ··· xk) = 0 whenever φ(xj) = 0 for all j = 1, ..., k,
and neighboring elements are from different subalgebras, i.e.
  xj ∈ X_{i(j)} with i(1) ≠ i(2), i(2) ≠ i(3), ..., i(k−1) ≠ i(k).
• Analog of E[x1 x2 x3] = 0 whenever E[x1] = 0, E[x2] = 0 and E[x3] = 0
Random matrix connection:
  φ(x_i x_j^*) = lim_n E[(1/n) Tr(X_i X_j^*)] ≈ (1/n) Tr(X_i X_j^*)
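A quick numerical illustration of the random matrix connection (a sketch; independently rotated symmetric matrices are the standard example of asymptotic freeness): for free a, b with φ(a) = φ(b) = 0, the alternating moment φ(abab) is 0, and the matching trace is indeed near zero.

    import numpy as np

    n = 1000
    rng = np.random.default_rng(0)
    d = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])
    A = np.diag(d)                                    # (1/n) Tr A = 0
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal Q
    B = Q @ A @ Q.T                                   # same spectrum, randomly rotated
    print(np.trace(A @ B @ A @ B) / n)                # ≈ 0, matching phi(abab) = 0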
(Asymptotically) Free vs Un-free

Free: singular vectors of the matrix pair are "incoherently related" (as much as they can be) + relaxable
Un-free: singular vectors of the matrix pair are "coherently related"
• Insight: FCA fail/success ⇔ Trump/Clinton vs Berlin/NYC vs Panda/Hedgehog
Making FCA great again
• Insight: Is there a sub-matrix (pair) that is "more" free? Question: Where?
ICA vs FCA

Mixing model: [y1, ..., ys]^T = A [x1, ..., xs]^T
FCA: Find s directions that maximize the absolute value of the free kurtosis + a bit more
ICA: Find s directions that maximize the absolute value of the classical kurtosis + a bit more
• FCA ⇔ matrices
• ICA ⇔ scalars
• Note: both FCA and ICA can be applied to the same data (the two objectives are sketched below)!
• Question: Which unmixes better?
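The two objectives side by side, in the same sketch style as before: ICA scores a direction by the classical fourth cumulant of the entries viewed as scalar samples, FCA by the free fourth cumulant of the same data viewed as a matrix.

    import numpy as np

    def classical_kurtosis(x):
        # Classical fourth cumulant (excess kurtosis) of the entries,
        # viewed as a scalar sample.
        x = np.ravel(x).astype(float)
        x = (x - x.mean()) / x.std()
        return np.mean(x**4) - 3

    # ICA direction score: abs(classical_kurtosis(w1 * M1 + w2 * M2))
    # FCA direction score: abs(free_kurtosis(w1 * M1 + w2 * M2))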
Unmixed Images: FCA vs ICA
ICA vs FCA
ICA vs FCA: FCA wins!
ICA vs FCA: Objective functions
• Insight: scalar κ4 ≈ 0, matrix κ4 ≫ 0 ⇒ the power of the embedding!
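To see the insight numerically, compare the two cumulants on a hypothetical low-rank matrix standing in for an image (real images are not exactly low-rank, but their spectra are similarly structured), reusing classical_kurtosis and free_kurtosis sketched above:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 800))

    print(classical_kurtosis(X))   # modest: entrywise, X looks fairly Gaussian

    sv = np.linalg.svd(X, compute_uv=False)
    Xn = X / np.sqrt(np.sum(sv**2) / X.shape[0])  # scale so (1/n) sum sigma_i^2 = 1
    print(free_kurtosis(Xn))       # large: the spectrum is far from that of an
                                   # i.i.d. Gaussian matrix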
FCA: Free entropy vs free kurtosis maximization
ICA vs FCA
• Insight: FCA with the free entropy objective ≫ FCA with the free kurtosis objective!
Application: FCA to unmix speech
• How to apply FCA to mixed speech signals?
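One natural answer (an assumption sketched here, not necessarily the talk's exact pipeline) is to embed each 1-D recording as a matrix, e.g. by stacking it into frames, and then run the whitening and rotation steps from earlier on the resulting matrices:

    import numpy as np

    def frames(x, frame_len=512):
        # Reshape a 1-D waveform into an N x frame_len "signal matrix".
        n = (len(x) // frame_len) * frame_len
        return x[:n].reshape(-1, frame_len)

    # mixed1, mixed2 = ...  (two mixed recordings; hypothetical inputs)
    # Ys, Lam, U = fca_whiten([frames(mixed1), frames(mixed2)])
    # W, unmixed = fca_rotation_2d(Ys[0], Ys[1], free_kurtosis)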