Free Component Analysis
Raj Rao, Dept. of Electrical Engg. & Computer Science


  1. Free Component Analysis
     Raj Rao, Dept. of Electrical Engg. & Computer Science
     www.eecs.umich.edu/~rajnrao
     Joint work with Hao Wu. Funding by ONR, DARPA, ARO.

  2. Free Component Analysis (FCA): an experiment in pictures

  3. I1 = Hedgehog

  4. I2 = Panda

  5. Mixed Image 1

  6. Mixed Image 2

  7. Mixing Images

  8. This talk: Unmixing Images from Mixed Images (Toy)
     Mixing model: M1 = 0.5 I1 + 0.5 I2, M2 = 0.5 I1 − 0.5 I2
     Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2, U2 = 1·M1 − 1·M2
     Goal of FCA:
     • Unmix images with no prior knowledge of
       – the mixing model
       – the image structure, e.g. "what does a typical hedgehog look like?"
     • Questions answered in this talk: Can this be done? When? How well? Theory?
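
A minimal numpy sketch of this toy model, assuming random arrays as stand-ins for the two images (loading the actual Hedgehog and Panda images is omitted; the names I1, M1, U1 follow the slide):

```python
import numpy as np

rng = np.random.default_rng(0)
I1 = rng.standard_normal((256, 256))  # stand-in for the Hedgehog image
I2 = rng.standard_normal((256, 256))  # stand-in for the Panda image

# Mixing model from the slide
M1 = 0.5 * I1 + 0.5 * I2
M2 = 0.5 * I1 - 0.5 * I2

# Perfect unmixing, available only because the mixing weights are known
U1 = 1 * M1 + 1 * M2  # recovers I1
U2 = 1 * M1 - 1 * M2  # recovers I2

assert np.allclose(U1, I1) and np.allclose(U2, I2)
```

FCA's job is to find the unmixing weights (1, 1) and (1, −1) without being told the mixing model.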

  9. Algorithm(s) for FCA
     Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2
     Strategy: cast as the optimization problem
       w1, w2 = arg max f(w1 M1 + w2 M2) subject to w1² + w2² = 2
     Choice of f: ⇐ Important!

  10. An algorithm for FCA
      Strategy:
        w1, w2 = arg max |f(w1 M1 + w2 M2)| subject to w1² + w2² = 2
      Choice of f: ⇐ where the magic happens!
      • For X an n × m matrix with σ_i(X) the i-th singular value of X:
          f(X) = (1/n) Σ_i σ_i(X)⁴ − (1 + n/m) [ (1/n) Σ_i σ_i(X)² ]²
      • f(X) is the "free" fourth (rectangular) cumulant of X
      • Insight: w1 = 1 and w2 = 1 ⇒ success
      • Display w1 M1 + w2 M2 ... what do we get?
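
The objective is cheap to evaluate from the singular values. Below is a sketch of f together with a brute-force scan of the constraint circle w1² + w2² = 2; the grid search is my stand-in for whatever optimizer the talk actually uses:

```python
import numpy as np

def free_kurtosis(X):
    """Free fourth (rectangular) cumulant f(X) for an n x m matrix X."""
    n, m = X.shape
    s2 = np.linalg.svd(X, compute_uv=False) ** 2  # squared singular values
    return (s2 ** 2).sum() / n - (1 + n / m) * (s2.sum() / n) ** 2

def best_weights(M1, M2, num=720):
    """Grid search over w = sqrt(2) * (cos t, sin t), i.e. w1^2 + w2^2 = 2."""
    ts = np.linspace(0.0, 2 * np.pi, num, endpoint=False)
    t = max(ts, key=lambda t: abs(free_kurtosis(
        np.sqrt(2) * (np.cos(t) * M1 + np.sin(t) * M2))))
    return np.sqrt(2) * np.cos(t), np.sqrt(2) * np.sin(t)
```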

  11. Mixed Images vs Original Images

  12. Mixed vs Unmixed Images

  13. Unmixed Image 1 vs Image 1

  14. Zoom into Unmixed Image 1
      • Near-perfect unmixing: did we get lucky?

  15. This talk: Unmixing Images from Mixed Images (Toy)
      Mixing model: M1 = 0.5 I1 + 0.5 I2, M2 = 0.5 I1 − 0.5 I2
      Perfect-unmixing algorithm: U1 = 1·M1 + 1·M2, U2 = 1·M1 − 1·M2
      Goal of FCA:
      • Unmix images with no prior knowledge of
        – the mixing model
        – the image structure, e.g. "what does a typical hedgehog look like?"
      • Questions answered in this talk: Can this be done? When? How well? Theory?

  16. I1

  17. I2

  18. Mixed Images

  19. Unmixed Images: FCA

  20. Unmixed Images: ICA

  21. Free Component Analysis (FCA): a great but not-perfect unmixing example

  22. I1 = NYC

  23. I2 = Berlin

  24. Mixed Image 1

  25. Mixed Image 2

  26. Mixed vs FCA Unmixed

  27. Unmixed Image 1

  28. Unmixed Image 1: Zoom In
      • Great but not near-perfect unmixing

  29. Free Component Analysis (FCA): a not-great-at-all example

  30. I1

  31. I2

  32. Mixed vs FCA Unmixed
      • Quiz: How to make FCA great again? (Q: Why does this not work? More in a bit...)

  33. Theory: FCA – Setup
      Mixing model: assume s non-commutative random variables are being mixed as
        [y1, ..., ys]^T = A [x1, ..., xs]^T
      Covariance matrix: Σ_ij = Σ_ji = φ(x_i x_j*)
      Random matrix connection:
        φ(x_i x_j*) "=" lim_n E[(1/n) Tr(X_i X_j*)] ≈ (1/n) Tr(X_i X_j*)
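
In simulations the abstract state φ is replaced by the normalized trace, so the covariance of the observed matrices can be estimated directly. A small sketch, assuming the signals are given as a list of real N × M matrices:

```python
import numpy as np

def covariance_estimate(mats):
    """Sigma_ij ≈ (1/N) Tr(X_i X_j^T) for a list of real N x M matrices."""
    N = mats[0].shape[0]
    return np.array([[np.trace(Xi @ Xj.T) / N for Xj in mats]
                     for Xi in mats])
```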

  34. Theory: FCA – Recovery guarantee
      Mixing model:
        [y1, ..., ys]^T = A [x1, ..., xs]^T
      FCA: find s directions that maximize the absolute value of the free kurtosis + a bit more
      Theorem [N. and Wu, '17]: FCA with the free kurtosis objective function perfectly unmixes the signals provided
      • x1, ..., xs are freely independent
      • A is invertible and Σ = I
      • the free kurtoses are ≠ 0
      • at most one "i.i.d. Gaussian"-like random matrix is present

  35. Proof: FCA – Recovery guarantee
      Orthogonal mixing model: Q = [q1 ... qs] is an orthogonal matrix. Let
        [y1, ..., ys]^T = Q [x1, ..., xs]^T
      Algorithm:
        w_opt = arg max_{||w||₂ = 1} |κ4(w^T y)|
      Claim: assume |κ4(x1)| > |κ4(x2)| > ... > |κ4(xs)|; then w_opt = ± q1

  36. Sketch of proof
      Step 1: change of variables. Setting w̃^T = w^T Q gives
        κ4(w^T y) = κ4(w^T Q x) = κ4(w̃^T x)
      • ||w̃||₂ = 1 as well, so
        w̃_opt = arg max_{||w̃||₂ = 1} |κ4(w̃^T x)|
      Claim: we are done if we can show that w̃_opt = ± e1

  37. Sketch of Proof – continued
      Equivalent optimization problem:
        w̃_opt = arg max_{||w̃||₂ = 1} |κ4(w̃^T x)|
      Expanding the term on the right-hand side:
        κ4(w̃^T x) = κ4(w̃1 x1 + ... + w̃s xs)
      Properties of (free) cumulants: if x1 and x2 are (freely) independent
      • Additivity: κ_i(x1 + x2) = κ_i(x1) + κ_i(x2)
      • Homogeneity: κ_i(c x) = c^i κ_i(x)
      • The first cumulant is the mean, the second cumulant is the variance, and so on

  38. Sketch of Proof – continued
      Properties of (free) cumulants: if x1 and x2 are (freely) independent
      • Additivity: κ_i(x1 + x2) = κ_i(x1) + κ_i(x2)
      • Homogeneity: κ_i(c x) = c^i κ_i(x)
      Expanding the term on the right-hand side:
        κ4(w̃^T x) = κ4(w̃1 x1 + ... + w̃s xs)
                   = κ4(w̃1 x1) + ... + κ4(w̃s xs)      by additivity
                   = w̃1⁴ κ4(x1) + ... + w̃s⁴ κ4(xs)    by homogeneity

  39. Sketch of Proof – continued
      Expanding the term on the right-hand side:
        κ4(w̃^T x) = Σ_i w̃_i⁴ κ4(x_i)
      Bounding the absolute kurtosis:
        |κ4(w̃^T x)| ≤ max_i |κ4(x_i)| · Σ_{i=1}^s w̃_i⁴
                    ≤ |κ4(x1)| · Σ_i w̃_i²     since w̃_i⁴ ≤ w̃_i² on the sphere
                    ≤ |κ4(x1)| · 1             since Σ_i w̃_i² = 1

  40. Sketch of Proof – final piece
      Upper bound:
        |κ4(w̃^T x)| ≤ |κ4(x1)|, with equality only at w̃ = ± e1 ⇒ w̃_opt = ± e1
      Change of variables: since w_opt = Q w̃_opt, we have
        w_opt = arg max_{||w||₂ = 1} |κ4(w^T y)| = ± q1
      Linear unmixing transformation:
        w_opt^T y = ± e1^T x = ± x1
      • Same problem + spherical + orthogonality constraints ⇒ recover x2, ...

  41. Sketch of proof: FCA – Recovery guarantee
      Key inequality: uses the fact that cumulants of free random variables add, plus the spherical constraint:
        |κ4(q^T x)| ≤ max(|κ4(x1)|, |κ4(x2)|)
      Extremal recovery property:
        arg max_{||q|| = 1} |κ4(q^T x)| ⇒
          q1 = 1, q2 = 0        if |κ4(x1)| > |κ4(x2)|
          q1 = 0, q2 = 1        if |κ4(x1)| < |κ4(x2)|
          either of the above   if |κ4(x1)| = |κ4(x2)|
      • A similar property holds for other higher-order free cumulants
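
The same two cumulant properties (additivity and fourth-order homogeneity) hold classically for independent scalar random variables, so the extremal property can be sanity-checked with an ordinary Monte Carlo simulation; this scalar check is my illustration, not the talk's free-probability setting:

```python
import numpy as np
from scipy.stats import kurtosis  # excess kurtosis = kappa4 at unit variance

rng = np.random.default_rng(1)
n = 200_000
x1 = rng.laplace(size=n)         # kappa4 = +3 after standardizing
x2 = rng.uniform(-1, 1, size=n)  # kappa4 = -1.2 after standardizing
x1 /= x1.std()
x2 /= x2.std()

# Scan the unit circle q = (cos t, sin t); |kappa4| should peak at q = ±e1
ts = np.linspace(0.0, np.pi, 181)
vals = [abs(kurtosis(np.cos(t) * x1 + np.sin(t) * x2)) for t in ts]
print(ts[np.argmax(vals)])  # ~0 or ~pi, i.e. q1 = ±1, q2 = 0
```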

  42. FCA algorithm: Whitening step
      Input: Z = [Z1, ..., Zs]^T ∈ R^{sN×M} where Z_i ∈ R^{N×M}.
      1. Compute Z̄ = [(1/M) Z1 1_M 1_M^T, ..., (1/M) Zs 1_M 1_M^T]^T.
      2. Compute Z̃ = Z − Z̄.
      3. Compute the s × s covariance matrix C where, for i, j = 1, ..., s:
           C_ij = (1/N) Tr(Z̃_i Z̃_j^T).
      4. Compute the eigenvalue decomposition C = U Λ² U^T.
      5. Compute Y = ((Λ⁻¹ U^T) ⊗ I_N) Z̃.
      6. Return Y, Λ, and U.
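
A sketch of this step in numpy, assuming the input is given as a list of s matrices rather than one stacked sN × M array (the Kronecker product then acts block by block):

```python
import numpy as np

def whiten(Zs):
    """Whitening step: Zs is a list of s observed N x M matrices."""
    s, N = len(Zs), Zs[0].shape[0]
    # Steps 1-2: subtract row means, i.e. Z_i - (1/M) Z_i 1_M 1_M^T
    Zt = [Zi - Zi.mean(axis=1, keepdims=True) for Zi in Zs]
    # Step 3: s x s covariance C_ij = (1/N) Tr(Zt_i Zt_j^T)
    C = np.array([[np.trace(Zi @ Zj.T) / N for Zj in Zt] for Zi in Zt])
    # Step 4: eigendecomposition C = U Lam^2 U^T
    lam2, U = np.linalg.eigh(C)
    Lam = np.sqrt(lam2)
    # Step 5: Y = ((Lam^{-1} U^T) kron I_N) Zt, applied block by block
    W = U.T / Lam[:, None]
    Y = [sum(W[i, j] * Zt[j] for j in range(s)) for i in range(s)]
    return Y, Lam, U
```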

  43. FCA algorithm: Optimal orthogonal matrix finding step
      Input: Z = [Z1, ..., Zs]^T ∈ R^{sN×M} where Z_i ∈ R^{N×M}.
      1. Compute Y, Λ, U by whitening Z.
      2. Compute
           Ŵ = arg max_{W ∈ O(s)} Σ_i |κ4((W̃^T Y)_i)|, where W̃ = W ⊗ I_N,
         and where
           κ4(X) = (1/N) Tr((XX^T)²) − (1 + N/M) ((1/N) Tr(XX^T))².
      3. Compute Â = U Λ Ŵ and X̂ = (Â⁻¹ ⊗ I_N) Z, so that Z = (Â ⊗ I_N) X̂.
      4. Return Â and X̂.
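
For s = 2 the orthogonal group is one-dimensional, so the maximization can be done by scanning a single rotation angle (κ4 is degree-4 homogeneous, so sign flips and reflections change nothing). A brute-force sketch reusing the whiten() helper above; the grid search is my assumption, not the talk's optimizer:

```python
import numpy as np

def free_kurt(X):
    """kappa4(X) = (1/N) Tr((XX^T)^2) - (1 + N/M) ((1/N) Tr(XX^T))^2."""
    N, M = X.shape
    G = X @ X.T
    return np.trace(G @ G) / N - (1 + N / M) * (np.trace(G) / N) ** 2

def fca2(Z1, Z2, num=720):
    """Two-signal FCA via whitening + scan over rotations in O(2)."""
    Y, Lam, U = whiten([Z1, Z2])  # whitening step sketched above

    def score(t):
        c, s = np.cos(t), np.sin(t)
        return (abs(free_kurt(c * Y[0] + s * Y[1])) +
                abs(free_kurt(-s * Y[0] + c * Y[1])))

    t = max(np.linspace(0.0, np.pi / 2, num), key=score)
    c, s = np.cos(t), np.sin(t)
    X1 = c * Y[0] + s * Y[1]     # unmixed components, up to sign and order
    X2 = -s * Y[0] + c * Y[1]
    A_hat = U @ np.diag(Lam) @ np.array([[c, -s], [s, c]])
    return A_hat, [X1, X2]
```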

  44. Asymptotic Freeness
      Let (X, φ) be a non-commutative probability space and fix a positive integer n ≥ 1. For each i ∈ I, let X_i ⊂ X be a unital subalgebra. The subalgebras (X_i)_{i ∈ I} are called freely independent (or simply free) if, for all k ≥ 1,
        φ(x1 ··· xk) = 0 whenever φ(x_j) = 0 for all j = 1, ..., k
      and neighboring elements are from different subalgebras, i.e.
        x_j ∈ X_{i(j)}, with i(1) ≠ i(2), i(2) ≠ i(3), ..., i(k−1) ≠ i(k).
      • Analog of E[x1 x2 x3] = 0 whenever E[x1] = 0, E[x2] = 0, and E[x3] = 0
      Random matrix connection:
        φ(x_i x_j*) = lim_n E[(1/n) Tr(X_i X_j*)] ≈ (1/n) Tr(X_i X_j*)
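
The defining condition can be observed numerically: independent Wigner matrices are asymptotically free, so normalized traces of alternating products of centered matrices should vanish as the dimension grows. A small check, taking φ to be the normalized trace:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

def wigner(n):
    """Symmetric Gaussian (GOE-like) matrix, spectrum roughly in [-2, 2]."""
    G = rng.standard_normal((n, n))
    return (G + G.T) / np.sqrt(2 * n)

X, Y = wigner(n), wigner(n)
phi = lambda A: np.trace(A) / n
Xc = X - phi(X) * np.eye(n)  # centered so that phi(Xc) = 0
Yc = Y - phi(Y) * np.eye(n)
# Freeness predicts phi of alternating products of centered elements -> 0
print(phi(Xc @ Yc), phi(Xc @ Yc @ Xc @ Yc))  # both near 0 for large n
```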

  45. (Asymptotically) Free vs UnFree
      Free: the singular vectors of the matrix pair are "incoherently related" (as much as can be) + relaxable
      UnFree: the singular vectors of the matrix pair are "coherently related"
      • Insight: FCA fail/success ⇔ Trump/Clinton vs. Berlin/NYC vs. Panda/Hedgehog

  46. Making FCA great again
      • Insight: Is there a sub-matrix (pair) that is "more" free? Question: Where?

  47. ICA vs FCA
      Mixing model:
        [y1, ..., ys]^T = A [x1, ..., xs]^T
      FCA: find s directions that maximize the absolute value of the free kurtosis + a bit more
      ICA: find s directions that maximize the absolute value of the classical kurtosis + a bit more
      • FCA ⇔ matrices
      • ICA ⇔ scalars
      • Note: both FCA and ICA can be applied to the same data!
      • Question: which unmixes better?
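
For the scalar baseline, one concrete way to run ICA on the same mixed images is to treat each pixel as an i.i.d. scalar sample; this flattening setup (and the use of scikit-learn's FastICA) is my assumption about how such a comparison could be wired up, not code from the talk:

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_unmix(M1, M2):
    """Unmix two equally sized images by ICA over their pixel values."""
    X = np.stack([M1.ravel(), M2.ravel()], axis=1)  # (num pixels, 2 mixtures)
    S = FastICA(n_components=2, random_state=0).fit_transform(X)
    return S[:, 0].reshape(M1.shape), S[:, 1].reshape(M1.shape)
```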

  48. Unmixed Images: FCA vs ICA

  49. ICA vs FCA

  50. ICA vs FCA: FCA wins!

  51. ICA vs FCA: Objective functions
      • Insight: scalar κ4 ≈ 0 while matrix κ4 ≫ 0 ⇒ the power of the matrix embedding!
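
A toy illustration that the two statistics really can disagree, assuming a Haar-random orthogonal matrix (my example, not the talk's image data): its entries are nearly Gaussian, so the entrywise kurtosis is almost zero, while its free fourth cumulant is far from zero in absolute value (here exactly −1, since all singular values equal 1):

```python
import numpy as np
from scipy.stats import kurtosis, ortho_group

n = 500
Q = ortho_group.rvs(n, random_state=0)        # Haar-random orthogonal matrix
print(kurtosis(Q.ravel()))                    # entrywise (scalar) kappa4: ~0
s2 = np.linalg.svd(Q, compute_uv=False) ** 2  # all singular values equal 1
print((s2 ** 2).sum() / n - 2 * (s2.sum() / n) ** 2)  # free kappa4 = -1
```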

  52. FCA: Free entropy vs Free Kurtosis Maximization

  53. ICA vs FCA
      • Insight: FCA with the free entropy objective ≫ FCA with free kurtosis!

  54. Application: FCA to unmix speech
      • How to apply FCA to mixed speech signals?
