
Welcome back! Project comments are available on Glookup. Turn in homework! I am away April 15-20.


Gaussians

Population 1: Gaussian with mean µ₁ ∈ ℝᵈ, std deviation σ in each dimension.
Population 2: Gaussian with mean µ₂ ∈ ℝᵈ, std deviation σ in each dimension.
Difference between humans: σ per SNP. Difference between populations: ε per SNP.
How many SNPs must we collect to determine the population of an individual x?

Say x is in population 1. Then
E[‖x − µ₁‖²] = dσ²,
E[‖x − µ₂‖²] ≥ (d − 1)σ² + ‖µ₁ − µ₂‖².
If ‖µ₁ − µ₂‖² = dε² >> σ², the two expectations differ. → Take d >> σ²/ε².

Variance of this estimator? Roughly dσ⁴, so the noise (std deviation) is roughly √d σ².
The signal is the difference between the expectations: roughly dε².
Signal >> Noise ↔ dε² >> √d σ². Need d >> σ⁴/ε⁴.
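To make the scaling concrete, here is a minimal simulation sketch. It is not from the lecture; the parameter values and variable names are illustrative, and it assumes numpy. It classifies draws from population 1 by comparing squared distances to the two known means.

```python
# Minimal simulation sketch (illustrative, not from the lecture): classify
# draws from population 1 by comparing squared distances to the two means.
import numpy as np

rng = np.random.default_rng(0)

sigma, eps = 1.0, 0.1            # per-SNP within / between variation
d = 4 * int(sigma**4 / eps**4)   # d >> sigma^4 / eps^4, as derived above

mu1 = np.zeros(d)
mu2 = mu1 + eps                  # so ||mu1 - mu2||^2 = d * eps^2

x = mu1 + sigma * rng.standard_normal((100, d))   # 100 draws from population 1

# Squared distance to each center; guess the closer one.
d1 = ((x - mu1) ** 2).sum(axis=1)
d2 = ((x - mu2) ** 2).sum(axis=1)
print("fraction classified correctly:", (d1 < d2).mean())
```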

Projection

Population 1: Gaussian with mean µ₁ ∈ ℝᵈ, std deviation σ in each dimension.
Population 2: Gaussian with mean µ₂ ∈ ℝᵈ, std deviation σ in each dimension.
Difference between humans: σ per SNP. Difference between populations: ε per SNP.

Project x onto the unit vector v in the direction of µ₂ − µ₁.
E[(x − µ₁) · v] = 0 if x is from population 1.
E[((x − µ₁) · v)²] ≥ ‖µ₁ − µ₂‖² if x is from population 2.
The std deviation of the squared projection is σ², versus √d σ² for the full squared distance. No loss in signal!
Now dε² >> σ² suffices. → d >> σ²/ε², versus d >> σ⁴/ε⁴ before. A quadratic difference in the amount of data!
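A matching sketch for the projection test, under the same illustrative assumptions as above: a one-dimensional threshold on (x − µ₁) · v already works at d on the order of σ²/ε².

```python
# Sketch of the projection test (same assumptions as the previous sketch):
# a 1-D threshold on (x - mu1).v suffices once d >> sigma^2 / eps^2.
import numpy as np

rng = np.random.default_rng(1)

sigma, eps = 1.0, 0.1
d = 20 * int(sigma**2 / eps**2)               # d >> sigma^2 / eps^2 only

mu1 = np.zeros(d)
mu2 = mu1 + eps
v = (mu2 - mu1) / np.linalg.norm(mu2 - mu1)   # unit vector along mu2 - mu1

x = mu1 + sigma * rng.standard_normal((100, d))   # draws from population 1

p = (x - mu1) @ v                  # ~ N(0, sigma^2) under population 1
gap = np.linalg.norm(mu2 - mu1)    # = sqrt(d) * eps
print("fraction classified correctly:", (p < gap / 2).mean())
```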

Don’t know much about...

Don’t know µ₁ or µ₂?

Without the means?

Sample of n people: some (say half) from population 1, some from population 2. Which are which?

Near-neighbors approach: compute squared Euclidean distances, then cluster using a threshold.
The signal E[d(x₁, y₁)] − E[d(x₁, x₂)] should be larger than the noise in d(x, y), where the x’s are from one population and the y’s from the other.
The signal is proportional to dε². The noise is proportional to √d σ².
d >> σ⁴/ε⁴ → people of the same type are closer to each other.
d >> (σ⁴/ε⁴) log n suffices for threshold clustering; the log n factor comes from a union bound over the (n choose 2) pairs.
Best one can do?
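A sketch of the near-neighbors approach, again with illustrative parameters and numpy assumed: compute all pairwise squared distances and threshold halfway between the same-population and cross-population expectations.

```python
# Sketch of threshold clustering on pairwise squared distances
# (parameters illustrative, chosen so the gap dominates the noise).
import numpy as np

rng = np.random.default_rng(2)

sigma, eps = 1.0, 0.5
n = 20                                  # 10 people per population
d = 800 * int(sigma**4 / eps**4)        # d >> (sigma^4 / eps^4) log n

mu1 = np.zeros(d)
mu2 = mu1 + eps                         # ||mu1 - mu2||^2 = d * eps^2
pts = np.vstack([mu1 + sigma * rng.standard_normal((n // 2, d)),
                 mu2 + sigma * rng.standard_normal((n // 2, d))])

# All pairwise squared Euclidean distances.
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)

# Same-population pairs concentrate near 2*d*sigma^2, cross pairs near
# 2*d*sigma^2 + d*eps^2; threshold halfway between.
same = sq < 2 * d * sigma**2 + d * eps**2 / 2

truth = np.zeros((n, n), dtype=bool)    # ground-truth same-population mask
truth[:n // 2, :n // 2] = truth[n // 2:, n // 2:] = True
print("fraction of pairs labeled correctly:", (same == truth).mean())
```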

Principal components analysis

Remember Projection! Don’t know µ₁ or µ₂?
Principal component analysis: find the direction v of maximum variance,
i.e., maximize ∑(x · v)² (after zero-centering the points).
Recall: (x · v)² could determine the population.
Variance along a typical direction: nσ².
Along the direction µ₁ − µ₂: ∝ n‖µ₁ − µ₂‖² ∝ ndε².
Need d >> σ²/ε² at least.
When will PCA pick the correct direction with good probability? Union bound over directions. How many directions? Infinity... and beyond!
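A sketch of the whole pipeline (illustrative parameters, numpy assumed): the top right singular vector of the zero-centered data matrix is the direction of maximum variance, and the sign of the projection onto it splits the sample.

```python
# Sketch: PCA finds the separating direction without knowing mu1 or mu2.
import numpy as np

rng = np.random.default_rng(3)

sigma, eps = 1.0, 0.5
n = 200
d = 40 * int(sigma**2 / eps**2)       # only d >> sigma^2 / eps^2 per point

mu1 = np.zeros(d)
mu2 = mu1 + eps
pts = np.vstack([mu1 + sigma * rng.standard_normal((n // 2, d)),
                 mu2 + sigma * rng.standard_normal((n // 2, d))])

A = pts - pts.mean(axis=0)            # zero-center the points
v = np.linalg.svd(A, full_matrices=False)[2][0]  # maximizes sum((x.v)^2)

labels = A @ v > 0                    # split by the sign of the projection
acc = max(labels[:n // 2].mean() + (~labels[n // 2:]).mean(),
          (~labels[:n // 2]).mean() + labels[n // 2:].mean()) / 2
print("clustering accuracy:", acc)    # accuracy up to swapping the two labels
```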

Nets

“δ-net”: a set D of directions such that every direction v is close to some x ∈ D, i.e., x · v ≥ 1 − δ.
δ-net construction: vectors [⋯, iδ/√d, ⋯] with integers i ∈ [−√d/δ, √d/δ].
Total of N ∝ (d/δ)^O(d) vectors in the net.
Signal >> Noise × √(log N), with log N = O(d log(d/δ)), isolates the direction; the log N comes from a union bound over the vectors in the net.
Signal (expected projection): ∝ ndε². Noise (std deviation): ∝ √n σ².
nd >> (σ⁴/ε⁴) log d together with d >> σ²/ε² works.
Nearest neighbor works, but needs very high dimension: d >> σ⁴/ε⁴.
PCA can reduce to the “knowing the centers” case with a reasonable number of sample points.
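For completeness, the net-size count written out (a standard argument, reconstructed rather than copied from the slides):

```latex
% Counting the delta-net: coordinates are multiples of delta/sqrt(d) in
% [-1, 1], i.e. at most 2*sqrt(d)/delta + 1 choices per coordinate.
\[
  N \le \left(\frac{2\sqrt{d}}{\delta} + 1\right)^{d}
      = \left(\frac{d}{\delta}\right)^{O(d)}
  \quad\Longrightarrow\quad
  \log N = O\!\left(d \log \frac{d}{\delta}\right).
\]
```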

PCA calculation

Matrix A where the rows are the points.
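The recovered transcript ends here. The standard way this calculation continues (my reconstruction, not from the slides): the direction of maximum variance is the top eigenvector of AᵀA.

```latex
% With A the matrix whose rows are the zero-centered points a_i:
\[
  \max_{\|v\|=1} \sum_i (a_i \cdot v)^2
  = \max_{\|v\|=1} \|Av\|^2
  = \lambda_{\max}(A^{\mathsf{T}} A),
\]
% so the first principal component is the top eigenvector of A^T A,
% equivalently the top right singular vector of A.
```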
