SIGTACS Seminar Series: Metric Embeddings and Applications in Computer Science

SIGTACS Seminar Series: Metric Embeddings and Applications in Computer Science. Presented by Purushottam Kar, January 10, 2009. Sections: Introduction; Embeddings into Normed Spaces; Dimensionality Reduction; The JL Lemma; Discussion.


  1. SIGTACS Seminar Series: Metric Embeddings and Applications in Computer Science. Presented by Purushottam Kar, January 10, 2009.

  2. Outline: (1) Introduction; (2) Embeddings into Normed Spaces; (3) Dimensionality Reduction; (4) The JL Lemma; (5) Discussion.

  3. Basics. Definition (Metric): A metric is a structure $(X, \rho)$ where $\rho : X \times X \to \mathbb{R}$ is a distance measure that is non-negative, symmetric, and satisfies the triangle inequality. Definition (Embedding Distortion): An embedding $f : X \to Y$ from a metric space $(X, \rho)$ to another metric space $(Y, \sigma)$ is said to have distortion $D$ if
     $$D = \sup_{x, y \in X,\ x \neq y} \frac{\sigma(f(x), f(y))}{\rho(x, y)} \cdot \sup_{x, y \in X,\ x \neq y} \frac{\rho(x, y)}{\sigma(f(x), f(y))}.$$
     Such embeddings are also called bi-Lipschitz embeddings.
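For a finite point set, the distortion in the definition above is just the largest expansion ratio times the largest contraction ratio over all pairs of distinct points. The sketch below is illustrative only: the helper names (`distortion`, `cycle_metric`, `euclid`) and the example (the 4-cycle graph metric drawn as a unit square in the plane) are assumptions chosen for the illustration, not part of the talk.

```python
import math
from itertools import combinations


def distortion(points, rho, sigma, f):
    """Distortion of f on a finite point set: max expansion times max contraction."""
    expansion = 0.0
    contraction = 0.0
    for x, y in combinations(points, 2):
        d_src = rho(x, y)            # distance in the source metric
        d_img = sigma(f(x), f(y))    # distance between the images
        expansion = max(expansion, d_img / d_src)
        contraction = max(contraction, d_src / d_img)
    return expansion * contraction


def cycle_metric(i, j):
    """Shortest-path distance on the 4-cycle with vertices 0..3 (hypothetical example)."""
    k = abs(i - j)
    return min(k, 4 - k)


def euclid(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Place the 4-cycle on the corners of a unit square.
square = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
D = distortion(list(square), cycle_metric, euclid, square.get)
print(D)  # approximately 1.4142, i.e. sqrt(2)
```

Edges are preserved exactly, while the two diagonals shrink from graph distance 2 to Euclidean distance $\sqrt{2}$, so the contraction factor $\sqrt{2}$ is the whole distortion.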

  6. Embeddings. Various criteria are used to evaluate embeddings: distortion, stress, residual variance, ... Definition (Embedding Stress): The stress of an embedding $f : X \to Y$ from a metric space $(X, \rho)$ to another metric space $(Y, \sigma)$ is defined to be
     $$\sqrt{\frac{\sum_{x, y \in X} \left(\sigma(f(x), f(y)) - \rho(x, y)\right)^2}{\sum_{x, y \in X} \rho(x, y)^2}}.$$
     These criteria lead to very interesting algorithmic questions.
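As a rough illustration of the stress criterion, the sketch below computes it for a finite point set. It assumes the normalized square-root form (square root of the summed squared errors divided by the summed squared original distances). The function names and the 4-cycle example are hypothetical. Summing over unordered pairs instead of all ordered pairs leaves the ratio unchanged.

```python
import math
from itertools import combinations


def stress(points, rho, sigma, f):
    """Normalized stress of f on a finite point set (assumed square-root form)."""
    pairs = list(combinations(points, 2))
    num = sum((sigma(f(x), f(y)) - rho(x, y)) ** 2 for x, y in pairs)
    den = sum(rho(x, y) ** 2 for x, y in pairs)
    return math.sqrt(num / den)


def cycle_metric(i, j):
    """Shortest-path distance on the 4-cycle with vertices 0..3 (hypothetical example)."""
    k = abs(i - j)
    return min(k, 4 - k)


def euclid(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# The 4-cycle drawn as a unit square: only the two diagonals are distorted.
square = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
s = stress(list(square), cycle_metric, euclid, square.get)
print(s)
```

Unlike distortion, which is a worst-case quantity, stress averages the error over all pairs, so a few badly distorted pairs can be hidden by many well-preserved ones.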

  13. Application in Computer Science. Started out as a branch of functional analysis. Algorithmic applications: metric embeddings for datasets operating with a non-metric; dimensionality reduction to reduce storage space costs and processing time; facilitating pruning procedures in database searches; preserving residual variance (PCA), inter-point similarity (Random Projections), or stress (MDS); streaming algorithms.

  18. Embedding into $\ell_\infty$. Theorem (Fréchet's Embedding): Every $n$-point metric can be isometrically embedded into $\ell_\infty$. Fréchet's embedding technique is non-expansive: choose coordinates as projections onto (distances to) some fixed sets; the triangle inequality ensures contractive embeddings. Choice of “landmark” sets gives other algorithms. Embedding dimension can be reduced to $O(q n^{1/q} \ln n)$ by tolerating a distortion of $2q - 1$.
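A minimal sketch of the Fréchet embedding for a finite metric, taking every single point as a landmark set: point $x$ is mapped to the vector of its distances to all $n$ points. By the triangle inequality each coordinate changes by at most $\rho(x, y)$ between $x$ and $y$, and the coordinate indexed by $y$ attains exactly $\rho(x, y)$, so the sup-norm distance equals the original distance. The function names and the 5-cycle example are assumptions made for this illustration.

```python
def frechet_embedding(points, rho):
    """Map each point x to the vector (rho(x, p)) over all landmark points p."""
    return {x: [rho(x, p) for p in points] for x in points}


def linf(u, v):
    """Sup-norm (l_infinity) distance between two equal-length vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def cycle_metric(i, j):
    """Shortest-path distance on the 5-cycle with vertices 0..4 (hypothetical example)."""
    k = abs(i - j)
    return min(k, 5 - k)


points = list(range(5))
emb = frechet_embedding(points, cycle_metric)

# Every pairwise distance is preserved exactly under the sup norm.
isometric = all(
    linf(emb[x], emb[y]) == cycle_metric(x, y) for x in points for y in points
)
print(isometric)  # True
```

This uses $n$ coordinates for $n$ points. Using fewer, larger landmark sets keeps the map non-expansive but gives up exact isometry, which is the trade-off behind the dimension bound quoted above.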

  21. Embedding into $\ell_2$. Theorem (Bourgain's Embedding): Every $n$-point metric can be $O(\log n)$-embedded into $\ell_2$. Uses a random selection of the landmark sets. Tight: the graph metric of a constant-degree expander has $\Omega(\log n)$ distortion into any Euclidean space. Any embedding of the Hamming cube into $\ell_2$ incurs $\Omega(\sqrt{\log n})$ distortion.
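The random landmark selection can be sketched as follows. This is a Bourgain-style construction under stated assumptions, not the exact procedure from the talk: for each scale $j = 1, \dots, \lceil \log_2 n \rceil$ it draws a few random subsets that keep each point with probability $2^{-j}$, and uses the distance to each subset as a coordinate. The number of repetitions per scale (`reps`), the seed, and all function names are hypothetical choices. The sketch only demonstrates non-expansiveness; the $O(\log n)$ contraction bound is the content of the theorem and is not verified here.

```python
import math
import random
from itertools import combinations


def bourgain_embedding(points, rho, reps=None, seed=0):
    """Bourgain-style embedding: coordinates are distances to random landmark sets."""
    rng = random.Random(seed)
    scales = max(1, math.ceil(math.log2(len(points))))
    reps = reps or scales  # hypothetical default for repetitions per scale
    landmark_sets = []
    for j in range(1, scales + 1):
        for _ in range(reps):
            # Keep each point independently with probability 2^-j.
            subset = [p for p in points if rng.random() < 2.0 ** -j]
            if subset:  # empty sets carry no information, so skip them
                landmark_sets.append(subset)
    norm = math.sqrt(len(landmark_sets))  # rescale so the map is non-expansive
    return {
        x: [min(rho(x, a) for a in s) / norm for s in landmark_sets]
        for x in points
    }


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cycle_metric(i, j):
    """Shortest-path distance on the 16-cycle (hypothetical example)."""
    k = abs(i - j)
    return min(k, 16 - k)


points = list(range(16))
emb = bourgain_embedding(points, cycle_metric)
non_expansive = all(
    l2(emb[x], emb[y]) <= cycle_metric(x, y) + 1e-9
    for x, y in combinations(points, 2)
)
print(non_expansive)  # True
```

Each coordinate is 1-Lipschitz by the triangle inequality, so dividing by the square root of the number of coordinates guarantees that no distance grows.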

  24. Dimensionality Reduction in $\ell_1$. Impossible: a $D$-embedding of $n$ points may require $n^{\Omega(1/D^2)}$ dimensions. No “flattening” results are known for other $\ell_p$ metrics either, except for $p = 2$.

  25. The Johnson-Lindenstrauss Lemma. Theorem (The JL Lemma): Given $\epsilon > 0$ and an integer $n$, let $k \ge k_0 = O(\epsilon^{-2} \log n)$. For every set $P$ of $n$ points in $\mathbb{R}^d$ there exists $f : \mathbb{R}^d \to \mathbb{R}^k$ such that for all $u, v \in P$,
     $$(1 - \epsilon) \|u - v\|^2 \le \|f(u) - f(v)\|^2 \le (1 + \epsilon) \|u - v\|^2.$$
     Implementation as a randomized algorithm.
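One common way to realize the lemma as a randomized algorithm is a scaled Gaussian random projection, sketched below. This is an assumption about the kind of implementation meant, not the talk's specific construction. The slide leaves the constant in $k_0 = O(\epsilon^{-2} \log n)$ unspecified, so `k` is passed in directly. The dimensions, seeds, and function names are hypothetical.

```python
import math
import random
from itertools import combinations


def random_projection(d, k, seed=0):
    """Return a linear map f: R^d -> R^k with i.i.d. N(0, 1/k) matrix entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)  # makes E[|f(u)|^2] equal to |u|^2
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

    def f(u):
        return [sum(row[i] * u[i] for i in range(d)) for row in matrix]

    return f


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical data: 10 random points in R^500, projected down to R^300.
data_rng = random.Random(1)
P = [[data_rng.uniform(-1.0, 1.0) for _ in range(500)] for _ in range(10)]
f = random_projection(d=500, k=300)
images = [f(u) for u in P]

# Ratio of squared distances after and before projection, for every pair.
ratios = [
    sq_dist(images[i], images[j]) / sq_dist(P[i], P[j])
    for i, j in combinations(range(len(P)), 2)
]
print(min(ratios), max(ratios))
```

The map is chosen without looking at the data. The lemma's guarantee is probabilistic: for large enough `k`, all pairwise ratios land in $[1 - \epsilon, 1 + \epsilon]$ with positive probability, so in practice one redraws the matrix if a check fails.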
