On the Smallest Enclosing Information Disk



  1. On the Smallest Enclosing Information Disk. Frank Nielsen (1) and Richard Nock (2). (1) Sony Computer Science Laboratories, Inc., Fundamental Research Laboratory, Frank.Nielsen@acm.org. (2) University of Antilles-Guyane, DSI-GRIMAAG, Richard.Nock@martinique.univ-ag.fr. August 2006. F. Nielsen and R. Nock, On the Smallest Enclosing Information Disk.

  2. Smallest Enclosing Balls. Problem: given $S = \{s_1, ..., s_n\}$, compute a simplified description, called the center, that fits $S$ well (i.e., summarizes $S$). Two optimization criteria: (a) MINAVG: find a center $c^*$ that minimizes the average distortion w.r.t. $S$: $c^* = \arg\min_c \sum_i d(c, s_i)$. (b) MINMAX: find a center $c^*$ that minimizes the maximal distortion w.r.t. $S$: $c^* = \arg\min_c \max_i d(c, s_i)$. Investigated in applied mathematics: computational geometry (1-center problem), computational statistics (1-point estimator), machine learning (1-class classification).

  3. Smallest Enclosing Balls in Computational Geometry. The distortion measure $d(\cdot, \cdot)$ is the geometric distance. Euclidean distance $L_2$: $c^*$ is the circumcenter of $S$ for MINMAX. Squared Euclidean distance $L_2^2$: $c^*$ is the centroid of $S$ for MINAVG ($\to$ $k$-means). Euclidean distance $L_2$: $c^*$ is the Fermat-Weber point for MINAVG. [Figure: centroid (MINAVG, $L_2^2$), circumcenter (MINMAX, $L_2$), and Fermat-Weber point (MINAVG, $L_2$).]
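The three centers listed on this slide can be compared on a toy example. The following is a minimal sketch, assuming 1D points, where the centroid is the mean, the Fermat-Weber point is the median, and the circumcenter is the midrange. The function names and the brute-force grid check are illustrative and not part of the slides.

```python
from statistics import mean, median

def minavg_squared_center(points):
    """MINAVG for squared Euclidean distance: the centroid (mean)."""
    return mean(points)

def minavg_center(points):
    """MINAVG for Euclidean distance: in 1D the Fermat-Weber point is the median."""
    return median(points)

def minmax_center(points):
    """MINMAX for Euclidean distance: in 1D the circumcenter is the midrange."""
    return (min(points) + max(points)) / 2

def grid_argmin(cost, lo, hi, steps=1000):
    """Brute-force check: best candidate center on a uniform grid over [lo, hi]."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=cost)

S = [0.0, 1.0, 5.0]
print(minavg_squared_center(S), minavg_center(S), minmax_center(S))
```

For $S = \{0, 1, 5\}$ the three criteria give three different centers (2, 1 and 2.5), which is why the choice of distortion and of MINAVG versus MINMAX matters.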

  4. MINMAX in Computational Geometry (MINIBALL). Smallest Enclosing Ball [NN'04]: pioneered by Sylvester (1857); unique circumcenter $c^*$ (radius $r^*$); LP-type, with a linear-time randomized algorithm in fixed dimension $d$; weakly polynomial; efficient SOCP numerical solver; fast combinatorial heuristics ($d \geq 1000$). [Figures: MINMAX point set; MINMAX ball set.]

  5. Distortions: Bregman Divergences. Definition: Bregman divergences are families of distortions parameterized by a generator $F$. Let $F : \mathcal{X} \to \mathbb{R}$ be strictly convex and differentiable on $\mathrm{int}(\mathcal{X})$, for a convex domain $\mathcal{X} \subseteq \mathbb{R}^d$. The Bregman divergence $D_F$ is $D_F(x, y) = F(x) - F(y) - \langle x - y, \nabla F(y) \rangle$, where $\nabla F$ is the gradient operator of $F$ and $\langle \cdot, \cdot \rangle$ is the inner product (dot product). ($\to$ $D_F$ is the tail of a Taylor expansion of $F$.)
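The definition on this slide translates almost literally into code. The sketch below is illustrative only: the helper name `bregman` and the two example generators (squared Euclidean norm and a sum of exponentials) are assumptions chosen for the demonstration, not taken from the slides.

```python
import math

def bregman(F, grad_F, x, y):
    """D_F(x, y) = F(x) - F(y) - <x - y, grad F(y)> for vectors given as lists."""
    g = grad_F(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return F(x) - F(y) - inner

# Generator of the squared Euclidean distance: F(x) = sum_i x_i^2.
def F_sq(x):
    return sum(v * v for v in x)

def grad_F_sq(x):
    return [2.0 * v for v in x]

# A second strictly convex generator (illustrative): F(x) = sum_i exp(x_i).
def F_exp(x):
    return sum(math.exp(v) for v in x)

def grad_F_exp(x):
    return [math.exp(v) for v in x]

x, y = [1.0, 2.0], [3.0, 5.0]
print(bregman(F_sq, grad_F_sq, x, y))    # squared Euclidean distance between x and y
print(bregman(F_exp, grad_F_exp, x, y))  # not symmetric in general
print(bregman(F_exp, grad_F_exp, y, x))
```

With `F_sq` the divergence reduces to $\|x - y\|_2^2$ and is symmetric; with `F_exp` swapping the arguments changes the value, which is why first-type and second-type balls have to be distinguished.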

  6. Visualizing $F$ and $D_F$. [Figure: graph of $F(\cdot)$ with the points $x$ and $y$, showing $D_F(x, y)$ and $\langle x - y, \nabla F(y) \rangle$.] $D_F(x, y) = F(x) - F(y) - \langle x - y, \nabla F(y) \rangle$. ($\to$ $D_F$ is the remainder of the first-order Taylor expansion of $F$ around $y$.)

  7. Bregman Balls (Information Balls). Euclidean ball: $B_{c,r} = \{x \in \mathcal{X} : \|x - c\|_2^2 \leq r\}$ ($r$: squared radius; $L_2^2$: Bregman divergence for $F(x) = \sum_{i=1}^d x_i^2$). Theorem [BMDG'04]: the MINAVG ball for Bregman divergences is centered at the centroid.

  8. Two types of Bregman balls. First-type: $B_{c,r} = \{x \in \mathcal{X} : D_F(c, x) \leq r\}$. Second-type: $B'_{c,r} = \{x \in \mathcal{X} : D_F(x, c) \leq r\}$. Lemma: the smallest enclosing Bregman balls $B_{c^*,r^*}$ and $B'_{c^*,r^*}$ of $S$ are unique. $\to$ Consider first-type Bregman balls. (The second-type ball is obtained as a first-type ball on the dual divergence $D_{F^*}$ using the Legendre-Fenchel transformation.)

  9. Applications of Bregman Balls. Circumcenters of the smallest enclosing Bregman balls encode: (a) Squared Euclidean distance: the closest point to a set of points. $D_F(p, q) = \sum_{i=1}^d (q_i - p_i)^2 = \|p\|^2 + \|q\|^2 - 2 \langle p, q \rangle$. (b) Itakura-Saito divergence: the closest (sound) signal to a set of signals (speech recognition). $D_F(p, q) = \sum_{i=1}^d \left( \frac{p_i}{q_i} - \log \frac{p_i}{q_i} - 1 \right)$, with $F(x) = -\sum_{i=1}^d \log x_i$. (c) Kullback-Leibler divergence: the closest distribution to a set of distributions (density estimation). $D_F(p, q) = \sum_{i=1}^d \left( p_i \log \frac{p_i}{q_i} - p_i + q_i \right)$, with $F(x) = \sum_{i=1}^d x_i \log x_i$ (the negative Shannon entropy, which is convex).
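The closed forms on this slide can be checked against the generic definition $D_F(p, q) = F(p) - F(q) - \langle p - q, \nabla F(q) \rangle$. The sketch below assumes the Itakura-Saito generator $F(x) = -\sum_i \log x_i$ and, for Kullback-Leibler, the convex generator $F(x) = \sum_i x_i \log x_i$ (negative Shannon entropy). All function names are illustrative.

```python
import math

def bregman(F, grad_F, p, q):
    """Generic Bregman divergence D_F(p, q)."""
    g = grad_F(q)
    return F(p) - F(q) - sum((pi - qi) * gi for pi, qi, gi in zip(p, q, g))

# Itakura-Saito: generator and closed form.
def F_is(x):
    return -sum(math.log(v) for v in x)

def grad_F_is(x):
    return [-1.0 / v for v in x]

def itakura_saito(p, q):
    return sum(pi / qi - math.log(pi / qi) - 1.0 for pi, qi in zip(p, q))

# (Generalized) Kullback-Leibler: generator and closed form.
def F_kl(x):
    return sum(v * math.log(v) for v in x)

def grad_F_kl(x):
    return [math.log(v) + 1.0 for v in x]

def kullback_leibler(p, q):
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))

p, q = [0.2, 0.3, 0.5], [0.4, 0.4, 0.2]
print(bregman(F_is, grad_F_is, p, q), itakura_saito(p, q))
print(bregman(F_kl, grad_F_kl, p, q), kullback_leibler(p, q))
```

Each printed pair should agree up to rounding, which confirms that both divergences are instances of the same Bregman template with different generators.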

  10. Information Disks. Problem: given a set $S = \{s_1, ..., s_n\}$ of $n$ 2D vector points, compute the MINMAX center $c^* = \arg\min_c \max_i d(c, s_i)$. Goals: handle geometric points for various distortions, and handle parametric distributions (e.g., Normal distributions are parameterized by $(\mu, \sigma)$).

  11. Information Disk is LP-type. Monotonicity: for any $\mathcal{F}$ and $\mathcal{G}$ such that $\mathcal{F} \subseteq \mathcal{G} \subseteq \mathcal{X}$, $r^*(\mathcal{F}) \leq r^*(\mathcal{G})$. Locality: for any $\mathcal{F}$ and $\mathcal{G}$ such that $\mathcal{F} \subseteq \mathcal{G} \subseteq \mathcal{X}$ with $r^*(\mathcal{F}) = r^*(\mathcal{G})$, and any point $p \in \mathcal{X}$, $r^*(\mathcal{G}) < r^*(\mathcal{G} \cup \{p\}) \Rightarrow r^*(\mathcal{F}) < r^*(\mathcal{F} \cup \{p\})$. Algorithm:
      MINIINFOBALL($S = \{p_1, ..., p_n\}$, $B$):
        ⊳ Initially $B = \emptyset$. Returns $B^* = (c^*, r^*)$ ⊲
        IF $|S \cup B| \leq 3$
          RETURN $B^*$ = SOLVEINFOBASIS($S \cup B$)
        ELSE
          ⊳ Select at random $p \in S$ ⊲
          $B^*$ = MINIINFOBALL($S \setminus \{p\}$, $B$)
          IF $p \notin B^*$
            ⊳ Then add $p$ to the basis ⊲
            $B^*$ = MINIINFOBALL($S \setminus \{p\}$, $B \cup \{p\}$)
          RETURN $B^*$
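The recursive structure of MINIINFOBALL can be tried out in the simplest setting. The sketch below is not the authors' implementation: it is specialized to the squared Euclidean distance in 2D, it uses the standard Welzl base case (no points left, or three boundary points) instead of the slide's $|S \cup B| \leq 3$ test, and it replaces SOLVEINFOBASIS by the hypothetical helper `ball_from`, which computes a Euclidean circumcircle. It assumes points in general position (no three boundary points collinear).

```python
import random

EPS = 1e-9  # tolerance for the "p inside ball" test

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def ball_from(boundary):
    """Disk with all given points (at most 3) on its boundary: (center, squared radius)."""
    if not boundary:
        return (0.0, 0.0), -1.0  # empty ball: contains no point
    if len(boundary) == 1:
        return boundary[0], 0.0
    if len(boundary) == 2:
        (ax, ay), (bx, by) = boundary
        c = ((ax + bx) / 2, (ay + by) / 2)
        return c, sq_dist(c, boundary[0])
    (ax, ay), (bx, by), (cx, cy) = boundary
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))  # zero iff collinear
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), sq_dist((ux, uy), boundary[0])

def mini_ball(points, boundary=(), rng=None):
    """Smallest disk enclosing `points` with `boundary` points on its boundary."""
    rng = rng or random.Random(0)
    points, boundary = list(points), list(boundary)
    if not points or len(boundary) == 3:
        return ball_from(boundary)
    p = points.pop(rng.randrange(len(points)))   # select p at random
    center, r2 = mini_ball(points, boundary, rng)
    if sq_dist(center, p) <= r2 + EPS:           # p already inside: ball unchanged
        return center, r2
    return mini_ball(points, boundary + [p], rng)  # p must lie on the boundary

pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, 0.2)]
print(mini_ball(pts))
```

For an information disk one would swap `sq_dist` for the chosen Bregman divergence and `ball_from` for a basis solver built from linear bisectors; the recursion itself stays the same.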

  12. Computing basis (SOLVEINFOBASIS). Lemma: the first-type Bregman bisector $\mathrm{Bisector}(p, q) = \{c \in \mathcal{X} \mid D_F(c, p) = D_F(c, q)\}$ is linear. The defining equation is linear in $c$ (a hyperplane): $\mathrm{Bisector}(p, q) = \{x \mid \langle x, d_{pq} \rangle + k_{pq} = 0\}$, with $d_{pq} = \nabla F(p) - \nabla F(q)$ a vector and $k_{pq} = F(p) - F(q) + \langle q, \nabla F(q) \rangle - \langle p, \nabla F(p) \rangle$ a constant. [Figure: bisector for the Itakura-Saito divergence.]
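The linearity of the first-type bisector follows because $F(c)$ cancels in $D_F(c, p) - D_F(c, q)$, leaving $-(\langle c, d_{pq} \rangle + k_{pq})$. The sketch below checks this numerically for the Itakura-Saito generator in 2D; the function names and the sample points are illustrative assumptions.

```python
import math

def F(x):
    """Itakura-Saito generator F(x) = -sum_i log x_i (positive coordinates)."""
    return -sum(math.log(v) for v in x)

def grad_F(x):
    return [-1.0 / v for v in x]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def D(x, y):
    """Bregman divergence D_F(x, y)."""
    return F(x) - F(y) - dot([u - v for u, v in zip(x, y)], grad_F(y))

def bisector(p, q):
    """Coefficients (d_pq, k_pq) of the hyperplane <x, d_pq> + k_pq = 0."""
    gp, gq = grad_F(p), grad_F(q)
    d = [a - b for a, b in zip(gp, gq)]
    k = F(p) - F(q) + dot(q, gq) - dot(p, gp)
    return d, k

p, q = [1.0, 2.0], [3.0, 1.0]
d, k = bisector(p, q)

# Pick the point of the bisector with first coordinate 2 and compare divergences.
c = [2.0, -(k + d[0] * 2.0) / d[1]]
print(c, D(c, p), D(c, q))
```

The two printed divergences should coincide up to rounding, even though the Itakura-Saito balls themselves are far from round.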

  13. Computing basis (SOLVEINFOBASIS). Basis 3: the circumcenter is the trisector (the intersection of the 3 linear bisectors; it is enough to consider any two of them). $c^* = l_{12} \times l_{13} = l_{12} \times l_{23} = l_{13} \times l_{23}$, where $l_{ij}$ is the projective point associated with the linear bisector $\mathrm{Bisector}(p_i, p_j)$ and $\times$ is the cross-product.
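In homogeneous coordinates a 2D line $ax + by + k = 0$ is the triple $(a, b, k)$, and the cross-product of two such triples is the homogeneous intersection point. The sketch below applies this to Bregman bisectors; it is an illustration under the assumption that the generator and its gradient are passed in as plain Python functions, and all names are hypothetical.

```python
import math

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def bregman(F, grad_F, x, y):
    return F(x) - F(y) - dot([u - v for u, v in zip(x, y)], grad_F(y))

def bisector_line(F, grad_F, p, q):
    """Homogeneous line (a, b, k) of the first-type bisector of p and q."""
    gp, gq = grad_F(p), grad_F(q)
    k = F(p) - F(q) + dot(q, gq) - dot(p, gp)
    return (gp[0] - gq[0], gp[1] - gq[1], k)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def trisector(F, grad_F, p1, p2, p3):
    """Intersection of Bisector(p1, p2) and Bisector(p1, p3)."""
    X, Y, W = cross(bisector_line(F, grad_F, p1, p2),
                    bisector_line(F, grad_F, p1, p3))
    return (X / W, Y / W)  # W == 0 would mean parallel bisectors (degenerate input)

# Squared Euclidean generator: the trisector is the usual circumcenter.
F_sq = lambda x: sum(v * v for v in x)
grad_F_sq = lambda x: [2.0 * v for v in x]
print(trisector(F_sq, grad_F_sq, (0.0, 0.0), (2.0, 0.0), (1.0, 1.0)))

# Itakura-Saito generator: same code, different geometry.
F_is = lambda x: -sum(math.log(v) for v in x)
grad_F_is = lambda x: [-1.0 / v for v in x]
print(trisector(F_is, grad_F_is, (1.0, 2.0), (3.0, 1.0), (2.0, 4.0)))
```

Using the cross-product avoids solving a 2x2 linear system explicitly and makes the degenerate case visible as a zero last coordinate.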

  14. Computing basis (SOLVEINFOBASIS). Basis 2: either minimize $D_F(c, p)$ s.t. $c \in \mathrm{Bisector}(p, q)$, or better, perform a logarithmic (bisection) search on $\lambda \in [0, 1]$ for the point $r_\lambda = \nabla F^{-1}((1 - \lambda) \nabla F(p) + \lambda \nabla F(q))$ of the geodesic of $pq$ that lies on $\mathrm{Bisector}(p, q)$ ($\nabla F^{-1}$: reciprocal gradient).
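The logarithmic search for a two-point basis can be sketched as a plain bisection on $\lambda$. The code below is an illustration, not the authors' implementation: it relies on the fact that $D_F(r_\lambda, p) - D_F(r_\lambda, q)$ is negative at $\lambda = 0$ (where $r_\lambda = p$) and positive at $\lambda = 1$ (where $r_\lambda = q$), and it assumes the caller supplies the inverse gradient. All names are hypothetical.

```python
import math

def bregman(F, grad_F, x, y):
    g = grad_F(y)
    return F(x) - F(y) - sum((a - b) * c for a, b, c in zip(x, y, g))

def two_point_center(F, grad_F, grad_F_inv, p, q, iters=60):
    """Point of the geodesic r_lambda that is equidistant (first-type) from p and q."""
    gp, gq = grad_F(p), grad_F(q)

    def on_geodesic(lam):
        return grad_F_inv([(1 - lam) * a + lam * b for a, b in zip(gp, gq)])

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        r = on_geodesic(mid)
        if bregman(F, grad_F, r, p) < bregman(F, grad_F, r, q):
            lo = mid   # r is still closer to p: move towards q
        else:
            hi = mid
    return on_geodesic((lo + hi) / 2)

# Kullback-Leibler generator F(x) = sum x log x, grad = log x + 1, inverse = exp(g - 1).
F_kl = lambda x: sum(v * math.log(v) for v in x)
grad_F_kl = lambda x: [math.log(v) + 1.0 for v in x]
grad_F_kl_inv = lambda g: [math.exp(v - 1.0) for v in g]

p, q = [0.2, 0.8], [0.6, 0.4]
c = two_point_center(F_kl, grad_F_kl, grad_F_kl_inv, p, q)
print(c, bregman(F_kl, grad_F_kl, c, p), bregman(F_kl, grad_F_kl, c, q))
```

For the squared Euclidean generator the geodesic is the straight segment and the search converges to the midpoint; for other generators the geodesic is curved in the original coordinates but the same one-dimensional search applies.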

  15. Live demo: http://www.csl.sony.co.jp/person/nielsen/BregmanBall/MINIBALL/

  16. Statistical application example. Univariate Normal distribution: $N(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$. Consider the Kullback-Leibler divergence of two distributions: $KL(f, g) = \int_x f(x) \log \frac{f(x)}{g(x)} \, dx$. Canonical form of an exponential family: $N(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi}\, Z(\theta)} \exp\{\langle \theta, f(x) \rangle\}$ with $Z(\theta) = \sigma \exp\left\{ \frac{\mu^2}{2\sigma^2} \right\} = \sqrt{-\frac{1}{2\theta_1}} \exp\left\{ -\frac{\theta_2^2}{4\theta_1} \right\}$, where $f(x) = [x^2 \;\; x]^T$ are the sufficient statistics and $\theta = \left[ -\frac{1}{2\sigma^2} \;\; \frac{\mu}{\sigma^2} \right]^T$ are the natural parameters.
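The canonical form on this slide can be verified numerically: with $\theta = [-1/(2\sigma^2), \mu/\sigma^2]$ and $f(x) = [x^2, x]$, the expression $\exp\{\langle \theta, f(x) \rangle\} / (\sqrt{2\pi}\, Z(\theta))$ should reproduce the usual Normal density. The sketch below is illustrative; the function names are assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Usual (mu, sigma) parameterization of the univariate Normal density."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def natural_params(mu, sigma):
    """theta = [-1 / (2 sigma^2), mu / sigma^2]."""
    return (-1.0 / (2 * sigma ** 2), mu / sigma ** 2)

def Z(theta):
    """Z(theta) = sqrt(-1 / (2 theta_1)) * exp(-theta_2^2 / (4 theta_1))."""
    t1, t2 = theta
    return math.sqrt(-1.0 / (2 * t1)) * math.exp(-t2 ** 2 / (4 * t1))

def canonical_pdf(x, theta):
    """exp(<theta, [x^2, x]>) / (sqrt(2 pi) Z(theta))."""
    t1, t2 = theta
    return math.exp(t1 * x * x + t2 * x) / (math.sqrt(2 * math.pi) * Z(theta))

mu, sigma = 1.5, 0.7
theta = natural_params(mu, sigma)
print(Z(theta), sigma * math.exp(mu ** 2 / (2 * sigma ** 2)))
print(normal_pdf(1.0, mu, sigma), canonical_pdf(1.0, theta))
```

Both printed pairs should match up to rounding, confirming the two expressions for $Z$ and the equivalence of the two parameterizations.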

  17. The Kullback-Leibler divergence of a parametric exponential family is a Bregman divergence for $F = \log Z$: $KL(\theta_p \| \theta_q) = D_F(\theta_q, \theta_p) = \langle \theta_p - \theta_q, \theta_p[f] \rangle + \log \frac{Z(\theta_q)}{Z(\theta_p)}$, where $\theta_p[f] = \int_x \frac{f(x)}{\sqrt{2\pi}\, Z(\theta_p)} \exp\{\langle \theta_p, f(x) \rangle\} \, dx = [\mu_p^2 + \sigma_p^2 \;\; \mu_p]^T$ is the expectation of the sufficient statistics. Bisector: $\langle \theta_p - \theta_q, \theta_c[f] \rangle + \log \frac{Z(\theta_q)}{Z(\theta_p)} = 0$. 1D Gaussian distribution: change variables $(\mu, \sigma) \to (\mu^2 + \sigma^2, \mu) = (x, y)$ (with $x > y > 0$). This gives $Z(x, y) = \sqrt{x - y^2} \exp\left\{ \frac{y^2}{2(x - y^2)} \right\}$, $\log Z(x, y) = \log \sqrt{x - y^2} + \frac{y^2}{2(x - y^2)}$, and $\nabla F(x, y) = \left( \frac{1}{2(x - y^2)} - \frac{y^2}{2(x - y^2)^2}, \; \frac{y^3}{(x - y^2)^2} \right)$.
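The gradient of $\log Z(x, y) = \log\sqrt{x - y^2} + \frac{y^2}{2(x - y^2)}$ can be sanity-checked with finite differences. The sketch below differentiates that expression by hand, giving $\left( \frac{1}{2u} - \frac{y^2}{2u^2}, \frac{y^3}{u^2} \right)$ with $u = x - y^2$, and compares against central differences. The function names and the test point are illustrative assumptions.

```python
import math

def log_Z(x, y):
    """log Z in the (x, y) = (mu^2 + sigma^2, mu) coordinates; requires x > y^2."""
    u = x - y * y
    return 0.5 * math.log(u) + y * y / (2 * u)

def grad_log_Z(x, y):
    """Hand-derived gradient of log_Z with respect to (x, y)."""
    u = x - y * y
    return (1 / (2 * u) - y * y / (2 * u * u), y ** 3 / (u * u))

def numeric_grad(f, x, y, h=1e-6):
    """Central finite differences, used only as an independent check."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

mu, sigma = 1.0, 2.0
x, y = mu ** 2 + sigma ** 2, mu
print(grad_log_Z(x, y))
print(numeric_grad(log_Z, x, y))
```

The two printed gradients should agree to several decimal places; a mismatch would point to a sign or exponent slip in the closed form.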

  18. Statistical application example (cont'd). MINMAX: $(\mu^*, \sigma^*) \simeq (2.67446, 1.08313)$ and $r^* \simeq 0.801357$. MINAVG: $(\mu^{*\prime}, \sigma^{*\prime}) = (2.40909, 1.10782)$. Note that $KL(N_i, N_j) = \frac{1}{2} \left( \frac{\sigma_i^2}{\sigma_j^2} + 2 \log \frac{\sigma_j}{\sigma_i} - 1 + \frac{(\mu_j - \mu_i)^2}{\sigma_j^2} \right)$.
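The closed form for $KL(N_i, N_j)$ quoted on this slide can be compared with a direct numerical integration of $\int f \log(f/g)$. The sketch below uses a simple trapezoidal rule; the function names, the integration window and the step count are illustrative assumptions, and the input distributions are not the data set behind the slide's numbers.

```python
import math

def kl_normal(mu_i, sigma_i, mu_j, sigma_j):
    """Closed form: 1/2 (s_i^2/s_j^2 + 2 log(s_j/s_i) - 1 + (mu_j - mu_i)^2 / s_j^2)."""
    return 0.5 * (sigma_i ** 2 / sigma_j ** 2
                  + 2 * math.log(sigma_j / sigma_i)
                  - 1
                  + (mu_j - mu_i) ** 2 / sigma_j ** 2)

def log_pdf(x, mu, sigma):
    return -math.log(sigma * math.sqrt(2 * math.pi)) - (x - mu) ** 2 / (2 * sigma ** 2)

def kl_numeric(mu_i, sigma_i, mu_j, sigma_j, half_width=12.0, steps=20000):
    """Trapezoidal estimate of the integral of f log(f / g), with f = N_i and g = N_j."""
    lo = mu_i - half_width * sigma_i
    h = 2 * half_width * sigma_i / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        lf = log_pdf(x, mu_i, sigma_i)
        term = math.exp(lf) * (lf - log_pdf(x, mu_j, sigma_j))
        total += term * (0.5 if k in (0, steps) else 1.0)
    return total * h

print(kl_normal(0.0, 1.0, 1.0, 2.0), kl_numeric(0.0, 1.0, 1.0, 2.0))
```

Working with log-densities inside the integrand avoids dividing two tiny numbers in the tails, where both densities underflow.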

  19. Java applet online: www.csl.sony.co.jp/person/nielsen/BregmanBall/MINIBALL/ Source code: basic MiniBall, line intersection by projective geometry (Visual Computing: Geometry, Graphics, and Vision, ISBN 1-58450-427-7, 2005). In high dimensions, extend the Bădoiu & Clarkson core-set; see On approximating the smallest enclosing Bregman Balls (SoCG'06 video).

  20. 3D Bregman balls (video): relative entropy (KL) and Itakura-Saito.

  21. Bregman Voronoi/Delaunay. [Figure label: EXP.]
