Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry



SLIDE 1

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry

Frank Nielsen (1), Gaëtan Hadjeres (2)

École Polytechnique (1); Sony Computer Science Laboratories, Inc. (1, 2)

Conference on Geometric Science of Information

© 2015 Frank Nielsen - Gaëtan Hadjeres

SLIDE 2

The Minimum Enclosing Ball problem

Finding the Minimum Enclosing Ball (or the 1-center) of a finite point set $P = \{p_1, \dots, p_n\}$ in the metric space $(X, d_X(\cdot, \cdot))$ consists in finding $c \in X$ such that

$$c = \arg\min_{c' \in X} \max_{p \in P} d_X(c', p).$$

Figure: A finite point set P and its minimum enclosing ball MEB(P)

SLIDE 3

The approximating minimum enclosing ball problem

In a Euclidean setting, this problem is

◮ well-defined: the center $c^*$ and radius $R^*$ of the MEB are unique

◮ computationally intractable in high dimensions.

We fix an $\epsilon > 0$ and focus on the Approximate Minimum Enclosing Ball problem of finding an $\epsilon$-approximation $c \in X$ of MEB(P) such that $d_X(c, p) \le (1 + \epsilon) R^*$ for all $p \in P$.

SLIDE 4

The approximating minimum enclosing ball problem: prior work

Approximate solutions in the Euclidean case are given by Badoiu and Clarkson's algorithm [Badoiu and Clarkson, 2008]:

◮ Initialize the center $c_1 \in P$

◮ Repeat $\lfloor 1/\epsilon^2 \rfloor$ times the following update:

$$c_{i+1} = c_i + \frac{f_i - c_i}{i + 1}$$

where $f_i \in P$ is the farthest point from $c_i$.

How to deal with point sets whose underlying geometry is not Euclidean?
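The update rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `badoiu_clarkson_center` is hypothetical, points are plain tuples of floats, and the first point of the list is taken as $c_1$.

```python
import math

def badoiu_clarkson_center(points, eps):
    """Approximate Euclidean 1-center after floor(1/eps^2) farthest-point updates."""
    c = list(points[0])                       # c_1: assumed to be the first point of P
    n_iter = math.floor(1.0 / eps ** 2)
    for i in range(1, n_iter + 1):
        # f_i: the point of P farthest from the current center c_i
        f = max(points, key=lambda p: math.dist(p, c))
        # c_{i+1} = c_i + (f_i - c_i) / (i + 1)
        c = [ci + (fi - ci) / (i + 1) for ci, fi in zip(c, f)]
    return c

# Usage: corners of a square, whose exact MEB has center (0, 0) and radius sqrt(2).
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
center = badoiu_clarkson_center(square, eps=0.1)
radius = max(math.dist(p, center) for p in square)
```

The step size $1/(i+1)$ shrinks over time, so the center moves less and less toward the current farthest point.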

SLIDE 5

The approximating minimum enclosing ball problem: prior work

This algorithm has been generalized to

◮ dually flat manifolds [Nock and Nielsen, 2005]

◮ Riemannian manifolds [Arnaudon and Nielsen, 2013]

Applying these results to hyperbolic geometry gives the existence and uniqueness of MEB(P), but these results

◮ give no explicit bounds on the number of iterations

◮ assume that we are able to precisely cut geodesics.

SLIDE 6

The approximating minimum enclosing ball problem: our contribution

We analyze the case of point sets whose underlying geometry is hyperbolic. Using a closed-form formula to compute geodesic α-midpoints, we obtain

◮ an intrinsic $(1 + \epsilon)$-approximation algorithm for the approximate minimum enclosing ball problem

◮ an $O(1/\epsilon^2)$ convergence time guarantee

◮ a one-class clustering algorithm for specific subfamilies of normal distributions using their Fisher information metric

SLIDE 7

Model of d-dimensional hyperbolic geometry: the Poincaré ball model

The Poincaré ball model $(\mathbb{B}^d, \rho(\cdot, \cdot))$ consists in the open unit ball $\mathbb{B}^d = \{x \in \mathbb{R}^d : \|x\| < 1\}$ together with the hyperbolic distance

$$\rho(p, q) = \operatorname{arcosh}\left(1 + \frac{2\|p - q\|^2}{(1 - \|p\|^2)(1 - \|q\|^2)}\right), \quad \forall p, q \in \mathbb{B}^d.$$

This distance induces on the metric space $(\mathbb{B}^d, \rho)$ a Riemannian structure.
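As a small illustration of the distance formula, here is a pure-Python sketch. The names `dot` and `rho` are chosen for this example only, and points are assumed to be tuples of floats lying strictly inside the unit ball (no input validation is done).

```python
import math

def dot(u, v):
    """Euclidean inner product of two coordinate sequences."""
    return sum(a * b for a, b in zip(u, v))

def rho(p, q):
    """Hyperbolic distance between p and q in the Poincaré ball model."""
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))          # ||p - q||^2
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(p, p)) * (1.0 - dot(q, q))))

# Usage: from the origin to (0.5, 0) the distance is ln((1 + 0.5) / (1 - 0.5)) = ln 3.
d = rho((0.0, 0.0), (0.5, 0.0))
```

Distances blow up near the boundary: the denominator tends to zero as either point approaches the unit sphere.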

SLIDE 8

Geodesics in the Poincaré ball model

Shortest paths between two points (geodesics) are exactly

◮ straight (Euclidean) lines passing through the origin

◮ circle arcs orthogonal to the unit sphere

Figure: “Straight” lines in the Poincaré ball model

SLIDE 9

Circles in the Poincaré ball model

Circles in the Poincaré ball model

◮ look like Euclidean circles

◮ but with a different center

Figure: Difference between the Euclidean MEB (in blue) and the hyperbolic MEB (in red) for the set of blue points in the hyperbolic Poincaré disk (in black). The red cross is the hyperbolic center of the red circle while the pink one is its Euclidean center.

SLIDE 10

Translations in the Poincaré ball model

$$T_p(x) = \frac{(1 - \|p\|^2)\, x + (\|x\|^2 + 2\langle x, p \rangle + 1)\, p}{\|p\|^2 \|x\|^2 + 2\langle x, p \rangle + 1}$$

Figure: Tiling of the hyperbolic plane by squares
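The translation can be checked numerically with a short sketch. It assumes the formula reads $T_p(x) = \big((1 - \|p\|^2)x + (\|x\|^2 + 2\langle x, p\rangle + 1)p\big) / \big(\|p\|^2\|x\|^2 + 2\langle x, p\rangle + 1\big)$; the helper names `dot`, `rho` and `translate` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rho(p, q):
    """Hyperbolic distance in the Poincaré ball model."""
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(p, p)) * (1.0 - dot(q, q))))

def translate(p, x):
    """T_p(x): hyperbolic translation that sends the origin to p."""
    pp, xx, xp = dot(p, p), dot(x, x), dot(x, p)
    denom = pp * xx + 2.0 * xp + 1.0          # >= (1 - |p||x|)^2 > 0 inside the ball
    return [((1.0 - pp) * xi + (xx + 2.0 * xp + 1.0) * pi) / denom
            for xi, pi in zip(x, p)]

# Usage: T_p maps the origin to p, T_{-p} maps p back to the origin,
# and hyperbolic distances are preserved.
p, x, y = (0.3, 0.1), (0.1, -0.4), (-0.6, 0.2)
neg_p = [-pi for pi in p]
back_to_origin = translate(neg_p, p)
gap = abs(rho(translate(p, x), translate(p, y)) - rho(x, y))
```

In this sketch each call costs a handful of inner products, which is where a linear cost in the dimension d comes from.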

SLIDE 12

Closed-form formula for computing α-midpoints

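A point $m_\alpha$ is the α-midpoint $p \#_\alpha q$ of two points p, q for $\alpha \in [0, 1]$ if

◮ $m_\alpha$ belongs to the geodesic joining the two points p, q

◮ $m_\alpha$ verifies $\rho(p, m_\alpha) = \alpha \rho(p, q)$.

For the special case $p = (0, \dots, 0)$, $q = (x_q, 0, \dots, 0)$, we have $p \#_\alpha q := (x_\alpha, 0, \dots, 0)$ with

$$x_\alpha = \frac{c_{\alpha,q} - 1}{c_{\alpha,q} + 1}, \quad \text{where } c_{\alpha,q} := e^{\alpha \rho(p, q)} = \left(\frac{1 + x_q}{1 - x_q}\right)^{\alpha}.$$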
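A quick numerical check of the special case, as a sketch: the function name `x_alpha` is hypothetical, and the code assumes the reading $x_\alpha = (c_{\alpha,q} - 1)/(c_{\alpha,q} + 1)$ with $c_{\alpha,q} = \big((1 + x_q)/(1 - x_q)\big)^\alpha$. It uses the fact that the hyperbolic distance from the origin to $(x, 0, \dots, 0)$ is $2\,\mathrm{artanh}(x)$.

```python
import math

def x_alpha(x_q, alpha):
    """First coordinate of 0 #_alpha (x_q, 0, ..., 0), for 0 <= x_q < 1."""
    c = ((1.0 + x_q) / (1.0 - x_q)) ** alpha      # c_{alpha,q} = exp(alpha * rho(0, q))
    return (c - 1.0) / (c + 1.0)

# Usage: the half-way point between the origin and (0.5, 0, ..., 0).
half = x_alpha(0.5, 0.5)
dist_to_half = 2.0 * math.atanh(half)             # rho(0, m) along the axis
dist_to_q = 2.0 * math.atanh(0.5)                 # rho(0, q) = ln 3
```

Note that the half-way point is not the Euclidean midpoint 0.25: it sits slightly closer to q, where hyperbolic lengths are stretched.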

SLIDE 13

Closed-form formula for computing α-midpoints

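Noting that

$$p \#_\alpha q = T_p\left(T_{-p}(p) \,\#_\alpha\, T_{-p}(q)\right) \quad \forall p, q \in \mathbb{B}^d,$$

where $T_{-p}(p)$ is the origin, we obtain

◮ a closed-form formula for computing $p \#_\alpha q$

◮ a way to compute $p \#_\alpha q$ in linear time $O(d)$

◮ the fact that these transformations are exact.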
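The composition above can be sketched as follows. Two assumptions are made here and are not taken from the slides: the translated point $T_{-p}(q)$ is handled by applying the one-dimensional formula along the ray from the origin through it (geodesics through the origin are straight lines), and all function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rho(p, q):
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(p, p)) * (1.0 - dot(q, q))))

def translate(p, x):
    """T_p(x): hyperbolic translation that sends the origin to p."""
    pp, xx, xp = dot(p, p), dot(x, x), dot(x, p)
    denom = pp * xx + 2.0 * xp + 1.0
    return [((1.0 - pp) * xi + (xx + 2.0 * xp + 1.0) * pi) / denom
            for xi, pi in zip(x, p)]

def midpoint_from_origin(q, alpha):
    """0 #_alpha q, computed along the ray from the origin through q."""
    norm = math.sqrt(dot(q, q))
    if norm == 0.0:
        return list(q)
    c = ((1.0 + norm) / (1.0 - norm)) ** alpha
    return [(c - 1.0) / (c + 1.0) * qi / norm for qi in q]

def alpha_midpoint(p, q, alpha):
    """p #_alpha q = T_p(0 #_alpha T_{-p}(q)), using T_{-p}(p) = 0."""
    q0 = translate([-pi for pi in p], q)
    return translate(p, midpoint_from_origin(q0, alpha))

# Usage: a point at 30% of the way from p to q along the geodesic.
p, q = (0.3, 0.1), (-0.2, 0.5)
m = alpha_midpoint(p, q, 0.3)
```

Every step is a fixed number of inner products and scalings, so the cost is linear in the dimension.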

SLIDE 14

(1+ε)-approximation of a hyperbolic enclosing ball of fixed radius

For a fixed radius $r > R^*$, we can find $c \in \mathbb{B}^d$ such that $\rho(c, p) \le (1 + \epsilon) r$ for all $p \in P$ with

Algorithm 1: $(1 + \epsilon)$-approximation of EHB(P, r)

$c_0 := p_1$
$t := 0$
while $\exists p \in P$ such that $p \notin B(c_t, (1 + \epsilon) r)$ do
    let $p \in P$ be such a point
    $\alpha := \dfrac{\rho(c_t, p) - r}{\rho(c_t, p)}$
    $c_{t+1} := c_t \#_\alpha p$
    $t := t + 1$
end while
return $c_t$
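Algorithm 1 can be sketched in Python as below. The sketch makes choices the slides leave open, all of them assumptions: the violating point is taken to be the farthest one, an optional `max_iter` cap returns `None` instead of looping forever when r is too small, and the helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rho(p, q):
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(p, p)) * (1.0 - dot(q, q))))

def translate(p, x):
    pp, xx, xp = dot(p, p), dot(x, x), dot(x, p)
    denom = pp * xx + 2.0 * xp + 1.0
    return [((1.0 - pp) * xi + (xx + 2.0 * xp + 1.0) * pi) / denom
            for xi, pi in zip(x, p)]

def alpha_midpoint(p, q, alpha):
    """p #_alpha q via translation of p to the origin."""
    q0 = translate([-pi for pi in p], q)
    norm = math.sqrt(dot(q0, q0))
    if norm == 0.0:
        return list(p)
    c = ((1.0 + norm) / (1.0 - norm)) ** alpha
    m0 = [(c - 1.0) / (c + 1.0) * qi / norm for qi in q0]
    return translate(p, m0)

def ehb_center(points, r, eps, max_iter=None):
    """Center c with rho(c, p) <= (1 + eps) * r for all p, or None if interrupted."""
    c, t = list(points[0]), 0                 # c_0 := p_1
    while True:
        far = max(points, key=lambda p: rho(c, p))   # assumption: pick the farthest point
        d = rho(c, far)
        if d <= (1.0 + eps) * r:
            return c                          # every point is inside B(c, (1 + eps) r)
        if max_iter is not None and t >= max_iter:
            return None                       # interrupted: r was probably below R*
        c = alpha_midpoint(c, far, (d - r) / d)      # move until far is at distance r
        t += 1

# Usage: three points whose exact MEHB is centered at the origin with R* = ln 3.
P = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5)]
c = ehb_center(P, r=1.2, eps=0.1)
```

Each update places the new center exactly at distance r from the chosen point, so the center moves by more than εr per step.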

SLIDE 15

Idea of the proof

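By the hyperbolic law of cosines:

$$\cosh(\rho_t) \ge \cosh(h)\,\cosh(\rho_{t+1}).$$

Iterating this inequality over $T$ updates, and using $\cosh \ge 1$, gives

$$\cosh(\rho_1) \ge \cosh(h)^T \ge \cosh(\epsilon r)^T.$$

Figure: Update of $c_t$ (labels: $c_t$, $c_{t+1}$, $c^*$, $p_t$, $h > \epsilon r$, $\rho_t$, $\rho_{t+1}$, $r' \le r$, $r$, $\theta$, $\theta'$)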

SLIDE 17

(1+ε)-approximation of a hyperbolic enclosing ball of fixed radius

The EHB(P, r) algorithm is an $O(1/\epsilon^2)$-time algorithm which returns

◮ the center of a hyperbolic enclosing ball with radius $(1 + \epsilon) r$

◮ in less than $4/\epsilon^2$ iterations.

Our error with the true MEHB center $c^*$ verifies

$$\rho(c, c^*) \le \operatorname{arcosh}\left(\frac{\cosh((1 + \epsilon) r)}{\cosh(R^*)}\right)$$

SLIDE 19

$(1 + \epsilon + \epsilon^2/4)$-approximation of MEHB(P)

In fact, as $R^*$ is unknown in general, the EHB algorithm returns for any $r$:

◮ a $(1 + \epsilon)$-approximation of EHB(P) if $r \ge R^*$

◮ the fact that $r < R^*$ if the result obtained after more than $4/\epsilon^2$ iterations is not good enough.

This suggests implementing a dichotomic search in order to compute an approximation of the minimal hyperbolic enclosing ball. We obtain

◮ a $(1 + \epsilon + \epsilon^2/4)$-approximation of MEHB(P)

◮ in $O\left(\frac{N}{\epsilon^2} \log \frac{1}{\epsilon}\right)$ iterations.

SLIDE 20

$(1 + \epsilon + \epsilon^2/4)$-approximation of MEHB(P) algorithm

Algorithm 2: $(1 + \epsilon)$-approximation of MEHB(P)

$c := p_1$
$r_{\max} := \rho(c, P)$; $r_{\min} := r_{\max}/2$; $t_{\max} := +\infty$
$r := r_{\max}$
repeat
    $c_{temp} := \mathrm{Alg1}(P, r, \epsilon/2)$, interrupt if $t > t_{\max}$ in Alg1
    if call of Alg1 has been interrupted then
        $r_{\min} := r$
    else
        $r_{\max} := r$; $c := c_{temp}$
    end if
    $dr := \dfrac{r_{\max} - r_{\min}}{2}$; $r := r_{\min} + dr$; $t_{\max} := \dfrac{\log(\cosh((1 + \epsilon/2)\, r)) - \log(\cosh(r_{\min}))}{\log(\cosh(r \epsilon / 2))}$
until $2\, dr < r_{\min}\, \epsilon / 2$
return $c$
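The dichotomic search can be sketched as below. This is not a line-by-line transcription of Algorithm 2: as a simplifying assumption, the adaptive $t_{\max}$ is replaced by the fixed cap $\lceil 16/\epsilon^2 \rceil$, which is the $4/\epsilon'^2$ iteration bound applied with $\epsilon' = \epsilon/2$. Here $\rho(c, P)$ is read as the largest distance from c to a point of P, and all function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rho(p, q):
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - dot(p, p)) * (1.0 - dot(q, q))))

def translate(p, x):
    pp, xx, xp = dot(p, p), dot(x, x), dot(x, p)
    denom = pp * xx + 2.0 * xp + 1.0
    return [((1.0 - pp) * xi + (xx + 2.0 * xp + 1.0) * pi) / denom
            for xi, pi in zip(x, p)]

def alpha_midpoint(p, q, alpha):
    q0 = translate([-pi for pi in p], q)
    norm = math.sqrt(dot(q0, q0))
    if norm == 0.0:
        return list(p)
    c = ((1.0 + norm) / (1.0 - norm)) ** alpha
    return translate(p, [(c - 1.0) / (c + 1.0) * qi / norm for qi in q0])

def ehb_center(points, r, eps, max_iter):
    """Algorithm 1 sketch; returns None when interrupted after max_iter updates."""
    c, t = list(points[0]), 0
    while True:
        far = max(points, key=lambda p: rho(c, p))
        d = rho(c, far)
        if d <= (1.0 + eps) * r:
            return c
        if t >= max_iter:
            return None
        c = alpha_midpoint(c, far, (d - r) / d)
        t += 1

def mehb_center(points, eps):
    """Bisection on the radius r between r_min and r_max."""
    c = list(points[0])
    r_max = max(rho(c, p) for p in points)        # rho(c, P)
    if r_max == 0.0:                              # assumption: all points coincide
        return c
    r_min = r_max / 2.0
    t_max = math.ceil(16.0 / eps ** 2)            # assumption: fixed cap, not the adaptive t_max
    r = r_max
    while True:
        c_temp = ehb_center(points, r, eps / 2.0, t_max)
        if c_temp is None:
            r_min = r                             # interrupted: r is treated as too small
        else:
            r_max, c = r, c_temp
        dr = (r_max - r_min) / 2.0
        r = r_min + dr
        if 2.0 * dr < r_min * eps / 2.0:
            return c

# Usage: the exact MEHB of these points is centered at the origin with R* = ln 3.
P = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.3)]
eps = 0.2
c = mehb_center(P, eps)
radius = max(rho(c, p) for p in P)
```

The returned center always comes from a successful call of Algorithm 1, so its enclosing radius is at most $(1 + \epsilon/2)$ times the last accepted r.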

SLIDE 21

Experimental results

◮ The number of iterations does not depend on d.

Figure : Number of α-midpoint calculations as a function of ǫ in logarithmic scale for different values of d.

SLIDE 22

Experimental results

◮ The running time is approximately $O(dn/\epsilon^2)$ (vertical translation in logarithmic scale).

Figure: Execution time as a function of $\epsilon$ in logarithmic scale for different values of d.

SLIDE 23

Applications

Hyperbolic geometry arises when considering certain subfamilies of multivariate normal distributions. For instance, the following subfamilies

◮ $N(\mu, \sigma^2 I_n)$ of n-variate normal distributions with scalar covariance matrix ($I_n$ is the n × n identity matrix),

◮ $N(\mu, \mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2))$ of n-variate normal distributions with diagonal covariance matrix,

◮ $N(\mu_0, \Sigma)$ of d-variate normal distributions with fixed mean $\mu_0$ and arbitrary positive definite covariance matrix $\Sigma$

are statistical manifolds whose Fisher information metric is hyperbolic.

SLIDE 24

Applications

In particular, our results apply to the two-dimensional location-scale subfamily:

Figure: MEHB (D) of probability density functions (left) in the (µ, σ) upper half-plane (right). P = {A, B, C}.

SLIDE 25

Openings

Plugging the EHB and MEHB algorithms in to compute cluster centers in the approximation algorithm of [Gonzalez, 1985], we obtain approximate algorithms for

◮ covering in hyperbolic spaces

◮ the k-center problem in $O\left(\frac{kNd}{\epsilon^2} \log \frac{1}{\epsilon}\right)$

SLIDE 26

Algorithm 3: Gonzalez farthest-first traversal approximation algorithm

$C_1 := P$, $i := 0$
while $i \le k$ do
    $\forall j \le i$, compute $c_j := \mathrm{MEB}(C_j)$
    $\forall j \le i$, set $f_j := \arg\max_{p \in P} \rho(p, c_j)$
    find $f \in \{f_j\}$ whose distance to its cluster center is maximal
    create cluster $C_i$ containing $f$
    add to $C_i$ all points whose distance to $f$ is smaller than the distance to their cluster center
    increment $i$
end while
return $\{C_i\}_i$
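For orientation, here is a sketch of the classic farthest-first traversal with the hyperbolic distance. It is deliberately simpler than Algorithm 3: centers are data points and are not recomputed as minimum enclosing ball centers, the first center is arbitrarily the first point, and the names `rho` and `farthest_first` are hypothetical.

```python
import math

def rho(p, q):
    """Hyperbolic distance in the Poincaré ball model."""
    diff2 = sum((a - b) ** 2 for a, b in zip(p, q))
    pp = sum(a * a for a in p)
    qq = sum(b * b for b in q)
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - pp) * (1.0 - qq)))

def farthest_first(points, k):
    """Pick k centers among the points; return (center indices, cluster labels)."""
    centers = [0]                                    # assumption: start from the first point
    dist = [rho(points[0], p) for p in points]       # distance to the nearest chosen center
    labels = [0] * len(points)
    while len(centers) < k:
        far = max(range(len(points)), key=lambda i: dist[i])   # farthest point so far
        centers.append(far)
        for i, p in enumerate(points):
            d = rho(points[far], p)
            if d < dist[i]:                          # p is closer to the new center
                dist[i], labels[i] = d, len(centers) - 1
    return centers, labels

# Usage: two well-separated groups near opposite sides of the Poincaré disk.
pts = [(0.8, 0.0), (0.82, 0.01), (-0.8, 0.0), (-0.81, -0.02)]
centers, labels = farthest_first(pts, k=2)
```

Replacing each data-point center by an approximate MEHB center of its cluster is the refinement that Algorithm 3 describes.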

SLIDE 27

Openings

The computation of the minimum enclosing hyperbolic ball does not necessarily involve all points p ∈ P.

◮ Core-sets in hyperbolic geometry

    ◮ the MEHB obtained by the algorithm is an $\epsilon$-core-set

    ◮ differences with the Euclidean setting: core-sets are of size at most $\lfloor 1/\epsilon \rfloor$ [Badoiu and Clarkson, 2008]

SLIDE 28

Thank you!

SLIDE 29

Bibliography I

Arnaudon, M. and Nielsen, F. (2013). On approximating the Riemannian 1-center. Computational Geometry, 46(1):93–104.

Badoiu, M. and Clarkson, K. L. (2008). Optimal core-sets for balls. Computational Geometry, 40(1):14–22.

Gonzalez, T. F. (1985). Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306.

Nock, R. and Nielsen, F. (2005). Fitting the smallest enclosing Bregman ball. In Machine Learning: ECML 2005, pages 649–656. Springer.