CS6100: Topics in Design and Analysis of Algorithms: Point Location


slide-1
SLIDE 1

CS6100: Topics in Design and Analysis of Algorithms

Point Location John Augustine

CS6100 (Even 2012): Point Location

slide-2
SLIDE 2

Planar Subdivision

Recall that a planar subdivision is a straight-edge embedding of a (possibly disconnected) planar graph; it comprises vertices, edges, and faces. In a DCEL representation, each edge is represented as two “half” edges.

  • Each vertex record of a vertex v stores the coordinates of v and a pointer to an arbitrary half edge originating from v.

  • Each face record of a face f stores
      1. a pointer to one of the half edges on its outer boundary (this is null for the unbounded face), and
      2. a list of pointers to half edges, one chosen arbitrarily for each “hole” inside f.

  • Each half edge bounds a face f, which is the face to our left as we walk along the half edge. For each half edge e, we store
      1. a pointer to its origin vertex,
      2. a pointer to its twin half edge,
      3. a pointer to the face it bounds.
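The three record types above can be sketched in Python roughly as follows. This is only an illustrative layout, not the course's implementation: the class names, the field names, and the helper `make_edge` are assumptions introduced here, and only the fields listed on this slide are modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Vertex:
    x: float
    y: float
    # An arbitrary half edge that has this vertex as its origin.
    incident_edge: Optional[HalfEdge] = None


@dataclass(eq=False)
class Face:
    # One half edge on the outer boundary; None for the unbounded face.
    outer: Optional[HalfEdge] = None
    # One arbitrarily chosen half edge per hole inside the face.
    holes: list[HalfEdge] = field(default_factory=list)


@dataclass(eq=False)
class HalfEdge:
    origin: Vertex
    twin: Optional[HalfEdge] = None
    # The face to the left when walking along this half edge.
    face: Optional[Face] = None


def make_edge(u: Vertex, v: Vertex, left: Face, right: Face) -> HalfEdge:
    """Hypothetical helper: create the two half edges of the edge uv.

    `left` is the face to the left of the walk u -> v, `right` the face
    to the left of the walk v -> u.
    """
    e = HalfEdge(origin=u, face=left)
    t = HalfEdge(origin=v, face=right)
    e.twin, t.twin = t, e
    if u.incident_edge is None:
        u.incident_edge = e
    if v.incident_edge is None:
        v.incident_edge = t
    return e
```

With this layout, `e.twin.twin is e` holds for every half edge, and the destination of `e` is simply `e.twin.origin`.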


slide-3
SLIDE 3

Point Location

Given for preprocessing: A planar subdivision S comprising n edges, stored as a Doubly Connected Edge List (DCEL).

Queries: Each query is a point q. We are to quickly report the face that contains q.

Application: Clicking on an online map should report the country that was clicked.

Assumptions:

  • No two distinct endpoints of edges lie on the same vertical line.

  • The subdivision is enclosed by a large axis-parallel rectangle R.

We say that such subdivisions are in general position.


slide-4
SLIDE 4

Point Location: Easy Solution

Further subdivide S by adding vertical lines through each vertex. This creates several vertical slabs whose x-coordinates can be preprocessed (for example, kept in sorted order) so that, given a query q, we can quickly find the slab in which it falls. Within each slab, the edges go all the way across without intersecting each other. Therefore, they can be ordered top to bottom, and a balanced binary search tree (BBST) can be constructed for each slab as well. Given q, we find the slab that contains q and then use the associated BBST to return the face in the slab that contains q. This in turn leads us to the face in S that contains q.

slide-5
SLIDE 5

The query time is O(log n). However, the complexity of the planar subdivision can increase significantly (to Θ(n^2)) when vertical lines are added. Hence storage is Θ(n^2) in the worst case.

[Figure: worst-case example, labelled “n/4 slabs” and “n/4”.]
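A minimal sketch of the slab idea, under some assumptions that are not in the slides: segments are given as pairs of points with the left endpoint first, no segment is vertical, and sorted Python lists with binary search stand in for the BBSTs (which is enough for a static structure). The function names are hypothetical. Instead of a face, the query returns the segment directly below q, from which the face could be looked up.

```python
from bisect import bisect_right


def y_at(seg, x):
    """y-coordinate of a non-vertical segment at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def build_slabs(segments):
    """Return the sorted slab boundaries and, per slab, the segments
    crossing it, ordered bottom to top."""
    xs = sorted({p[0] for s in segments for p in s})
    slabs = []
    for left, right in zip(xs, xs[1:]):
        mid = (left + right) / 2
        # Segments that span the whole slab [left, right].
        crossing = [s for s in segments if s[0][0] <= left and right <= s[1][0]]
        crossing.sort(key=lambda s: y_at(s, mid))
        slabs.append(crossing)
    return xs, slabs


def segment_below(xs, slabs, q):
    """Segment directly below (or through) q, or None if there is none."""
    qx, qy = q
    i = bisect_right(xs, qx) - 1          # first binary search: find the slab
    if i < 0 or i >= len(slabs):
        return None
    crossing = slabs[i]
    lo, hi = 0, len(crossing)
    while lo < hi:                        # second binary search: within the slab
        mid = (lo + hi) // 2
        if y_at(crossing[mid], qx) <= qy:
            lo = mid + 1
        else:
            hi = mid
    return crossing[lo - 1] if lo > 0 else None
```

Each slab keeps its own copy of every segment that crosses it, which is exactly where the quadratic worst-case storage comes from.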


slide-6
SLIDE 6

Trapezoidal Map aka Vertical Decomposition of S

From each vertex in S, draw two vertical extensions, one going upward and the other going downward. Each extension stops when it meets another segment (or the boundary of R). The trapezoidal map T(S) consists of (i) the subdivision S, (ii) the rectangle R, and (iii) the upper and lower vertical extensions.

[Figure: a trapezoidal map inside the bounding rectangle R.]

slide-7
SLIDE 7

Properties of Faces in Trapezoidal Maps

First, a definition. Recall that the subdivision is in general position. Nevertheless, some of the edges around a face in T(S) may be adjacent and collinear; we merge such edges and call the result a side.

[Figure: a face of T(S) with its sides marked.]

Lemma 1. Each face in a trapezoidal map T(S) has either one or two vertical sides and exactly 2 non-vertical sides.

Proof Sketch. We first show that each face in T(S) is convex. Simply look at the interior angles at the corners of each face. They will all be < 180°. Therefore, the number of vertical sides is at most 2. Secondly, show that no two non-vertical sides of a face can be adjacent. This implies at most 2 non-vertical sides.

slide-8
SLIDE 8

Four Objects that define a Face

Let ∆ be a face in T(S). Define top(∆) and bottom(∆) to be the segments that bound ∆ from above and from below, as shown in the figure.

[Figure: a trapezoid ∆ with top(∆) and bottom(∆) labelled.]

Observe that the left side (symmetrically, the right side) can fall into one of five cases. Four cases, (a) to (d), are shown below. The fifth case is when the left side is the left edge of the bounding box R.

[Figure: cases (a), (b), (c), and (d) for the left side of ∆, each labelled with top(∆), bottom(∆), and leftp(∆); case (d) also shows a third segment s.]

Note that in each case a single vertex, leftp(∆), “defines” the left vertical side. Similarly, we define rightp(∆).

slide-9
SLIDE 9

Bounding the Complexity of T (S)

Lemma 2. The trapezoidal map of a set S of n line segments in general position contains (i) at most 6n + 4 vertices and (ii) at most 3n + 1 trapezoids.

Proof Sketch. (i) Each of the at most 2n endpoints in S spawns 2 further vertices in T(S), where its two vertical extensions end, giving at most 3 · 2n = 6n vertices. The bounding box R contributes 4 more. (ii) Focus on leftp(∆) for each ∆. The right endpoint of a line segment in S is the leftp for at most 1 face. The left endpoint of a line segment in S is the leftp for at most 2 faces. Finally, the bottom-left corner of R is the leftp for one face. In total, this gives at most n + 2n + 1 = 3n + 1 trapezoids.

slide-10
SLIDE 10

Representing Trapezoidal Maps

We say that two trapezoids ∆ and ∆′ are adjacent if they meet along a vertical edge. When S is in general position, each trapezoid has at most four neighbours: (i) lower left, (ii) upper left, (iii) lower right, and (iv) upper right.

To store T(S), we have records for:

  • Line segments in S.
  • Endpoints.
  • Trapezoids, each defined by pointers to its leftp, rightp, top, and bottom. Furthermore, pointers to its (at most) four neighbours are also included.

The geometry of the trapezoid can be deduced from this information.
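To illustrate the last remark, here is a rough sketch of a trapezoid record from which the corner coordinates are deduced on demand. The class `Trapezoid`, its field names, and the method `corners` are assumptions made for this example; segments are taken to be non-vertical pairs of points with the left endpoint first.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

Point = tuple[float, float]
Segment = tuple[Point, Point]


def y_at(seg: Segment, x: float) -> float:
    """y-coordinate of a non-vertical segment at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


@dataclass(eq=False)
class Trapezoid:
    top: Segment
    bottom: Segment
    leftp: Point
    rightp: Point
    # At most four neighbours; None where a neighbour does not exist.
    upper_left: Optional[Trapezoid] = None
    lower_left: Optional[Trapezoid] = None
    upper_right: Optional[Trapezoid] = None
    lower_right: Optional[Trapezoid] = None

    def corners(self) -> list[Point]:
        """Corners in counter-clockwise order, starting at the lower left.

        The left and right sides lie on the vertical lines through leftp
        and rightp; top and bottom supply the y-coordinates.
        """
        xl, xr = self.leftp[0], self.rightp[0]
        return [
            (xl, y_at(self.bottom, xl)),
            (xr, y_at(self.bottom, xr)),
            (xr, y_at(self.top, xr)),
            (xl, y_at(self.top, xl)),
        ]
```

Storing only the four defining objects keeps each record at constant size, so the whole map takes O(n) space by Lemma 2.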


slide-11
SLIDE 11

A Randomized Incremental Algorithm for Constructing the Point Location Data Structure

T(S) can be constructed easily by a plane sweep from left to right. However, when a query point q is given to us, we need to navigate T(S) to find the trapezoid ∆q in which q is located. For this, we need an associated search structure, a directed acyclic graph (DAG) D. D has a single root node, and its leaves represent trapezoids. Inner nodes of D have out-degree 2. There are two types of inner nodes: the x-nodes correspond to endpoints of segments in S, and the y-nodes correspond to segments in S. When queried with a point q, we start at the root and traverse to a leaf, which points to the trapezoid in T(S) that contains q. There are two types of questions we ask at each node to guide our traversal.

slide-12
SLIDE 12

At an x-node, we ask if q is to the left of the vertex associated with the x-node. If yes, we go to the left child; otherwise, we go to the right child. At a y-node, we ask if q is above the segment associated with the y-node. If yes, we go to the left child; otherwise, we go to the right child.

[Figure: two segments s1 (endpoints p1, q1) and s2 (endpoints p2, q2), the faces A to G of their trapezoidal map, and the corresponding DAG with x-nodes p1, q1, p2, q2, y-nodes s1, s2, and leaves A to G.]
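The traversal can be sketched as below. The node classes, the helper `is_above`, and the small example DAG are illustrative assumptions rather than the structure from the figure: the example encodes a single segment s from p to r inside the bounding box, with leaves A (left of p), B (right of r), C (above s), and D (below s).

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    name: str            # stands in for a pointer to a trapezoid


@dataclass
class XNode:
    point: tuple         # an endpoint of a segment
    left: object         # child for queries to the left of the point
    right: object        # child for all other queries


@dataclass
class YNode:
    segment: tuple       # ((x1, y1), (x2, y2)), left endpoint first
    left: object         # child for queries above the segment
    right: object        # child for queries below the segment


def is_above(q, seg):
    """True if q lies strictly above the line through seg (cross product test)."""
    (x1, y1), (x2, y2) = seg
    return (x2 - x1) * (q[1] - y1) - (y2 - y1) * (q[0] - x1) > 0


def locate(node, q):
    """Walk from the given node down to the leaf whose trapezoid contains q."""
    while not isinstance(node, Leaf):
        if isinstance(node, XNode):
            node = node.left if q[0] < node.point[0] else node.right
        else:
            node = node.left if is_above(q, node.segment) else node.right
    return node


# Hypothetical example: one segment s = pr inside the bounding box.
p, r = (2, 2), (6, 4)
s = (p, r)
root = XNode(p, Leaf("A"), XNode(r, YNode(s, Leaf("C"), Leaf("D")), Leaf("B")))
```

For instance, `locate(root, (4, 5)).name` evaluates to `"C"`, since the query passes both x-tests and lies above s. Queries that fall exactly on an endpoint or on a segment need extra care, which this sketch ignores.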


slide-13
SLIDE 13

Overview of the Algorithm:

Input: A set S of n non-crossing line segments.
Output: The trapezoidal map T(S) and an associated DAG D.

  1. Compute the bounding box R, a rectangle large enough to enclose all the segments in S, and compute T and D for R. D will be a single node, which is both the root and a leaf.

  2. Compute a random permutation (s1, s2, . . . , sn) of the elements of S.

  3. Repeat the following steps for values of i from 1 to n:
     (a) Find the sequence (∆0, ∆1, . . . , ∆k) of trapezoids in the current T that are intersected by si.
     (b) Remove them from T and replace them with the new trapezoids that appear because of si.
     (c) Remove the leaves in D corresponding to the trapezoids in (∆0, ∆1, . . . , ∆k) and create leaves for the new trapezoids.
     (d) Link the new leaves to D using some inner nodes.
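Steps 1 and 2 are simple enough to sketch directly. The function names and the `margin` and `seed` parameters below are assumptions made for illustration; step 3 is the substantial part and is not attempted here.

```python
import random


def bounding_box(segments, margin=1.0):
    """Axis-parallel rectangle (xmin, ymin, xmax, ymax) enclosing all segments,
    enlarged by `margin` so that no endpoint lies on its boundary."""
    xs = [x for seg in segments for (x, _) in seg]
    ys = [y for seg in segments for (_, y) in seg]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def random_insertion_order(segments, seed=None):
    """A uniformly random permutation of the segments (the input is not modified)."""
    order = list(segments)
    random.Random(seed).shuffle(order)
    return order
```

Passing a fixed `seed` makes a run reproducible, which helps when debugging; the expected-case analysis assumes the permutation is drawn uniformly at random.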


slide-14
SLIDE 14

Invariant: After the ith step, we have T and D for Si = {s1, s2, . . . , si}.

The initialization step, in which we create T and D for R, is easy, so we focus on the iterative steps. At each iteration i, we need to find the trapezoids intersected by the newly inserted segment si.

[Figure: segment si crossing the trapezoids ∆0, ∆1, ∆2, ∆3.]

For this, we first locate the trapezoid ∆0 that contains the left endpoint of si, and then follow si from left to right (with some care) to enumerate the trapezoids intersected by si. The details are in the pseudocode below.

slide-15
SLIDE 15

Algorithm FOLLOWSEGMENT(T, D, si)

  • Input. A trapezoidal map T, a search structure D for T, and a new segment si.
  • Output. The sequence ∆0, . . . , ∆k of trapezoids intersected by si.

Let p and q be the left and right endpoints of si.
Search with p in the search structure D to find ∆0.
j ← 0
while q lies to the right of rightp(∆j)
    do if rightp(∆j) lies above si
           then let ∆j+1 be the lower right neighbour of ∆j
           else let ∆j+1 be the upper right neighbour of ∆j
       j ← j + 1
return ∆0, ∆1, . . . , ∆j

The leaf in D associated with each trapezoid intersected by si is then deleted and each new trapezoid created as a consequence of inserting si is added as a leaf to D. Furthermore, each newly added leaf has to be connected to the rest of D via some internal nodes. To see how this is done, we see a couple of examples.
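FOLLOWSEGMENT translates almost line by line into Python. In this sketch the search in D is abstracted as a `locate` callback, and `Trap` is a pared-down, hypothetical trapezoid record holding only the fields the walk needs; neither name comes from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(eq=False)
class Trap:
    name: str
    rightp: tuple
    upper_right: Optional["Trap"] = None
    lower_right: Optional["Trap"] = None


def lies_above(point, seg):
    """True if point lies strictly above the line through seg."""
    (x1, y1), (x2, y2) = seg
    return (x2 - x1) * (point[1] - y1) - (y2 - y1) * (point[0] - x1) > 0


def follow_segment(locate: Callable[[tuple], Trap], seg):
    """Return the trapezoids crossed by seg, from left to right."""
    p, q = seg                      # left and right endpoints
    current = locate(p)             # search with p in D to find the first trapezoid
    crossed = [current]
    while q[0] > current.rightp[0]:
        if lies_above(current.rightp, seg):
            current = current.lower_right
        else:
            current = current.upper_right
        crossed.append(current)
    return crossed
```

The walk costs O(1) per trapezoid crossed because each step only follows a neighbour pointer. The "some care" mentioned earlier (for example, when p coincides with an existing endpoint) is not handled in this sketch.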


slide-16
SLIDE 16

Consider first the example where si is fully contained in some trapezoid ∆ in T.

[Figure: left, T(Si−1) with the trapezoid ∆; right, T(Si) after inserting si with endpoints pi and qi, which splits ∆ into trapezoids A, B, C, and D.]

This creates four new trapezoids while eliminating one, namely the trapezoid ∆ in which si falls.

The leaf in D corresponding to the eliminated trapezoid ∆ is deleted from D, and 4 new leaves are added (corresponding to the trapezoids A, B, C, and D shown above). See below for how these four leaves are connected in place of the leaf corresponding to ∆.

slide-17
SLIDE 17

[Figure: the leaf ∆ of D(Si−1) is replaced in D(Si) by a small subtree with x-nodes pi and qi, a y-node si, and leaves A, B, C, and D.]

Now we turn our attention to the case where the line segment si spans several trapezoids. Consider the example shown below.

[Figure: left, T(Si−1) with the trapezoids ∆0, ∆1, ∆2, ∆3 crossed by si (endpoints pi, qi); right, T(Si) with the new trapezoids A, B, C, D, E, and F.]

Trapezoids ∆0, ∆1, ∆2, and ∆3 are pierced by si. Hence, they are eliminated and new trapezoids A, B, C, D, E, and F are created. The manner in which D is updated is shown below.

slide-18
SLIDE 18

[Figure: the leaves ∆0, ∆1, ∆2, ∆3 of D(Si−1) are replaced in D(Si) by an x-node qi, four y-nodes si, and leaves A, B, C, D, E, and F.]

Notice that the leaves corresponding to ∆0, ∆1, ∆2, and ∆3 are deleted, and leaves corresponding to the new trapezoids A, B, C, D, E, and F are created and attached by way of inner nodes designed specifically to navigate T when a query point falls in one of the new trapezoids.

slide-19
SLIDE 19

Analysis

Theorem 1. The algorithm described above is correct when given a set S of n line segments in general position. The expected size of the search structure is O(n), and it can be computed in expected O(n log n) time. Each query takes O(log n) time in expectation.

Proof. The correctness follows from the invariant that when the first i line segments Si have been processed, we have a data structure that can correctly answer point location queries on the planar subdivision induced by Si.

To analyse the performance bounds, we focus on the expected behaviour when the data structure is built using one of the n! equally likely permutations of S. Let (s1, s2, . . . , si, . . . , sn) be the sequence of line segments processed in creating T and D. For 1 ≤ i ≤ n, we use Si to denote {s1, s2, . . . , si}.

slide-20
SLIDE 20

Query Time Analysis. Fix a query point q and bound the length of the path traversed in D. In every iteration of the construction algorithm, the path length in D increases by at most 3. ∴ 3n is the worst case, but we want the expected behaviour. Focus on the path Pq traversed in D for the query q. Define, for 1 ≤ i ≤ n,

Xi = # of nodes in Pq created when si was added.

Notice that our expected query time is proportional to $E[\sum_{i=1}^{n} X_i]$. Why? By linearity of expectation,

$E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i]$.

Let Pi be the probability that some node in Pq was created when si was added. Then, E[Xi] ≤ 3Pi.

slide-21
SLIDE 21

Let ∆q(Si), for 1 ≤ i ≤ n, be the trapezoid in T(Si) that contains q. Notice that

Pi = Pr[∆q(Si) ≠ ∆q(Si−1)].

Why? All trapezoids created in iteration i are adjacent to si, and ∆q(Si) is the same regardless of the sequence in which the first i segments were added.

To bound Pi, we need the probability that the trapezoid containing q was created in iteration i. How? Mnemonic: Creation is hard, but destruction is easy! So, to bound Pi, we seek the probability that removing si from Si destroys ∆q(Si). What technique is this? (Backward analysis.)

slide-22
SLIDE 22

Since each permutation of Si is equally likely, any one of the line segments in Si is equally likely to be si. ∆q(Si) disappears if and only if one of top(∆q(Si)), bottom(∆q(Si)), leftp(∆q(Si)), or rightp(∆q(Si)) disappears with the removal of si. ∆q(Si) has a uniquely defined top and bottom; therefore, each of them disappears with probability at most 1/i. Now let's focus on leftp(∆q(Si)). Consider the following cases.

[Figure: cases (a), (b), (c), and (d) for the left side of ∆, each labelled with top(∆), bottom(∆), and leftp(∆); case (d) also shows a third segment s.]

In cases (a), (b), and (c), if the leftp disappeared, then so did either the top or the bottom. So we only need to consider case (d). Here, leftp is the right endpoint of a third segment s, and it disappears only if si = s. Therefore, we can conclude that this case happens with probability at most 1/i.

slide-23
SLIDE 23

A symmetric argument works for rightp as well. Therefore,

Pi = Pr[∆q(Si) ≠ ∆q(Si−1)] = Pr[∆q(Si) ∉ T(Si−1)] ≤ 4/i.

Putting it all together,

$E[\sum_{i=1}^{n} X_i] \le \sum_{i=1}^{n} 3P_i \le \sum_{i=1}^{n} \frac{12}{i} = 12 \sum_{i=1}^{n} \frac{1}{i} = 12 H_n$,

where Hn is the nth harmonic number, which we know to be O(log n), thereby establishing that the query time is O(log n) in expectation.
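To get a feel for the bound 12·Hn, it can be evaluated exactly for small n and compared with the worst case 3n. The helper names below are hypothetical, and the comparison uses the standard fact that Hn ≤ ln n + 1.

```python
import math
from fractions import Fraction


def harmonic(n):
    """The nth harmonic number H_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def expected_path_bound(n):
    """The bound 12 * H_n on the expected number of nodes on a query path."""
    return 12 * harmonic(n)


def worst_case_path_bound(n):
    """The worst-case bound 3n on the query path length."""
    return 3 * n
```

For example, `harmonic(4)` is exactly 25/12, so `expected_path_bound(4)` equals 25. The constant 12 makes the expected bound exceed 3n for tiny inputs, but it grows only logarithmically, so it is far smaller than 3n once n is in the thousands.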


slide-24
SLIDE 24

Space required. The complexity of T(S) is O(n) (cf. Lemma 2), so we focus on the size of D. The number of leaves is O(n), so we need to bound the number of inner nodes. I.e., the size of D is

$O(n) + \sum_{i=1}^{n} (\text{number of inner nodes created in iteration } i)$.

Let ki be the # of new trapezoids created in the ith iteration (due to the insertion of si). The number of new inner nodes created is exactly ki − 1. Why? Hence, the size of D is

$O(n) + \sum_{i=1}^{n} (k_i - 1) \le O(n) + \sum_{i=1}^{n} k_i$.

However, we are interested in the expected size.

slide-25
SLIDE 25

Applying linearity of expectation, the expected size of D is

$O(n) + E[\sum_{i=1}^{n} k_i] = O(n) + \sum_{i=1}^{n} E[k_i]$.

Again, we use backward analysis. Fix Si ⊆ S. For a trapezoid ∆ ∈ T(Si) and a segment s ∈ Si, let δ(∆, s) = 1 if ∆ disappears from T(Si) when s is removed, and 0 otherwise. Note that

$\sum_{s \in S_i} \sum_{\Delta \in T(S_i)} \delta(\Delta, s) \le 4|T(S_i)| = O(i)$,

because each ∆ can be destroyed by the removal of at most 4 segments, and |T(Si)| ≤ 3i + 1 by Lemma 2. Since si is equally likely to be any one of the i segments in Si,

$E[k_i] = \frac{1}{i} \sum_{s \in S_i} \sum_{\Delta \in T(S_i)} \delta(\Delta, s) \le \frac{O(i)}{i} = O(1)$.

slide-26
SLIDE 26

Therefore, the expected total space occupied by D is O(n).

Time required. Note that the time to insert a segment si is O(ki) plus the time to locate its left endpoint in D, which takes expected O(log i) time. Therefore, the expected total time required is

$\sum_{i=1}^{n} \{O(\log i) + O(E[k_i])\} = O(n \log n)$.

This completes the proof.