Perspective click-and-drag area selections in pictures
Frank NIELSEN, www.informationgeometry.org
Sony Computer Science Laboratories, Inc.
Machine Vision Applications (MVA), 21st May 2013
© 2013 Frank Nielsen

Traditional click-and-drag rectangular selection
→ Fails for selecting parts in photos: cannot capture “New” without part of “Court”.
Man-made environments: many perspectively slanted planar parts.

Perspective click’n’drag
Intelligent UI (= computer vision + human-computer interface)
→ Image “parsing” of perspective rectangles (automatic/semi-automatic/manual)

Video demonstrations
Perspective click-and-drag + perspective copy/paste/swap

Perspective click’n’drag: Outline
1. Preprocessing: Detect & structure perspective parts
   1.1 Quad detector:
       ◮ Image segmentation
       ◮ Outer contour quad fitting
       ◮ Quad recognition
   1.2 Quad homography tree
2. Interactive user interface: Perspective quad selection based on click-and-drag UI (= diagonal selection)
3. Application example: Interactive image editing (swap)

Preprocessing workflow

Quad detection: Sobel/Hough transform
How to detect convex quads in images?
Indoor robotics [6]: using vanishing points.
Limitations of the Hough transform [8] on the Sobel image: combinatorial line arrangement, $O(n^4)$ candidate quads for $n$ detected lines...
→ Good for a limited number of detected lines (blackboard detection [8], name card detection, etc.)

Quad detection: Image segmentation (SRM)
→ Fast Statistical Region Merging (SRM) [4]
Source code available in Java™, Matlab®, Python®, C, etc.

Quad detector
◮ For each segmented region, consider its exterior contour $C$ (a polygon).
◮ Compute the contour diameter $P_1P_3$.
◮ Compute the uppermost and bottommost extremal points, $P_2$ and $P_4$.
◮ Calculate the symmetric Hausdorff distance between the quad $Q = (P_1, P_2, P_3, P_4)$ and the contour $C$.
◮ Accept the region as a quad when the distance falls below a prescribed threshold.
All quads are convex and clockwise oriented.

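The acceptance test above can be sketched as follows. This is a minimal illustration, not the author's implementation: the class `HausdorffSketch`, its method names, and the `perEdge` sampling density are hypothetical, and the Hausdorff distance is only approximated by sampling points along the polygon edges.

```java
class HausdorffSketch {
    // Euclidean distance from point (px, py) to the segment (ax, ay)-(bx, by).
    static double pointToSegment(double px, double py,
                                 double ax, double ay, double bx, double by) {
        double dx = bx - ax, dy = by - ay;
        double len2 = dx * dx + dy * dy;
        double t = (len2 == 0.0) ? 0.0 : ((px - ax) * dx + (py - ay) * dy) / len2;
        t = Math.max(0.0, Math.min(1.0, t)); // clamp the projection onto the segment
        return Math.hypot(px - (ax + t * dx), py - (ay + t * dy));
    }

    // Distance from a point to the boundary of a closed polygon given as {x, y} vertices.
    static double pointToBoundary(double px, double py, double[][] poly) {
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < poly.length; i++) {
            double[] a = poly[i], b = poly[(i + 1) % poly.length];
            best = Math.min(best, pointToSegment(px, py, a[0], a[1], b[0], b[1]));
        }
        return best;
    }

    // Directed distance (approximation): largest distance from sampled boundary
    // points of `from` to the boundary of `to`; each edge gets `perEdge` samples.
    static double directed(double[][] from, double[][] to, int perEdge) {
        double worst = 0.0;
        for (int i = 0; i < from.length; i++) {
            double[] a = from[i], b = from[(i + 1) % from.length];
            for (int k = 0; k < perEdge; k++) {
                double t = (double) k / perEdge;
                double x = a[0] + t * (b[0] - a[0]);
                double y = a[1] + t * (b[1] - a[1]);
                worst = Math.max(worst, pointToBoundary(x, y, to));
            }
        }
        return worst;
    }

    // Symmetric distance: the larger of the two directed distances.
    static double symmetricHausdorff(double[][] quad, double[][] contour, int perEdge) {
        return Math.max(directed(quad, contour, perEdge),
                        directed(contour, quad, perEdge));
    }

    // Hypothetical acceptance rule: the threshold is application dependent.
    static boolean acceptAsQuad(double[][] quad, double[][] contour, double threshold) {
        return symmetricHausdorff(quad, contour, 16) < threshold;
    }
}
```

For a pixel-level contour the vertices are already dense, so a small `perEdge` is likely enough; the threshold would typically be chosen relative to the region size.
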
Quad detection: Image segmentation (SRM)
... or any image segmentation producing closed contours,
→ run at different scales (e.g., parameter Q in SRM).
Alternatively, one can also use mean-shift [9], normalized cuts [7], etc.
Why? To increase the chance that quads are detected for some parameter setting.
→ We end up with a quad soup.

Multi-segmentation
Increases the chance of recognizing quads, but yields a quad soup.
[Figure panels: Q = 128, Q = 10, Q = 0.3, Q = 0.25]

Nested convex quad hierarchy
◮ From a quad soup, sort the quads in decreasing order of their area in a priority queue.
◮ Add the image boundary quad $Q_0$ as the root of the quad tree $\mathcal{Q}$.
◮ Greedy selection: add a quad from the queue if and only if it is fully contained in another quad of $\mathcal{Q}$.
◮ When adding a quad $Q_i$, compute the homographies [2] $H_i$ and $H_i^{-1}$ mapping the quad to the unit square and back.

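The greedy construction above might look like the sketch below. The class `QuadHierarchySketch` and its members are hypothetical names. It assumes that each accepted quad is attached to the deepest quad that fully contains it, which the slide does not state explicitly, and it leaves out the homography computation and any handling of partially overlapping siblings.

```java
import java.util.ArrayList;
import java.util.List;

class QuadHierarchySketch {
    static class Quad {
        final double[] x, y;                       // four corners of a convex quad
        final List<Quad> children = new ArrayList<>();

        Quad(double[] x, double[] y) { this.x = x; this.y = y; }

        // Shoelace formula; the absolute value makes it orientation independent.
        double area() {
            double s = 0.0;
            for (int i = 0; i < 4; i++) {
                int j = (i + 1) % 4;
                s += x[i] * y[j] - x[j] * y[i];
            }
            return 0.5 * Math.abs(s);
        }

        // A point is inside a convex quad when it lies on the same side of all edges.
        boolean contains(double px, double py) {
            boolean pos = false, neg = false;
            for (int i = 0; i < 4; i++) {
                int j = (i + 1) % 4;
                double c = (x[j] - x[i]) * (py - y[i]) - (y[j] - y[i]) * (px - x[i]);
                if (c > 0) pos = true;
                if (c < 0) neg = true;
            }
            return !(pos && neg);
        }

        // For convex quads, containing all four corners implies containing the quad.
        boolean containsQuad(Quad q) {
            for (int i = 0; i < 4; i++) {
                if (!contains(q.x[i], q.y[i])) return false;
            }
            return true;
        }
    }

    // Deepest node of the tree rooted at `node` that fully contains q, or null.
    static Quad deepestContaining(Quad node, Quad q) {
        if (!node.containsQuad(q)) return null;
        for (Quad child : node.children) {
            Quad hit = deepestContaining(child, q);
            if (hit != null) return hit;
        }
        return node;
    }

    // Insert quads by decreasing area; quads not contained in the image are skipped.
    static Quad build(Quad imageBoundary, List<Quad> soup) {
        List<Quad> sorted = new ArrayList<>(soup);
        sorted.sort((a, b) -> Double.compare(b.area(), a.area()));
        for (Quad q : sorted) {
            Quad parent = deepestContaining(imageBoundary, q);
            if (parent != null) parent.children.add(q);
        }
        return imageBoundary;
    }
}
```

Processing quads by decreasing area means a parent is always in the tree before any quad nested inside it.
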
Do not explicitly unwarp perspective rectangles
Many existing systems first unwarp...
[Figure panels: source, segmented, unwarped]
Mobile cell phone signage recognition [5], AR systems, etc.

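Perspective click’n’drag: User interaction
Perspective sub-rectangle selection: click on a corner $p_1$ and drag the opposite corner $p_3$.
Find the deepest quad $Q$ in the quad hierarchy $\mathcal{Q}$ that contains both points $p_1$ and $p_3$. With $H$ the homography mapping $Q$ to the unit square:
◮ Map the diagonal to the unit square: $\tilde{p}'_1 = H\tilde{p}_1$ and $\tilde{p}'_3 = H\tilde{p}_3$.
◮ Regular dragging in the unit square: complete the axis-aligned rectangle with the corners $\tilde{p}'_2$ and $\tilde{p}'_4$, each combining one coordinate of $p'_1$ with one of $p'_3$ (e.g., $\tilde{p}'_2 = (x'_3, y'_1, 1)^\top$ and $\tilde{p}'_4 = (x'_1, y'_3, 1)^\top$).
◮ Perspective dragging in the image: map back with $H^{-1}$, that is, $p_2 \leftarrow H^{-1}\tilde{p}'_2$ and $p_4 \leftarrow H^{-1}\tilde{p}'_4$.
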
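A minimal sketch of this selection step follows, assuming the homography `h` of the enclosing quad to the unit square is already known. The class `PerspectiveDragSketch` and its methods are hypothetical, and assigning $(x'_3, y'_1)$ to $p_2$ and $(x'_1, y'_3)$ to $p_4$ is an assumption about corner labeling.

```java
class PerspectiveDragSketch {
    // Apply a 3x3 homography to an inhomogeneous point (x, y), then dehomogenize.
    static double[] apply(double[][] h, double x, double y) {
        double xs = h[0][0] * x + h[0][1] * y + h[0][2];
        double ys = h[1][0] * x + h[1][1] * y + h[1][2];
        double w  = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] { xs / w, ys / w };
    }

    // Inverse of a 3x3 matrix via the adjugate (assumes a non-zero determinant).
    static double[][] invert(double[][] m) {
        double a = m[0][0], b = m[0][1], c = m[0][2];
        double d = m[1][0], e = m[1][1], f = m[1][2];
        double g = m[2][0], h = m[2][1], i = m[2][2];
        double det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
        return new double[][] {
            { (e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det },
            { (f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det },
            { (d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det }
        };
    }

    // Given the clicked corner p1 and the dragged corner p3 (image coordinates),
    // return the four corners {p1, p2, p3, p4} of the perspective rectangle.
    static double[][] select(double[][] h, double[] p1, double[] p3) {
        double[][] hInv = invert(h);
        double[] q1 = apply(h, p1[0], p1[1]);      // p1 in the unit square
        double[] q3 = apply(h, p3[0], p3[1]);      // p3 in the unit square
        double[] p2 = apply(hInv, q3[0], q1[1]);   // axis-aligned corner, mapped back
        double[] p4 = apply(hInv, q1[0], q3[1]);   // axis-aligned corner, mapped back
        return new double[][] { p1, p2, p3, p4 };
    }
}
```

With the identity homography the result reduces to the traditional axis-aligned rectangle, which is a convenient sanity check.
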
Some examples of perspective click-and-drag selections
Regular vs. perspective rectangle UI selection

Implementation details: Primitives on convex quads
By convention, order quads clockwise. Positive determinant for the two quad-induced triangles:
$$\det = \begin{vmatrix} x_1 - x_3 & x_2 - x_3 \\ y_1 - y_3 & y_2 - y_3 \end{vmatrix}$$
◮ Predicate $p \in Q = (p_1, p_2, p_3, p_4)$? Two triangle queries: $p \in (p_1, p_2, p_3)$ and $p \in (p_3, p_4, p_1)$; $p$ lies in $Q$ when either query succeeds.
◮ Area of a quad: sum of the areas of the two quad triangles, each being one half of the absolute value of its determinant.

In class Quadrangle

    // Area of the triangle (p1, p2, p3): half of the absolute determinant
    double area(Feature p1, Feature p2, Feature p3) {
        double res = (p1.x - p3.x) * (p2.y - p1.y) - (p1.x - p2.x) * (p3.y - p1.y);
        return 0.5 * Math.abs(res);
    }

    // Area of the quad: sum of the areas of its two triangles
    double area() {
        return area(p1, p2, p3) + area(p1, p3, p4);
    }

    // Clockwise or aligned order predicate
    boolean CW(Feature a, Feature b, Feature c) {
        double det = (a.x - c.x) * (b.y - c.y) - (b.x - c.x) * (a.y - c.y);
        return det >= 0.0;
    }

    // Determine if a pixel falls inside the quadrangle or not
    boolean inside(int x, int y) {
        Feature p = new Feature(x, y, 1.0);
        return CW(p1, p2, p) && CW(p2, p3, p) && CW(p3, p4, p) && CW(p4, p1, p);
    }

Homography estimation
Projective geometry, homogeneous and inhomogeneous coordinates.
$$\tilde{p}'_i = \begin{pmatrix} \tilde{x}'_i \\ \tilde{y}'_i \\ w'_i \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ w_i \end{pmatrix} = H\tilde{p}_i$$
Inhomogeneous coordinates:
$$x'_i = \frac{\tilde{x}'_i}{w'_i} = \frac{h_{11}x_i + h_{12}y_i + h_{13}w_i}{h_{31}x_i + h_{32}y_i + h_{33}w_i}, \qquad y'_i = \frac{\tilde{y}'_i}{w'_i} = \frac{h_{21}x_i + h_{22}y_i + h_{23}w_i}{h_{31}x_i + h_{32}y_i + h_{33}w_i}.$$
$A_i$ block matrix (with $w_i = 1$):
$$x'_i (h_{31}x_i + h_{32}y_i + h_{33}) = h_{11}x_i + h_{12}y_i + h_{13},$$
$$y'_i (h_{31}x_i + h_{32}y_i + h_{33}) = h_{21}x_i + h_{22}y_i + h_{23}.$$
Solve for $A_i h = 0$.

Homography estimation using inhomogeneous system
Assume $h_{33} \neq 0$ (and set $h_{33} = 1$).
$$\begin{pmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ x'_3 \\ y'_3 \\ x'_4 \\ y'_4 \end{pmatrix} = \begin{pmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1x'_1 & -y_1x'_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1y'_1 & -y_1y'_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2x'_2 & -y_2x'_2 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -x_2y'_2 & -y_2y'_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3x'_3 & -y_3x'_3 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -x_3y'_3 & -y_3y'_3 \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4x'_4 & -y_4x'_4 \\
0 & 0 & 0 & x_4 & y_4 & 1 & -x_4y'_4 & -y_4y'_4
\end{pmatrix} \times \underbrace{\begin{pmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{pmatrix}}_{h'}$$
Linear system written as $Bh' = b$. For four pairs, $h' = B^{-1}b$.

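The $8 \times 8$ system can be assembled and solved directly, as in the sketch below. The class `FourPointHomographySketch` and its methods are hypothetical. Gaussian elimination with partial pivoting is used here in place of an explicit inverse $B^{-1}$, and the sketch assumes $h_{33} \neq 0$ and four point pairs in general position.

```java
class FourPointHomographySketch {
    // Solve m x = r by Gaussian elimination with partial pivoting (m is n x n).
    static double[] solve(double[][] m, double[] r) {
        int n = r.length;
        double[][] a = new double[n][n + 1];       // augmented matrix [m | r]
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], 0, a[i], 0, n);
            a[i][n] = r[i];
        }
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(a[row][col]) > Math.abs(a[piv][col])) piv = row;
            }
            if (Math.abs(a[piv][col]) < 1e-12) {
                throw new IllegalArgumentException("degenerate point configuration");
            }
            double[] tmp = a[col]; a[col] = a[piv]; a[piv] = tmp;
            for (int row = col + 1; row < n; row++) {
                double f = a[row][col] / a[col][col];
                for (int k = col; k <= n; k++) a[row][k] -= f * a[col][k];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {         // back substitution
            double s = a[i][n];
            for (int k = i + 1; k < n; k++) s -= a[i][k] * x[k];
            x[i] = s / a[i][i];
        }
        return x;
    }

    // src[i] = {x_i, y_i} maps to dst[i] = {x'_i, y'_i}, for i = 0..3.
    static double[][] fromFourPairs(double[][] src, double[][] dst) {
        double[][] bigB = new double[8][];
        double[] b = new double[8];
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1], xp = dst[i][0], yp = dst[i][1];
            bigB[2 * i]     = new double[] { x, y, 1, 0, 0, 0, -x * xp, -y * xp };
            bigB[2 * i + 1] = new double[] { 0, 0, 0, x, y, 1, -x * yp, -y * yp };
            b[2 * i] = xp;
            b[2 * i + 1] = yp;
        }
        double[] h = solve(bigB, b);               // h' = (h11, ..., h32)
        return new double[][] {
            { h[0], h[1], h[2] }, { h[3], h[4], h[5] }, { h[6], h[7], 1.0 }
        };
    }
}
```

Passing the four quad corners as `src` and the unit-square corners as `dst` would give a quad-to-unit-square homography of the kind stored in the quad tree.
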
Homography estimation using the normalized DLT
With $A$ the matrix stacking the blocks $A_i$, compute its singular value decomposition:
$$A = UDV^\top = \sum_{i=1}^{9} \lambda_i u_i v_i^\top.$$
The solution is the right singular vector of $A$ corresponding to the smallest singular value (the last column vector $v_9$ of $V$).
When $\lambda_9 = 0$, the system is exactly determined. When $\lambda_9 > 0$, the system is over-determined and $\lambda_9$ is an indicator of the goodness of fit of the solution $h = v_9$.
In practice, this estimation procedure is highly unstable numerically [2]. Points need to be first normalized so that their centroid defines the origin, and the diameter is set to $\sqrt{2}$.

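The normalization step can be sketched as below. The class `PointNormalizationSketch` is hypothetical. The sketch scales the points so that their mean distance to the centroid is $\sqrt{2}$, a commonly used convention; reading the slide's "diameter" this way is an assumption.

```java
class PointNormalizationSketch {
    // Returns the 3x3 similarity T such that T * (x, y, 1)^T is the normalized point:
    // centroid moved to the origin, mean distance to the origin scaled to sqrt(2).
    static double[][] normalizingTransform(double[][] pts) {
        double cx = 0.0, cy = 0.0;
        for (double[] p : pts) { cx += p[0]; cy += p[1]; }
        cx /= pts.length;
        cy /= pts.length;
        double meanDist = 0.0;
        for (double[] p : pts) meanDist += Math.hypot(p[0] - cx, p[1] - cy);
        meanDist /= pts.length;
        double s = Math.sqrt(2.0) / meanDist;  // assumes the points are not all identical
        return new double[][] { { s, 0, -s * cx }, { 0, s, -s * cy }, { 0, 0, 1 } };
    }

    // Apply the similarity T (as built above) to every point.
    static double[][] normalize(double[][] pts, double[][] t) {
        double[][] out = new double[pts.length][2];
        for (int i = 0; i < pts.length; i++) {
            out[i][0] = t[0][0] * pts[i][0] + t[0][2];
            out[i][1] = t[1][1] * pts[i][1] + t[1][2];
        }
        return out;
    }
}
```

If $T$ and $T'$ normalize the source and destination points and $\hat{H}$ is estimated from the normalized pairs, the homography in the original coordinates is $H = T'^{-1}\hat{H}T$.
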
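Image editing: Selection swaps
$H_{12}$ from $Q_1$ to $Q_2$ by composition. With each $H_i$ mapping $Q_i$ to the unit square and points transformed as $\tilde{p}' = H\tilde{p}$, apply $H_1$ first and then $H_2^{-1}$:
$$H_{12} = H_2^{-1}H_1, \qquad H_{21} = H_{12}^{-1} = H_1^{-1}H_2.$$
→ Backward pixel mapping [3] (avoids holes)
◮ Forward mapping: SRC → DEST ($H$)
◮ Backward mapping: DEST → SRC ($H^{-1}$)
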
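Backward mapping can be sketched as follows: every destination pixel looks up its source pixel, so no destination pixel is left unfilled. The class `BackwardWarpSketch`, the `[row][column]` image layout, and nearest-neighbor sampling are assumptions made for illustration; restricting the loop to the pixels inside the destination quad is omitted.

```java
class BackwardWarpSketch {
    // Fill `dst` by backward mapping. `hDstToSrc` is the 3x3 homography taking
    // destination pixel coordinates (x, y) to source coordinates, i.e. the inverse
    // of the forward SRC -> DEST map. Images are indexed as [row][column] = [y][x].
    static void warp(int[][] src, int[][] dst, double[][] hDstToSrc) {
        double[][] h = hDstToSrc;
        for (int y = 0; y < dst.length; y++) {
            for (int x = 0; x < dst[y].length; x++) {
                double w  = h[2][0] * x + h[2][1] * y + h[2][2];
                double sx = (h[0][0] * x + h[0][1] * y + h[0][2]) / w;
                double sy = (h[1][0] * x + h[1][1] * y + h[1][2]) / w;
                if (!Double.isFinite(sx) || !Double.isFinite(sy)) continue;
                int ix = (int) Math.round(sx);     // nearest-neighbor sampling
                int iy = (int) Math.round(sy);
                if (iy >= 0 && iy < src.length && ix >= 0 && ix < src[iy].length) {
                    dst[y][x] = src[iy][ix];       // pixels mapping outside stay unchanged
                }
            }
        }
    }
}
```

For a swap between two quads, the same routine would be run twice on copies of the image, once with the destination-to-source homography of each direction.
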
Perspective Click-and-Drag UI: Conclusion
◮ Simple UI system relying on computer vision.
◮ Extend to other input formats: stereo pairs, RGBZ images, etc.
◮ Implemented using processing.org (2500+ lines)
Ongoing work:
◮ Rely on efficient quad detection: extensive benchmarking (BSDS500, Corel, ImageNet, etc. databases)
◮ Extend to various perspectively slanted shapes (like ball → ellipsoids, etc.)
◮ Robust multiple quad-to-square homography estimations [1]?
www.informationgeometry.org