Department of Computer Science IV, University of Mannheim, Germany


  1. Stephan Kopf, Department of Computer Science IV, University of Mannheim, Germany

  2. • Motivation
     • Part I: Basic Retargeting Operations
       ◦ Scaling and cropping
       ◦ Regions of interest
       ◦ Automatic crop & scale
       ◦ Sports video adaptation
     • Part II: Seam Carving
       ◦ Seam carving for images
       ◦ Preservation of straight lines
       ◦ Fast seam carving for videos
     • Summary
     Stephan Kopf, 15.02.2011

  3. • Mobile phones are multimedia devices that can
       ◦ browse the Web
       ◦ display images and videos
       ◦ support novel input technologies (multi-touch)
     • But they still have limitations:
       ◦ Small screen size
       ◦ Wireless connection (bandwidth)
       ◦ Computational power (CPU, memory)
       ◦ Battery

  4. • Typical resolutions of images and videos
       ◦ Digital camera: 10 megapixels (3,600 x 2,700 pixels)
       ◦ Camcorder: high definition (1,920 x 1,080 pixels)
       ◦ Mobile phone (240 x 320 pixels)
     [Figure labels: HD video; mobile phone]
     • Bitrate: 24 Mbit/s
     • Distortions caused by scaling (aspect ratio)

  5. Goals of media retargeting
     • Shrink photos and videos for presentation on a mobile phone (this automatically limits the bitrate)
     • Keep the aspect ratio
     • Preserve the most important visual content
     • Algorithms for image and video retargeting

  7. • Shrink the image (merge pixels) by a fixed scale factor (uniform scaling)
     • Different scale factors for each axis change the aspect ratio (non-uniform scaling)
     • Relevance of image content is ignored
     • "Letterboxing" is used to preserve the aspect ratio
     • Example: [figure]
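
The letterboxing idea on this slide can be sketched as follows. This is a minimal illustration, not code from the talk: the function name `letterbox` and its return layout are assumptions, and it only computes sizes (it does not resample pixels).

```python
def letterbox(src_w, src_h, dst_w, dst_h):
    """Return (scaled_w, scaled_h, pad_x, pad_y) for uniform scaling with bars.

    Hypothetical helper: one scale factor is used for both axes, so the
    aspect ratio of the source is preserved.
    """
    # The smaller of the two per-axis factors guarantees the image fits.
    scale = min(dst_w / src_w, dst_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    # The unused display area becomes bars, split evenly on both sides.
    pad_x = (dst_w - scaled_w) // 2
    pad_y = (dst_h - scaled_h) // 2
    return scaled_w, scaled_h, pad_x, pad_y


# HD camcorder frame shown on a 240 x 320 phone display.
print(letterbox(1920, 1080, 240, 320))  # (240, 135, 0, 92)
```

With these numbers only 135 of the 320 display rows carry image content, which illustrates why plain scaling wastes much of a small screen.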

  8. • Crop image borders until the aspect ratios of image and display match
     • Relevance of image content is ignored: important content may be lost
     • Typically, scaling is then used to convert to the target size
     • Example: [figure]
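
A possible sketch of content-blind cropping: compute the largest centered window whose aspect ratio matches the display. The name `center_crop_box` and the choice of a centered window are assumptions for illustration; the slide does not say which borders are cropped.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, width, height) of the largest centered crop
    whose aspect ratio matches the display (hypothetical helper)."""
    # Compare aspect ratios by cross-multiplication to avoid rounding errors.
    if src_w * dst_h > src_h * dst_w:
        # Source is too wide: trim the left and right borders.
        crop_w = src_h * dst_w // dst_h
        crop_h = src_h
    else:
        # Source is too tall (or already matches): trim top and bottom.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, crop_w, crop_h


print(center_crop_box(1920, 1080, 240, 320))  # (555, 0, 810, 1080)
```

The resulting 810 x 1080 window has the same 3:4 ratio as the display and would then be scaled down uniformly.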

  9. Idea
     • Identify the most relevant image regions (regions of interest)
     • Crop borders but preserve regions of interest
     • Use automatic algorithms to identify regions of interest:
       ◦ Saliency maps
       ◦ Faces
       ◦ Text regions

  10. • Assumption: image regions that are relevant for an observer have a high contrast
      • Step 1: Contrast map of an image of size n × m:
        ◦ color of a pixel: $p_{i,j}$
        ◦ pixel in the local neighborhood of $p_{i,j}$:
        ◦ distance function: $d(\cdot)$
      • Step 2: Quantize the contrast map
      • Step 3: Find connected regions
      • Step 4: Mark the region of interest
      *Source: Ma and Zhang HJ: Contrast-based image attention analysis by using fuzzy growing, ACM Intl. Conf. on Multimedia, 2003
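
The four steps can be sketched roughly as below. Several choices here are assumptions, since the slide's formula did not survive extraction: the contrast of a pixel is taken as the sum of distances to its 3 x 3 neighbors, d is the absolute gray-value difference, quantization is a single threshold, and the largest 4-connected region is marked. The cited paper uses fuzzy growing instead of a plain threshold, so this is a simplified stand-in rather than the authors' method.

```python
def contrast_map(img):
    """Step 1: sum of |p - q| over the 3x3 neighborhood of each pixel
    (assumed neighborhood and distance function)."""
    n, m = len(img), len(img[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < n and 0 <= x < m:
                        out[i][j] += abs(img[i][j] - img[y][x])
    return out


def largest_region_bbox(contrast, threshold):
    """Steps 2-4: threshold the map, flood-fill 4-connected regions and
    return the bounding box (top, left, bottom, right) of the largest one,
    or None if no pixel reaches the threshold."""
    n, m = len(contrast), len(contrast[0])
    seen = [[False] * m for _ in range(n)]
    best, best_size = None, 0
    for i in range(n):
        for j in range(m):
            if seen[i][j] or contrast[i][j] < threshold:
                continue
            stack, region = [(i, j)], []
            seen[i][j] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n and 0 <= nx < m and not seen[ny][nx]
                            and contrast[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(region) > best_size:
                best_size = len(region)
                ys = [p[0] for p in region]
                xs = [p[1] for p in region]
                best = (min(ys), min(xs), max(ys), max(xs))
    return best


# A single bright pixel on a dark background.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 100
print(largest_region_bbox(contrast_map(img), threshold=50))  # (1, 1, 3, 3)
```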

  11. [Figure panels: contrast map; quantized contrast map; region of interest; bounding box]

  12. • Use automatic face detection algorithms to localize face regions
      • Frontal face detection algorithms are very robust (in contrast to face recognition)

  13. • Characteristic features of text:
        ◦ horizontal alignment
        ◦ significant luminance difference between text and background
        ◦ the character size is within a certain range
        ◦ single-colored
        ◦ text is visible in consecutive frames (video)
        ◦ horizontal or vertical motion is possible (video)
      • Calculate a horizontal projection profile to detect the boundaries of text lines
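
A horizontal projection profile can be sketched as below, assuming the frame has already been binarized so that 1 marks a candidate text pixel. The function name `text_line_bounds` and the `min_count` parameter are illustrative assumptions, not part of the talk.

```python
def text_line_bounds(binary, min_count=1):
    """Return (first_row, last_row) pairs for runs of rows whose
    projection value reaches min_count (hypothetical helper)."""
    # Projection profile: number of candidate text pixels in each row.
    profile = [sum(row) for row in binary]
    bounds, start = [], None
    for y, value in enumerate(profile):
        if value >= min_count and start is None:
            start = y                      # a text line begins
        elif value < min_count and start is not None:
            bounds.append((start, y - 1))  # the text line ended one row above
            start = None
    if start is not None:
        bounds.append((start, len(profile) - 1))
    return bounds


binary = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
print(text_line_bounds(binary))  # [(1, 2), (4, 4)]
```

Rows with an empty profile separate the text lines, which is why the strong luminance difference between text and background matters for this step.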

  14. • Calculate an importance value V for each region of size H:
        ◦ minimum perceptible size: $H_{min}$
        ◦ maximum reasonable size: $H_{max}$
      • Find the optimal target region W based on the regions of interest $S_i$:

  15. [Figure panels: selection of one feature; combination of two features; … three features; full image]

  16. [Figure panels: scaling; crop & scale; cropping]

  17. Automatically detect:
      • Court lines
      • Players
      • Ball
      [Figure panels: scaled video; modify video content]
      *Source: Kopf, Guthier, Farin, Han: Analysis and Retargeting of Ball Sports Video, IEEE Workshop on Applications of Computer Vision, 2011

  18. • Step 1: Mark bright pixels (line pixels)
      • Step 2: Algorithm to detect straight lines (based on RANSAC)
        1. Randomly select two line pixels and calculate the line parameters
        2. Count the number of white pixels N located on the line
        3. If N > threshold, stop
        4. Go to 1.
      • Step 3: Remove the line pixels and detect the next line (Step 2)
      RANSAC: Fischler, Bolles: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, vol. 24(6), 1981.
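
Steps 2 and 3 might look like the following sketch. It assumes the bright pixels from Step 1 are already available as a list of (x, y) tuples. The `tolerance` (maximum pixel-to-line distance), `max_iters` (an iteration cap replacing the open-ended "Go to 1") and `max_lines` parameters are assumptions added so the loop always terminates.

```python
import random


def ransac_line(points, threshold, tolerance=0.5, max_iters=1000, rng=None):
    """Return (a, b, c, inliers) for a line a*x + b*y + c = 0 with unit
    normal that has more than `threshold` points on it, or None."""
    if len(points) < 2:
        return None
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    for _ in range(max_iters):
        # 1. Randomly select two line pixels and calculate line parameters.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # both samples have identical coordinates
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # 2. Count the pixels located on (close to) the line.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= tolerance]
        # 3. Stop once enough pixels support the line; otherwise repeat.
        if len(inliers) > threshold:
            return a, b, c, inliers
    return None


def detect_lines(points, threshold, max_lines=10):
    """Step 3: remove the pixels of each detected line and search again."""
    remaining = list(points)
    lines = []
    while len(remaining) >= 2 and len(lines) < max_lines:
        found = ransac_line(remaining, threshold)
        if found is None:
            break
        a, b, c, inliers = found
        lines.append((a, b, c))
        used = set(inliers)
        remaining = [p for p in remaining if p not in used]
    return lines


# Ten pixels on the line y = 0 plus one outlier.
pixels = [(x, 0) for x in range(10)] + [(3, 50)]
print(len(detect_lines(pixels, threshold=5)))  # 1
```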

  19. • Problem: The position of the lines changes from frame to frame
      • Solution: use a reference court model to estimate camera motion
        ◦ Step 1: Calculate the intersection points of two lines
        ◦ Step 2: Transform the lines to the court model
      • How many intersection points do we need for the transformation?

  20. • Translation (horizontal/vertical shift) → 1 intersection point
      • Translation and scaling → 2 intersection points
      • Affine transform (translation, scaling, rotation) → 3 intersection points
      • Perspective transform → 4 intersection points
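
As a small illustration of why two intersection points suffice for translation plus scaling: each point correspondence yields two equations, and the model x' = sx·x + tx, y' = sy·y + ty has four unknowns. The sketch below assumes lines are given as (a, b, c) with a·x + b·y + c = 0 and that scaling is per axis; both function names are hypothetical.

```python
def intersect(l1, l2):
    """Intersection of two lines (a, b, c) with a*x + b*y + c = 0.
    Returns None for (nearly) parallel lines."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    # Cramer's rule for the 2x2 linear system.
    x = (b1 * c2 - b2 * c1) / det
    y = (a2 * c1 - a1 * c2) / det
    return x, y


def fit_translation_scale(src, dst):
    """Solve x' = sx*x + tx, y' = sy*y + ty from two point pairs.
    The two source points must differ in both coordinates."""
    (x1, y1), (x2, y2) = src
    (u1, v1), (u2, v2) = dst
    sx = (u2 - u1) / (x2 - x1)
    sy = (v2 - v1) / (y2 - y1)
    return sx, sy, u1 - sx * x1, v1 - sy * y1


# The lines x = 2 and y = 3 meet at (2, 3).
print(intersect((1, 0, -2), (0, 1, -3)))  # (2.0, 3.0)
# Two frame points mapped onto two court-model points.
print(fit_translation_scale([(0, 0), (2, 4)], [(10, 20), (14, 22)]))
# (2.0, 0.5, 10.0, 20.0)
```

The same counting argument gives three points for the six affine parameters and four points for the eight parameters of a perspective transform.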

  21. [Figure panels: cropping; scaling; crop & scale (zoom on largest player); modify lines & ball]

  23. • If important content is located near the image borders, crop & scale is not applicable
      Idea of seam carving*
      • Systematic removal of less important pixels
      • Use an energy function as a measure of the "importance" of single pixels
      *Source: Shai Avidan and Ariel Shamir: Seam Carving for Content-Aware Image Resizing. ACM SIGGRAPH, 2007

  24. Image width should be reduced by 40 percent
      [Figure panels: original image; energy map]

  25. • Remove the N pixels with the lowest energy from each row
      [Figure panels: source image; N=200 pixels removed from each row based on energy values]

  26. • Sum the energy in each column of the image and remove the N columns with the lowest energy
      [Figure panels: original image; 200 columns removed based on energy values of columns]
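
The column-based variant could be sketched like this, assuming the image and its energy map are lists of rows of equal size. The function name is a hypothetical choice for illustration.

```python
def remove_lowest_energy_columns(image, energy, n):
    """Drop the n columns with the smallest summed energy."""
    width = len(image[0])
    # Total energy of each column.
    column_energy = [sum(row[x] for row in energy) for x in range(width)]
    # Indices of the n cheapest columns (ties resolved by position).
    drop = set(sorted(range(width), key=lambda x: column_energy[x])[:n])
    return [[v for x, v in enumerate(row) if x not in drop] for row in image]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
energy = [[9, 1, 5, 0],
          [9, 1, 5, 0]]
print(remove_lowest_energy_columns(image, energy, 2))  # [[1, 3], [5, 7]]
```

Whole columns are removed even where they cross an important object, which motivates the more flexible seams.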

  27. • A vertical seam is an 8-connected path of pixels from top to bottom that contains exactly one pixel in each row.
      • Formal definition:
        $$s^x = \{s_i^x\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, \quad \text{subject to } \forall i: |x(i) - x(i-1)| \le 1$$
      • Horizontal seams are defined analogously.
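
The definition translates directly into a validity check. In this sketch a seam is stored as a list `xs` where `xs[i]` is the column x(i) chosen in row i, so "one pixel per row" holds by construction; the representation and the function name are assumptions.

```python
def is_vertical_seam(xs, width):
    """Check that xs describes a vertical seam in an image of given width."""
    # Every chosen column must lie inside the image.
    in_bounds = all(0 <= x < width for x in xs)
    # Neighboring rows may differ by at most one column (8-connectivity).
    connected = all(abs(xs[i] - xs[i - 1]) <= 1 for i in range(1, len(xs)))
    return in_bounds and connected


print(is_vertical_seam([2, 3, 3, 2], width=5))  # True
print(is_vertical_seam([0, 2, 2], width=5))     # False (jumps two columns)
```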

  28. • Advantage of seams compared to columns or rows:
        ◦ Pixels of low energy are removed
        ◦ Relevant objects are preserved

  29. • Remove the vertical seam with the lowest energy
      • Repeat this step N times
      [Figure panels: source image; N=200 seams removed based on lowest energy]
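
The remove-and-repeat loop can be sketched as below. The seam search itself is passed in as a callable `find_seam` (a hypothetical parameter that returns one column index per row), so the sketch only shows the removal step and the repetition.

```python
def remove_vertical_seam(image, xs):
    """Delete pixel xs[i] from row i; the image becomes one column narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, xs)]


def carve(image, find_seam, n):
    """Remove n vertical seams; the seam is recomputed after every removal."""
    for _ in range(n):
        image = remove_vertical_seam(image, find_seam(image))
    return image


image = [[1, 2, 3],
         [4, 5, 6]]
print(remove_vertical_seam(image, [0, 1]))  # [[2, 3], [4, 6]]
# Dummy seam finder that always picks the first column.
print(carve(image, lambda img: [0] * len(img), 2))  # [[3], [6]]
```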

  30. • Seam carving uses an energy function that characterizes the relevance of each pixel (similar to saliency maps).
      • The optimal seam minimizes the cumulative energy of all seam pixels.
      • Method to find the optimal seam: dynamic programming

  31. • M(i, j) specifies the cost of the optimal (vertical) seam from the upper image border to pixel position (i, j)
      • Calculate M(i, j) recursively:
        $$M(i,j) = e(i,j) + \min\bigl(M(i-1,j-1),\; M(i-1,j),\; M(i-1,j+1)\bigr)$$
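
A compact sketch of the recursion and the backtracking step, assuming i indexes rows, j indexes columns, and the energy map is a list of rows. Neighbors outside the image are simply skipped at the borders; the function name and the returned tuple are illustrative choices.

```python
def find_vertical_seam(energy):
    """Return (M, seam, cost): the cumulative map, the column chosen in
    each row, and the total energy of the optimal vertical seam."""
    rows, cols = len(energy), len(energy[0])
    # The first row of M equals the energy itself.
    M = [list(energy[0])]
    for i in range(1, rows):
        prev = M[-1]
        row = []
        for j in range(cols):
            lo, hi = max(j - 1, 0), min(j + 1, cols - 1)
            # M(i, j) = e(i, j) + min of the up-to-three upper neighbors.
            row.append(energy[i][j] + min(prev[lo:hi + 1]))
        M.append(row)
    # The optimal seam ends at the smallest value in the last row.
    j = min(range(cols), key=lambda c: M[-1][c])
    cost = M[-1][j]
    seam = [j]
    # Walk upwards, always moving to the cheapest reachable neighbor.
    for i in range(rows - 1, 0, -1):
        lo, hi = max(j - 1, 0), min(j + 1, cols - 1)
        j = min(range(lo, hi + 1), key=lambda c: M[i - 1][c])
        seam.append(j)
    seam.reverse()
    return M, seam, cost


energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]
M, seam, cost = find_vertical_seam(energy)
print(M)     # [[1, 4, 3], [6, 3, 9], [10, 11, 4]]
print(seam)  # [0, 1, 2]
print(cost)  # 4
```

Filling M takes one pass over the image, so the optimal seam is found in time linear in the number of pixels.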

  32. • Example of how to calculate the optimal seam:
      [Figure: energy map and cumulative energy map M(i, j); cell values as extracted, grid layout lost: 2 5 1 4 2 5 1 1 4 1 2 3 4 3 3 3 4 5 1 2 3 3 4 5 6 6 7 5 4 4 1 9 8 9 7 7]
      $$M(i,j) = e(i,j) + \min\bigl(M(i-1,j-1),\; M(i-1,j),\; M(i-1,j+1)\bigr)$$

  33. • Image gradient: simple energy function that calculates the luminance difference to adjacent pixels:
        $$e(I(x,y)) = \left|\frac{\partial}{\partial x} I(x,y)\right| + \left|\frac{\partial}{\partial y} I(x,y)\right|$$
      • Assumption: Luminance values do not differ much in image regions of low relevance
      • This simple energy function gives good results in many cases
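
One way to discretize the gradient energy is with differences to the right and lower neighbors, as sketched below. The choice of forward differences (with a backward difference at the last row and column) is an assumption; the slide does not specify the discretization.

```python
def gradient_energy(lum):
    """|dI/dx| + |dI/dy| for a luminance image given as a list of rows."""
    h, w = len(lum), len(lum[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbor to the right / below; at the border use the
            # neighbor on the other side instead.
            x2 = x + 1 if x + 1 < w else x - 1
            y2 = y + 1 if y + 1 < h else y - 1
            dx = abs(lum[y][x2] - lum[y][x]) if w > 1 else 0
            dy = abs(lum[y2][x] - lum[y][x]) if h > 1 else 0
            energy[y][x] = dx + dy
    return energy


lum = [[0, 0, 10],
       [0, 0, 10]]
print(gradient_energy(lum))  # [[0, 10, 10], [0, 10, 10]]
```

The flat region on the left gets zero energy, so seams pass through it first, while the luminance edge on the right is protected.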

  34. • Problem: The lighthouse is an important region, but its pixel values are very similar
      [Figure panels: original image; optimal seams; result]

  35. • Combine the energy function with a saliency map:
        $$e_{sal}(I(x,y)) = w_s \cdot saliency(x,y) + e(I(x,y))$$
      [Figure panels: saliency map; optimal seams; result ($e_{sal}$ is used as energy function)]
      Source: Hwang and Chien. Content-Aware Image Resizing using Perceptual Seam Carving with Human Attention Model. IEEE Conference on Multimedia and Expo, 2008.

  36. • Use results from face detection as additional saliency:
        $$e_{sal+face}(I(x,y)) = w_s \cdot saliency(x,y) + w_f \cdot face(x,y) + e(I(x,y))$$
      [Figure panels: saliency map; face map; seams based on $e_{sal+face}$ as energy function; result]
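
The weighted combination is a per-pixel sum, sketched here for maps stored as lists of rows of equal size. The default weights of 1.0 and the function name are assumptions; the talk does not state concrete values for the weights.

```python
def combined_energy(gradient, saliency, face, w_s=1.0, w_f=1.0):
    """Per pixel: w_s * saliency + w_f * face + gradient energy."""
    return [
        [w_s * s + w_f * f + e for e, s, f in zip(g_row, s_row, f_row)]
        for g_row, s_row, f_row in zip(gradient, saliency, face)
    ]


gradient = [[1, 2]]
saliency = [[0.5, 0.0]]
face = [[0, 1]]  # 1 inside a detected face region, 0 elsewhere
print(combined_energy(gradient, saliency, face, w_s=2, w_f=10))  # [[2.0, 12.0]]
```

Setting `w_f` to zero (or passing an all-zero face map) reduces this to the saliency-only energy of the previous slide.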
