
Top-down segmentation: Basic ideas of grouping in human vision



  1. Segmentation by Clustering
     Reading: Chapter 14 (skip 14.5)
     • Data reduction - obtain a compact representation for interesting image data in terms of a set of components
     • Find components that belong together (form clusters)
     • Frame differencing - Background Subtraction and Shot Detection
     Slide credits for this chapter: David Forsyth, Christopher Rasmussen

     Segmentation by Clustering
     From: Object Recognition as Machine Translation, Duygulu, Barnard, de Freitas, Forsyth, ECCV02

     General ideas
     • Tokens
       – whatever we need to group (pixels, points, surface elements, etc.)
     • Bottom up segmentation
       – tokens belong together because they are locally coherent
     • Top down segmentation
       – tokens belong together because they lie on the same object
     • These two are not mutually exclusive

     Why do these tokens belong together?

  2. Top-down segmentation

     Basic ideas of grouping in human vision
     • Figure-ground discrimination
       – grouping can be seen in terms of allocating some elements to a figure, some to ground
       – can be based on local bottom-up cues or high level recognition
     • Gestalt properties
       – psychologists have studied a series of factors that affect whether elements should be grouped together

  3. [Figure: Elevator buttons in Berkeley Computer Science Building]
     [Figure: "Illusory Contours"]

     Segmentation as clustering
     • Cluster together (pixels, tokens, etc.) that belong together
     • Agglomerative clustering
       – merge closest clusters
       – repeat
     • Divisive clustering
       – split cluster along best boundary
       – repeat
     • Point-Cluster distance
       – single-link clustering
       – complete-link clustering
       – group-average clustering
     • Dendrograms
       – yield a picture of output as clustering process continues

     Feature Space
     • Every token is identified by a set of salient visual characteristics called features. For example:
       – Position
       – Color
       – Texture
       – Motion vector
       – Size, orientation (if token is larger than a pixel)
     • The choice of features and how they are quantified implies a feature space in which each token is represented by a point
     • Token similarity is thus measured by distance between points ("feature vectors") in feature space

     Dendrogram from Agglomerative Clustering
     Instead of a fixed number of clusters, the dendrogram represents a hierarchy of clusters
     Slide credit: Christopher Rasmussen
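
The agglomerative procedure and the three point-cluster distances listed above can be sketched as follows. This is a minimal illustration, not code from the slides: the function names (`single_link`, `complete_link`, `group_average`, `agglomerate`), the Euclidean feature distance, and the toy 2-D feature vectors are all assumptions made for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def single_link(a, b):
    """Single-link distance: smallest distance between any cross-cluster pair."""
    return min(dist(p, q) for p in a for q in b)


def complete_link(a, b):
    """Complete-link distance: largest distance between any cross-cluster pair."""
    return max(dist(p, q) for p in a for q in b)


def group_average(a, b):
    """Group-average distance: mean distance over all cross-cluster pairs."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerate(points, num_clusters, linkage=single_link):
    """Merge the two closest clusters until num_clusters remain.

    Returns the final clusters and the merge history; the history is the
    information a dendrogram draws (which clusters merged, at what distance).
    """
    clusters = [[tuple(p)] for p in points]  # every token starts as its own cluster
    merges = []
    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j], linkage(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Toy feature vectors (hypothetical data): two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters, merges = agglomerate(features, num_clusters=2)
```

Passing `linkage=complete_link` or `linkage=group_average` switches the point-cluster distance without changing the merge loop. Running the loop down to a single cluster records the full hierarchy rather than one fixed partition.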

  4. K-Means Clustering
     Initialization: Given K categories, N points in feature space. Pick K points randomly; these are initial cluster centers (means) $m_1, \dots, m_K$. Repeat the following:
     1. Assign each of the N points, $x_j$, to clusters by nearest $m_i$ (make sure no cluster is empty)
     2. Recompute mean $m_i$ of each cluster from its member points
     3. If no mean has changed, stop
     • Effectively carries out gradient descent to minimize:
       $\sum_{i \in \text{clusters}} \; \sum_{j \in \text{elements of } i\text{'th cluster}} \lVert x_j - \mu_i \rVert^2$
       (here $\mu_i$ is the mean $m_i$ of the $i$'th cluster)
     Slide credit: Christopher Rasmussen

     K-Means
     • Minimizing squared distances to the center implies that the center is at the mean: the derivative of the error is zero at the minimum. For one cluster with $n$ members, setting the derivative of $\sum_j \lVert x_j - \mu \rVert^2$ with respect to $\mu$ to zero gives $\mu = \frac{1}{n} \sum_j x_j$.

     Example: 3-means Clustering
     [Figure panels: Image; Clusters on intensity; Clusters on color]
     K-means clustering using intensity alone and color alone
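
The three K-means steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the slides' implementation: the names `kmeans`, `mean`, and `sse`, the seeded random initialization, the `max_iter` cap, and the rule that an empty cluster keeps its previous mean are all choices made for this example.

```python
import random
from math import dist  # Euclidean distance (Python 3.8+)


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means on a list of tuples; returns (means, labels)."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # initialization: pick K of the points at random
    labels = []
    for _ in range(max_iter):
        # Step 1: assign each point to the cluster with the nearest mean.
        labels = [min(range(k), key=lambda i: dist(p, means[i])) for p in points]
        # Step 2: recompute each mean from its member points.
        new_means = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Assumption: an empty cluster simply keeps its previous mean.
            new_means.append(mean(members) if members else means[i])
        # Step 3: stop once no mean has changed.
        if new_means == means:
            break
        means = new_means
    return means, labels


def sse(points, means, labels):
    """The objective being minimized: sum of squared distances to assigned means."""
    return sum(dist(p, means[lab]) ** 2 for p, lab in zip(points, labels))


# Toy feature vectors (hypothetical data): two well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = kmeans(data, k=2)
error = sse(data, centers, assignment)
```

For image segmentation the tuples would be per-pixel features, for example `(intensity,)` or `(r, g, b)`, matching the intensity-only and color-only examples on the slide.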

  5. Technique: Background Subtraction
     • If we know what the background looks like, it is easy to segment out new regions
     • Applications
       – Person in an office
       – Tracking cars on a road
       – Surveillance
       – Video game interfaces
     • Approach:
       – use a moving average to estimate background image
       – subtract from current frame
       – large absolute values are interesting pixels

     Background Subtraction
     • The problem: Segment moving foreground objects from static background
     [Figure panels: video sequence; background; frame difference; thresholded frame diff]

     Algorithm
       for t = 1:N
         Update background model
         Compute frame difference
         Threshold frame difference
         Noise removal
       end
     Objects are detected where the thresholded frame difference (after noise removal) is non-zero
     Slide credit: Christopher Rasmussen

     Background Modeling
     • Offline average
       – Pixel-wise mean values are computed during training phase (also called Mean and Threshold)
     • Adjacent Frame Difference
       – Each image is subtracted from previous image in sequence
     • Moving average
       – Background model is linear weighted sum of previous frames
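
The per-frame loop above (update background, difference, threshold, noise removal) might look like the sketch below for grayscale frames stored as lists of rows. Several details are assumptions chosen for illustration, since the slides do not specify them: the exponential moving average with weight `alpha` as the "linear weighted sum of previous frames", the `threshold` value, initializing the background from the first frame, and a very simple noise-removal rule that drops foreground pixels with no 4-connected foreground neighbour.

```python
def update_background(background, frame, alpha=0.1):
    """Moving-average update: B <- (1 - alpha) * B + alpha * frame (assumed form)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """1 where the absolute frame difference exceeds the threshold, else 0."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def remove_isolated(mask):
    """Crude noise removal: keep a foreground pixel only if a 4-neighbour is also foreground."""
    h, w = len(mask), len(mask[0])

    def on(r, c):
        return 0 <= r < h and 0 <= c < w and mask[r][c] == 1

    return [[1 if mask[r][c] == 1 and (on(r - 1, c) or on(r + 1, c)
                                       or on(r, c - 1) or on(r, c + 1)) else 0
             for c in range(w)]
            for r in range(h)]


def segment(frames, alpha=0.1, threshold=30):
    """Run the loop over a sequence; the first frame seeds the background model."""
    background = [[float(v) for v in row] for row in frames[0]]
    masks = []
    for frame in frames[1:]:
        background = update_background(background, frame, alpha)   # update model
        mask = foreground_mask(background, frame, threshold)       # difference + threshold
        masks.append(remove_isolated(mask))                        # noise removal
    return masks


# Hypothetical 4x4 sequence: an empty scene, then a 2-pixel object plus one noise pixel.
empty = [[10] * 4 for _ in range(4)]
scene = [row[:] for row in empty]
scene[1][1] = scene[1][2] = 200   # the moving object
scene[3][3] = 200                 # isolated noise pixel
masks = segment([empty, scene])
```

A small `alpha` makes the background adapt slowly, which tolerates gradual illumination change but lets a stationary foreground object fade into the background only after many frames.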

  6. Results & Problems for Simple Approaches

     Background Subtraction: Issues
     • Noise models
       – Unimodal: Pixel values vary over time even for static scenes
       – Multimodal: Features in background can "oscillate", requiring models which can represent disjoint sets of pixel values (e.g., waving trees against sky)
     • Gross illumination changes
       – Continuous: Gradual illumination changes alter the appearance of the background (e.g., time of day)
       – Discontinuous: Sudden changes in illumination and other scene parameters alter the appearance of the background (e.g., flipping a light switch)
     • Bootstrapping
       – Is a training phase with "no foreground" necessary, or can the system learn what's static vs. dynamic online?
     Slide credit: Christopher Rasmussen

     Application: Sony Eyetoy
     • For most games, this apparently uses simple frame differencing to detect regions of motion
     • However, some applications use background subtraction to cut out an image of the user to insert in video
     • Over 4 million units sold

     Technique: Shot Boundary Detection
     • Find the shots in a sequence of video
       – shot boundaries usually result in big differences between succeeding frames
     • Strategy
       – compute interframe distances
       – declare a boundary where these are big
     • Distance measures
       – frame differences
       – histogram differences
       – block comparisons
       – edge differences
     • Applications
       – representation for movies, or video sequences
         • obtain "most representative" frame
       – supports search
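
The shot-boundary strategy above (compute interframe distances, declare a boundary where they are big) can be illustrated with the histogram-difference measure. The sketch below is one possible reading, with assumptions labeled: grayscale frames as lists of rows of 8-bit integers, an 8-bin normalised histogram, L1 distance between histograms, and a fixed `threshold` of 0.5. None of these values come from the slides.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalised grey-level histogram of a frame given as rows of ints in [0, max_value)."""
    pixels = [v for row in frame for v in row]
    counts = [0] * bins
    for v in pixels:
        counts[min(v * bins // max_value, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalised histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def shot_boundaries(frames, threshold=0.5):
    """Indices t at which frames[t] is declared to start a new shot."""
    hists = [histogram(f) for f in frames]
    return [t for t in range(1, len(hists))
            if histogram_distance(hists[t - 1], hists[t]) > threshold]


# Hypothetical 2x2 "frames": a dark shot, a bright shot, then the dark shot again.
dark = [[10, 12], [11, 13]]
bright = [[200, 210], [205, 220]]
video = [dark, dark, dark, bright, bright, dark]
boundaries = shot_boundaries(video)
```

Histogram differences ignore where pixels are, so they are less sensitive to camera or object motion within a shot than raw frame differences, at the cost of missing cuts between shots with similar overall brightness.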
