 
Perceptive Context for Pervasive Computing
Trevor Darrell
Vision Interface Group
MIT AI Lab

Perceptually Aware Displays
Camera associated with display
Display should respond to user
- font size
- attentional load
- passive acknowledgement
[Diagram labels: Camera, Display]
e.g., “Magic Mirror”, Interval; Compaq’s Smart Kiosk; ALIVE, MIT Media Lab

Example: A Face Responsive Display
• Faces are natural interfaces!
- Ubiquitous, fast, expressive, general.
- Want machines to generate and perceive faces.
• A Face Responsive Display...
- Knows when it’s being observed
- Recognizes returning observers
- Tracks head pose
- Robust to changing lighting, moving backgrounds…

A Face Responsive Display
Tasks
- Detection
- Identification
- Tracking
How? Exploit multiple visual modalities:
- Shape
- Color
- Pattern

Tasks and Visual Modalities
                 shape                      color                pattern
detection        silhouette classifier      skin classifier      face detection
identification   biometrics                 flesh hue            face recognition
tracking         coarse motion estimation   clothing histogram   fine motion estimation / pose tracking

Mode and Task Matrix
                 shape                      color                pattern
detection        silhouette classifier      skin classifier      face detection
identification   biometrics                 flesh hue            face recognition
tracking         Shape change               clothing histogram   Appearance change

Flesh color tracking
• Often the simplest, fastest face detector!
• Initialize region of hue space
[ Crowley, Coutaz, Berard, INRIA ]

Finding Features
2D Head / hands localization
- contour analysis: mark extremal points (highest curvature or distance from center of body) as hand features
- use skin color model when region of hand or face is found (color model is independent of flesh tone intensity)

Flesh color tracking
Can use Intel OpenCV lib’s CAMSHIFT algorithm for robust real-time tracking. (open source impl. avail.!)
[ Bradski, Intel ] Intel’s computer vision library

Color Processing
• Train two-class classifier with examples of skin and not skin
• Typical approaches: Gaussian, Neural Net, Nearest Neighbor
• Use features invariant to intensity
- Log color-opponent [Fleck et al.]: (log(r) - log(g), log(b) - log((r+g)/2))
- Hue & Saturation

Detection with multiple visual modes
Shape Detection: Find head sized peaks in 2-D or 3-D.
Flesh Color Detection: Detect skin pigment in hue-based color space.
Face Pattern Detection: Classify intensity vector corresponding to face class.
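
The Color Processing slide above can be made concrete with a small sketch. It computes the log color-opponent pair from the slide and trains a two-class Gaussian classifier on it. The function names, the `eps` offset, and the choice of a diagonal covariance are assumptions made here for brevity; the slides do not say which Gaussian form was used.

```python
import math

def log_opponent(r, g, b, eps=1e-6):
    """Return the two log color-opponent features for one RGB pixel.

    eps is a small offset (an assumption of this sketch) that avoids log(0).
    """
    r, g, b = r + eps, g + eps, b + eps
    rg = math.log(r) - math.log(g)
    by = math.log(b) - math.log((r + g) / 2.0)
    return (rg, by)

def fit_diag_gaussian(features):
    """Fit a per-dimension mean and variance to a list of feature tuples."""
    n = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dims)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in features) / n + 1e-6  # floor avoids division by zero
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(feature, model):
    """Log density of one feature tuple under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(feature, means, variances)
    )

def is_skin(rgb, skin_model, other_model):
    """Two-class decision: choose the class with the higher likelihood."""
    f = log_opponent(*rgb)
    return log_likelihood(f, skin_model) > log_likelihood(f, other_model)
```

Because both features are differences of logarithms, scaling r, g and b by a common factor leaves them unchanged (up to the `eps` offset), which is the intensity invariance the slide asks for. In use, `skin_model` and `other_model` would each come from `fit_diag_gaussian` applied to labeled example pixels.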

Common Detection Failure Modes
Shape Detection: Fooled by head shaped peaks
Flesh Color Detection: Fooled by flesh colored objects
Face Pattern Detection: Misses out of plane rotation or expression

Robust real-time performance
Shape Detection, Flesh Color Detection, Face Pattern Detection -> Integrated Face Detection Algorithm (temporally asynch. voting scheme)

A Key Technology: Video-Rate Stereo
• Two cameras -> stereo range estimation; disparity inversely proportional to depth
• Depth makes tracking people easy
- segmentation
- shape characterization
- pose tracking
• Real-time implementations becoming commercially available.

Video-rate stereo
[Figure panels: Left and right images; Computed disparity; Foreground pixels, grouped by local connectivity]
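
As a worked illustration of the stereo range estimation above: for two rectified, parallel cameras the standard relation is Z = f * B / d, where d is the disparity in pixels, f the focal length in pixels and B the baseline, so nearer objects produce larger disparities. The sketch below is an assumption-laden example, not the system's code; the function names and the `max_depth_m` threshold are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point observed with the given disparity in pixels."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: point at infinity or no match
    return focal_px * baseline_m / disparity_px

def foreground_mask(disparity_row, focal_px, baseline_m, max_depth_m):
    """Mark pixels closer than max_depth_m as foreground (depth-based segmentation)."""
    return [
        depth_from_disparity(d, focal_px, baseline_m) < max_depth_m
        for d in disparity_row
    ]
```

Thresholding on depth in this way is one simple reading of "depth makes segmentation easy"; grouping the surviving pixels by local connectivity, as in the figure, would be a separate step.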

RGBZ input

Range feature for ID!
• Body shape characteristics, e.g., height measure.
• Normalize for motion/pose: median filter over time
• Near future: full vision-based kinematic estimation and tracking (active research topic in many labs).

Color feature for ID!
For long-term tracking / identification, measure color hue and saturation values of hair and skin….
[Figure labels: Trevor, Mike, Gaile]
For same-day ID, use histogram of entire body / clothing

Robust, Multi-modal Algorithm
Combine modules for detection:
• Silhouette finds body silhouette
• Color tracks extremities
• Pattern discriminates head from hands.
Use each also to recognize returning people:
• Face recognition
• Biometrics (skeletal structure)
• Hair and Skin hue
• Clothing (intra-day.)
See lectures by Trevor later in the course
[ CVPR ’98; T. Darrell, G. Gordon, M. Harville, J. Woodfill ]
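
The "median filter over time" used to normalize the height measure for motion and pose can be sketched as a sliding-window median. The class name and the window length are assumptions for illustration; the slides do not give the window actually used.

```python
from collections import deque
from statistics import median

class TemporalMedian:
    """Sliding-window median of a per-frame measurement such as height."""

    def __init__(self, window=15):
        # Only the most recent `window` samples are kept.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Add one per-frame measurement and return the current median."""
        self.samples.append(value)
        return median(self.samples)
```

A median is a natural choice here because a brief crouch or reach produces a few outlying height samples, which shift a mean but leave the median essentially unchanged.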

System Overview

Classic Background Subtraction model
• Background is assumed to be mostly static
• Each pixel is modeled by a Gaussian distribution in YUV space
• Model mean is usually updated using a recursive low-pass filter
Given a new image, generate a silhouette by marking those pixels that are significantly different from the “background” value.

Static Background Modeling Examples
[MIT Media Lab Pfinder / ALIVE System]

The ALIVE System
[Figure labels: Camera, User, Video Screen, Autonomous Agents]
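
The classic background subtraction model described above can be sketched as follows. This is not the Pfinder implementation: the class name, the per-channel (diagonal) Gaussian, the variance update, and the parameters `alpha`, `k`, `init_var` and `min_var` are all assumptions chosen to keep the example short.

```python
class GaussianBackground:
    """Per-pixel, per-channel Gaussian background model over (y, u, v) tuples."""

    def __init__(self, first_frame, alpha=0.05, k=3.0, init_var=25.0, min_var=4.0):
        # first_frame: list of (y, u, v) tuples, assumed to show background only.
        self.mean = [list(px) for px in first_frame]
        self.var = [[init_var] * 3 for _ in first_frame]
        self.alpha = alpha      # low-pass filter gain
        self.k = k              # threshold in standard deviations
        self.min_var = min_var  # floor so a static scene does not drive variance to 0

    def apply(self, frame):
        """Return a foreground mask for `frame` and update the model."""
        mask = []
        for i, px in enumerate(frame):
            mean, var = self.mean[i], self.var[i]
            # Foreground if any channel is more than k standard deviations from the mean.
            fg = any((x - m) ** 2 > self.k ** 2 * v for x, m, v in zip(px, mean, var))
            mask.append(fg)
            if not fg:
                # Recursive low-pass update, applied to background pixels only.
                for c, x in enumerate(px):
                    d = x - mean[c]
                    mean[c] += self.alpha * d
                    var[c] = max(self.min_var,
                                 (1 - self.alpha) * var[c] + self.alpha * d * d)
        return mask
```

Updating only pixels classified as background is a design choice of this sketch: it keeps a person standing in front of the camera from being absorbed into the model, at the cost of adapting slowly to real scene changes.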

ALIVE system, MIT
[ Blumberg, Darrell, Maes, Pentland, Wren, … 1995 ]
http://vismod.www.media.mit.edu/cgi-bin/tr_pagemaker (TR 257)

ALIVE
• Real sensing for virtual world
• Tightly coupled sensing-behavior-action
• Vision routines: body/head/hand tracking
[Diagram labels: Camera, Vision, Behaviors / Goals, Kinematics / Rendering, Projector, User, Agents]

A Face Responsive Display
[Diagram labels: Video Display, Stereo Cameras]

Vision-only Application: Interactive Video Effects

end