Computer Vision
Jan 19, 2011

Introductions
• Instructor: Prof. Kristen Grauman, grauman@cs.utexas.edu
• TA: Shalini Sahoo, shalini@cs.utexas.edu

Today
• Course overview
• Requirements, logistics

What is computer vision?
Done?

Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)

1. Vision for measurement
• Real-time stereo
• Structure from motion
• Tracking
[Figures: NASA Mars Rover; Demirdjian et al.; Snavely et al.; Wang et al.]

Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)
2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)

2. Vision for perception, interpretation
• Objects
• Activities
• Scenes
• Locations
• Text / writing
• Faces
• Gestures
• Motions
• Emotions…
[Figure: amusement park photo annotated with labels including sky, Cedar Point, The Wicked Twister, Ferris wheel, ride, 12 E, Lake Erie, water, tree, people waiting in line, people sitting on ride, umbrellas, maxair, carousel, deck, bench, pedestrians]

Computer Vision
• Automatic understanding of images and video
1. Computing properties of the 3D world from visual data (measurement)
2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
3. Algorithms to mine, search, and interact with visual data (search and organization)

3. Visual search, organization
[Figure: a query is matched against image or video archives to return relevant content]

Related disciplines
• Artificial intelligence
• Machine learning
• Graphics
• Image processing
• Cognitive science
• Algorithms
[Figure: computer vision shown at the intersection of these fields]

Vision and graphics
• Vision: images to model
• Graphics: model to images
• Inverse problems: analysis and synthesis

Visual data in 1963
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Visual data in 2011
• Movies, news, sports
• Personal photo albums
• Medical and scientific images
• Surveillance and security
Slide credit: L. Lazebnik

Why vision?
• As image sources multiply, so do applications
– Relieve humans of boring, easy tasks
– Enhance human abilities
– Advance human-computer interaction, visualization
– Perception for robotics / autonomous agents
– Organize and give access to visual content

Faces and digital cameras
• Setting camera focus via face detection
• Camera waits for everyone to smile to take a photo [Canon]

Linking to info with a mobile device
• Situated search (Yeh et al., MIT)
• kooaba
• MSR Lincoln

Video-based interfaces
• Human joystick, NewsBreaker Live
• Assistive technology systems: Camera Mouse, Boston College
• Microsoft Kinect

Vision for medical & neuroimages
• fMRI data (Golland et al.)
• Image guided surgery (MIT AI Vision Group)

Special visual effects
• The Matrix
• What Dreams May Come
• Mocap for Pirates of the Caribbean, Industrial Light and Magic
Source: S. Seitz

Safety & security
• Navigation, driver safety
• Monitoring pool (Poseidon)
• Surveillance
• Pedestrian detection (MERL, Viola et al.)

What else?

Obstacles?

What the computer gets

Why is vision difficult?
• Ill-posed problem: real world much more complex than what we can measure in images
– 3D → 2D
• Impossible to literally “invert” image formation process

Challenges: many nuisance parameters
• Illumination
• Object pose
• Clutter
• Occlusions
• Intra-class appearance
• Viewpoint

Challenges: intra-class variation
slide credit: Fei-Fei, Fergus & Torralba

Challenges: importance of context
slide credit: Fei-Fei, Fergus & Torralba

Challenges: complexity
• Thousands to millions of pixels in an image
• 3,000-30,000 human recognizable object categories
• 30+ degrees of freedom in the pose of articulated objects (humans)
• Billions of images indexed by Google Image Search
• 18 billion+ prints produced from digital camera images in 2004
• 295.5 million camera phones sold in 2005
• About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]

• Ok, vision is very challenging…
• Yet also active research area with exciting progress!
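
To make the "3D → 2D" point from the "Why is vision difficult?" slide concrete, here is a minimal Python sketch (the course itself uses Matlab). It assumes an idealized pinhole camera, which the slide does not specify; the function name `project` and the `focal_length` parameter are illustrative only. Two different 3D points on the same viewing ray land on the same image point, so depth cannot be recovered from a single projection.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to 2D (x, y).

    Assumption: idealized pinhole model, x = f*X/Z and y = f*Y/Z.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

p_near = project(near)
p_far = project(far)

# Both 3D points map to the same 2D location: the depth information is lost.
print(p_near, p_far, p_near == p_far)
```

Any scale factor along the ray gives the same image point, which is one way to see why image formation cannot be literally inverted without extra constraints.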

Brainstorm
1. What functionality should the system have?
2. Intuitively, what are the technical sub-problems that must be solved?

Goals of this course
• Upper division undergrad course
• Introduction to primary topics
• Hands-on experience with algorithms
• Views of vision as a research area

Topics overview
• Features & filters
• Grouping & fitting
• Multiple views and motion
• Recognition
• Video processing
Focus is on algorithms, rather than specific systems.

Features and filters
Transforming and describing images; textures, colors, edges

Grouping & fitting
Clustering, segmentation, fitting; what parts belong together?

Multiple views and motion
Multi-view geometry, matching, invariant features, stereo vision

[Figure credits for the two slides above: Lowe; Hartley and Zisserman; Shi et al.; Fei-Fei Li]

Recognition and learning
Recognizing objects and categories, learning techniques

Video processing
Tracking objects, video analysis, low level motion, optical flow
[Figure: Tomas Izo]

Textbooks
• Recommended book:
– Computer Vision: Algorithms and Applications
– By Rick Szeliski
– http://szeliski.org/Book/
• And others on reserve at PCL

Requirements / Grading
• Problem sets (50%)
• Midterm exam (20%)
• Final exam (20%)
• Class participation, including attendance (10%)
– A quote from a student evaluation: “To be honest, I think without going to class, the course would be very hard.”

Problem sets
• Some short answer concept questions
• Programming problem
– Implementation
– Explanation, results
• Code in Matlab
– available on CS Unix machines (see course page)
• These assignments are substantial.
• They will take significant time to do.
• Start early.

Matlab
• Built-in toolboxes for low-level image processing, visualization
• Compact programs
• Intuitive interactive debugging
• Widely used in engineering

Pset 0
• Pset 0: Matlab warmup + basic image manipulation
• Out Fri Jan 21, Due Fri Jan 28
• Verify CS account and Matlab access
• Look at the tutorial online

Digital images
Images as matrices

Digital images
• Width 520, height 500
• Indices start at i=1, j=1
• Intensity: [0,255]
• im[176][201] has value 164
• im[194][203] has value 37

Color images, RGB color space
[Figure: R, G, B channels]

Preview of some problem sets
• Grouping
• Image mosaics / stitching
Image from Fei-Fei Li
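
The "images as matrices" slides above can be sketched in a few lines. This is an illustrative Python version using plain nested lists (the course assignments use Matlab). It assumes the slide's i=1, j=1 origin means 1-based (row, column) indexing. The helpers `set_pixel` and `get_pixel` are hypothetical names, and the image is blank apart from the two intensities quoted on the slide.

```python
HEIGHT, WIDTH = 500, 520  # rows and columns, as on the slide

# Grayscale image: HEIGHT rows of WIDTH intensities, each in [0, 255].
im = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]


def set_pixel(image, i, j, value):
    """Store an intensity at row i, column j (1-based; an assumption)."""
    if not 0 <= value <= 255:
        raise ValueError("intensity must be in [0, 255]")
    image[i - 1][j - 1] = value


def get_pixel(image, i, j):
    """Read the intensity at row i, column j (1-based; an assumption)."""
    return image[i - 1][j - 1]


# The two example values from the slide.
set_pixel(im, 176, 201, 164)
set_pixel(im, 194, 203, 37)
print(get_pixel(im, 176, 201), get_pixel(im, 194, 203))

# Color image: one (R, G, B) triple per pixel, each channel in [0, 255].
rgb = [[(0, 0, 0) for _ in range(WIDTH)] for _ in range(HEIGHT)]
rgb[0][0] = (255, 0, 0)  # top-left pixel set to pure red
```

In Matlab the same idea is a HEIGHT x WIDTH array for grayscale and a HEIGHT x WIDTH x 3 array for color, indexed from 1.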

Preview of some problem sets
• Object search and recognition
• Tracking, activity recognition

Collaboration policy
All responses and code must be written individually.
Students submitting answers or code found to be identical or substantially similar (due to inappropriate collaboration) risk failing the course.

Assignment deadlines
• Assignments in by 11:59 PM on due date
– Follow submission instructions given in assignment regarding hardcopy/electronic.
– Deadlines are firm. We’ll use turnin timestamp.
• 3 free late days, total for the term.
• Use as you want, but note that the first two assignments are lighter than the rest.
• If your program doesn’t work, clean up the code, comment it well, explain what you have, and still submit.

Miscellaneous
• Check class website regularly
• We’ll use Blackboard to send email
• No laptops, phones, etc. open in class please.
• Use our office hours!

Coming up
• Now: check out Matlab tutorial online
• Friday 21st: Pset 0 out
• Monday 24th: first lecture on linear filters
• Friday 28th: Pset 0 due