Computer Vision
Thursday, August 30

Today
• Course overview
• Requirements, logistics
• Image formation

Introductions
• Instructor: Prof. Kristen Grauman, grauman @ cs, TAY 4.118, Thurs 2-4 pm
• TA: Sudheendra Vijayanarasimhan, svnaras @ cs, ENS 31 NQ, Mon/Wed 1-2 pm
• Class page: Check for updates to schedule, assignments, etc.
  http://www.cs.utexas.edu/~grauman/courses/378/main.htm

Computer vision
• Automatic understanding of images and video
• Computing properties of the 3D world from visual data
• Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities

Why vision?
• As image sources multiply, so do applications
  – Relieve humans of boring, easy tasks
  – Enhance human abilities
  – Advance human-computer interaction, visualization
  – Perception for robotics / autonomous agents
• Possible insights into human vision
Some applications
• Navigation, driver safety
• Autonomous robots
• Factory inspection (Cognex)
• License plate reading and tracking
• Visual effects (the Matrix)

Some applications
• Assistive technology
• Surveillance
• Monitoring for safety (Poseidon)
• Visualization
• Medical imaging

Some applications
• Situated search
• Multi-modal interfaces
• Image and video databases - CBIR
• Tracking, activity recognition

Why is vision difficult?
• Ill-posed problem: real world much more complex than what we can measure in images
  – 3D → 2D
• Impossible to literally "invert" image formation process

Challenges: robustness
• Illumination, object pose, clutter, occlusions, viewpoint, intra-class appearance

Challenges: context and human experience
• Context cues, function, dynamics
Challenges: complexity
• Thousands to millions of pixels in an image
• 3,000-30,000 human recognizable object categories
• 30+ degrees of freedom in the pose of articulated objects (humans)
• Billions of images indexed by Google Image Search
• 18 billion+ prints produced from digital camera images in 2004
• 295.5 million camera phones sold in 2005
• About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]

Why is vision difficult?
• Ill-posed problem: real world much more complex than what we can measure in images
  – 3D → 2D
• Not possible to "invert" image formation process
• Generally requires assumptions, constraints; exploitation of domain-specific knowledge

Vision and graphics
• Vision: Images → Model; Graphics: Model → Images
• Inverse problems: analysis and synthesis

Related disciplines
• Computer vision draws on artificial intelligence, pattern recognition, geometry and physics, image processing, cognitive science, and algorithms

Research problems vs. application areas
• Research problems: feature detection, contour representation, segmentation, stereo vision, shape modeling, color vision, motion analysis, invariants, uncalibrated and self-calibrating systems, object detection, object recognition
• Application areas: industrial inspection and quality control, reverse engineering, surveillance and security, face and gesture recognition, road monitoring, autonomous vehicles, military applications, medical image analysis, image databases, virtual reality
List from [Trucco & Verri 1998]

Goals of this course
• Introduction to primary topics
• Hands-on experience with algorithms
• Views of vision as a research area
Topics overview
• Image formation, cameras
• Color
• Features
• Grouping
• Multiple views
• Recognition and learning
• Motion and tracking

We will not cover (extensively)
• Image processing
• Human visual system
• Particular machine vision systems or applications

Image formation
• Inverse process of vision: how does light in the 3D world project to form 2D images?

Features and filters
• Transforming and describing images; textures and colors
[Figure credits: Lowe; fig from Shi et al.]

Grouping
• Clustering, segmentation, fitting; what parts belong together?
[Figure credit: Tomasi and Kanade]

Multiple views
• Multi-view geometry and matching, stereo
[Figure credit: Hartley and Zisserman]
Recognition and learning
• Shape matching, recognizing objects and categories, learning techniques

Motion and tracking
• Tracking objects, video analysis, low level motion
[Figure credit: Tomas Izo]

Requirements
• Biweekly (approx) problem sets
  – Concept questions
  – Implementation problems
• Two exams, midterm and final
• Current events (optional)
In addition, for graduate students:
• Research paper summary and review
• Implementation extension

Grading policy
Final grade breakdown:
• Problem sets (50%)
• Midterm quiz (15%)
• Final exam (20%)
• Class participation (15%)

Due dates
• Assignments due before class starts on due date
• Lose half of possible remaining credit each day late
• Three free late days, total
Collaboration policy
You are welcome to discuss problem sets, but all responses and code must be written individually.
Students submitting solutions found to be identical or substantially similar (due to inappropriate collaboration) risk failing the course.

Current events (optional)
• Any vision-related piece of news; may revolve around policy, editorial, technology, new product, …
• Brief overview to the class
• Must be current
• No ads
• Email relevant links or information to TA

Miscellaneous
• Check class website
• Make sure you get on class mailing list
• No laptops in class please
• Feedback welcome and useful

Paper review guidelines
• Thorough summary in your own words
• Main contribution
• Strengths? Weaknesses?
• How convincing are the experiments? Suggestions to improve them?
• Extensions?
• 4 pages max
• May require reading additional references

Image formation
• How are objects in the world captured in an image?
Physical parameters of image formation
• Photometric
  – Type, direction, intensity of light reaching sensor
  – Surfaces' reflectance properties
• Optical
  – Sensor's lens type
  – Focal length, field of view, aperture
• Geometric
  – Type of projection
  – Camera pose
  – Perspective distortions

Radiometry
• Images formed depend on amount of light from light sources and surface reflectance properties (see F&P Ch 4)
[Figures: light source direction (image credit: Don Deering); surface reflectance properties, specular vs. Lambertian (fig from Fleming, Torralba, & Adelson, 2004)]

Perspective projection
• Pinhole camera: simple model to approximate the imaging process
• If we treat the pinhole as a point, only one ray from any given point can enter the camera
[Forsyth and Ponce]

Camera obscura
• In Latin, means 'dark room'
• "Reinerus Gemma-Frisius, observed an eclipse of the sun at Louvain on January 24, 1544, and later he used this illustration of the event in his book De Radio Astronomica et Geometrica, 1545. It is thought to be the first published illustration of a camera obscura..."
  Hammond, John H., The Camera Obscura, A Chronicle
  http://www.acmi.net.au/AIC/CAMERA_OBSCURA.html
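Note on the radiometry and reflectance slides above: the slides only name specular and Lambertian reflectance. As a reference sketch (not from the original slides), the standard diffuse term and a Phong-style specular term are often written, for surface normal n, light direction l, viewing direction v, and mirror reflection r of l, as

  I_Lambertian = rho_d * max(0, n · l)
  I_specular   = rho_s * max(0, v · r)^alpha

The Lambertian term depends only on the angle between the surface normal and the light, which is why matte surfaces look the same from all viewpoints; the specular term concentrates reflected light around the mirror direction.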
Camera obscura
• Jetty at Margate England, 1898; an attraction in the late 19th century
• Around 1870s
  http://brightbytes.com/cosite/collection2.html
  Adapted from R. Duraiswami

Perspective effects
• Far away objects appear smaller
[Forsyth and Ponce]

Perspective effects
• Parallel lines in the scene intersect in the image
[Forsyth and Ponce]

Perspective projection equations
• 3D world mapped to 2D projection in the image plane (see board)
[Figure: image plane, focal length, optical axis, camera frame; Forsyth and Ponce]

Perspective projection equations
• Scene point → image coordinates
• Non-linear
[Figure: image plane, focal length, optical axis, camera frame; Forsyth and Ponce]

Projection properties
• Many-to-one: any points along the same ray map to the same point in the image
• Points → points
• Lines → lines (collinearity preserved)
• Distances and angles are not preserved
• Degenerate cases:
  – Line through focal point projects to a point
  – Plane through focal point projects to a line
  – Plane perpendicular to image plane projects to part of the image
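The projection equations themselves appear only as figures in the original slides. For reference, the standard pinhole (perspective) form, with the camera frame origin at the pinhole, focal length f, scene point (X, Y, Z), and image point (x', y'), is

  x' = f X / Z,    y' = f Y / Z

The division by the point's own depth Z is what makes the mapping non-linear and many-to-one, consistent with the projection properties listed above.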
Perspective and art
• Use of correct perspective projection indicated in 1st century B.C. frescoes
• Skill resurfaces in Renaissance: artists develop systematic methods to determine perspective projection (around 1480-1515)
[Raphael; Durer, 1525]

Weak perspective
• Approximation: treat magnification as constant
• Assumes scene depth << average distance to camera
• Makes perspective equations linear
[Figure: image plane, world points]

Orthographic projection
• Given camera at constant distance from scene
• World points projected along rays parallel to the optical axis
• Limit of perspective projection as …
[Figure: planar pinhole perspective vs. orthographic projection; from M. Pollefeys]

Which projection model?
• Weak perspective:
  – Accurate for small, distant objects; recognition
  – Linear projection equations - simplifies math
• Pinhole perspective:
  – More accurate but more complex
  – Structure from motion
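The weak perspective and orthographic models are shown pictorially in the slides. As a reference sketch (not extracted from the slides), their usual equation forms are, with reference depth Z_0 and magnification m:

  Weak perspective:  x' = m X,  y' = m Y,  where m = f / Z_0
  Orthographic:      x' = X,    y' = Y

A small numerical sketch in Python (hypothetical values, not from the course materials) illustrates the trade-off on the "Which projection model?" slide: when depth variation is small relative to the distance to the camera, weak perspective closely approximates the full pinhole model while staying linear.

  import numpy as np

  def pinhole(points, f):
      """Pinhole perspective: divide by each point's own depth Z."""
      X, Y, Z = points[:, 0], points[:, 1], points[:, 2]
      return np.stack([f * X / Z, f * Y / Z], axis=1)

  def weak_perspective(points, f, z0):
      """Weak perspective: one constant magnification m = f / z0 for all points."""
      m = f / z0
      return m * points[:, :2]

  # Shallow scene: depth spread (9.8 to 10.2) is small compared to distance z0 = 10
  pts = np.array([[1.0, 0.5, 9.8],
                  [1.0, 0.5, 10.2]])
  print(pinhole(pts, f=1.0))                     # per-point depths give slightly different image points
  print(weak_perspective(pts, f=1.0, z0=10.0))   # both points map to the same image point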