6.869 projects, continued

Projects due Thursday, May 12 (3 weeks from today). The write-up should have an introduction, where you explain why the reader should be interested in the problem, and frame the problem in context.

On that day, you'll give us a 5-minute, informal presentation about your project. This is to have fun, to see what other people did, and to do something different on the last day of class (we'll have refreshments). It will also help me and Xiaoxu see an overview of your project before we read your write-up.

The write-up of the project is the main thing. It should be about the length and style of a conference paper submission: about 6 to 8 double-column, single-spaced pages.

For a presentation and papers on writing conference papers, see the Weds, April 10, 2002 lecture and readings on this course web page:
http://www.ai.mit.edu/courses/6.899/doneClasses.html

Today: Cameras looking at, and tracking, people

Next week: a field trip to a guest lecture by Prof. Dan Huttenlocher, from Cornell
A mini-application lecture: under controlled conditions (not general conditions), what human-interaction applications can you build with the tools we've developed so far? To be compared with: the more sophisticated detection and classification methods that we've studied, and the tracking tools that we'll study next.

Guest lecture: Graphical Models for Object Recognition
Kiva 32-G449, Tuesday, April 26, 2005, 3-4pm, refreshments at 2:45. I'll come down here at 2:30 to remind anyone who forgets the one-time shift in class location.

MIT 6.869
April 21, 2005

Yesterday's tomorrow: computer vision still needs to become more robust.
Elektro and Sparko, New York World's Fair, 1939 (Westinghouse Historical Collection)
Pavlovic, Rehg, Cham, and Murphy, Intl. Conf. Computer Vision, 1999
But we can fake it with clever system design

Research at MERL on fast, low-cost vision systems.
From MERL and Mitsubishi Electric: David Anderson, Paul Beardsley, Chris Dodge, William Freeman, Hiroshi Kage, Kazuo Kyuma, Darren Leigh, Neal McKenzie, Yasunari Miyake, Michal Roth, Ken-ichi Tanaka, Craig Weissman, William Yerazunis

M. Krueger, "Artificial Reality", Addison-Wesley, 1983.

Existing interface devices are fast & low-cost. The hope for a computer vision based interface: video input will give a more expressive, natural, or engaging interface.

Applications make the vision easier:
- There is a human in the loop.
- Constraints simplify recognition: if you know where the tracks are, it's easy to guess where the train is.
- Rich, immediate visual and audio feedback.
- The player can correct for algorithm imperfections.
Computer vision algorithms as ocean-going vessels

1. Selected appliance: television
The television market motivated this work: ~1 billion television sets.

Survey results:
"What high technology gadget has improved the quality of your life the most?"
What two things were mentioned most? Microwave ovens and TV remote controls.
-- Porter/Novelli survey, 1995

Message: people value the ability to control a television from a distance.
Control of television set from a distance:
- Wired remote control.
- Infra-red remote control.
- Voice control.
- Gesture control.

Design constraints:
- From the user's point of view.
- From the computer's point of view.

From the user's point of view: do complex commands ("mute", volume) require complicated gestures?
From the computer's point of view: the living room scene is difficult. How can the computer find the hand, and recognize its gesture, in this complicated, unpredictable visual scene?

Our solution: exploit the visual feedback from the television.

Hand recognition method: template matching. Examine the squared difference between (a) pixel values in the hand template, and (b) pixel values in a square centered at each possible position in the image. (Figure: template and image; television and user.)
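The squared-difference search described above can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the MERL implementation; the function name and toy data are my own.

```python
import numpy as np

def ssd_match(image, template):
    """Slide the template over the image and return the (row, col) where the
    sum of squared differences (SSD) between template and patch is smallest."""
    ih, iw = image.shape
    th, tw = template.shape
    best_pos, best_ssd = (0, 0), np.inf
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = image[r:r + th, c:c + tw]
            ssd = np.sum((patch - template) ** 2)
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (r, c)
    return best_pos

# Toy example: a bright 2x2 "hand" embedded in a dark image.
img = np.zeros((8, 8))
img[3:5, 4:6] = 1.0
tmpl = np.ones((2, 2))
print(ssd_match(img, tmpl))  # (3, 4)
```

A real system would search only near the previous hand position (the tracking constraint mentioned earlier) rather than the whole frame.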
Hand recognition method: normalized correlation.

Normalized correlation:
r = (a · b) / sqrt((a · a)(b · b)),
where a and b are vectors from rasterized patches of the template and the image.

Background removal: keep a running average of the background, blending the previous average with each new frame using weights α and (1 - α), and remove it from the current image.

Processing block diagram: raw video (RGB, 24-bit) → image representation → remove background (running average) → correlation against the template (template creation and editing) → Kalman filter position tracking → trigger gesture → on-screen controls → remote control commands → TV.

Prototype of television controlled by hand signals, with controls overlaid on the TV screen.
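The two per-frame computations above, normalized correlation between patch vectors and the running-average background model, can be sketched as follows. This is an illustration under my own naming and parameter choices, not the prototype's code.

```python
import numpy as np

def normalized_correlation(a, b):
    """r = (a . b) / sqrt((a . a)(b . b)) for rasterized patch vectors a, b.
    Equals 1 for identical patterns regardless of an overall brightness scale."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    return a.dot(b) / np.sqrt(a.dot(a) * b.dot(b))

def update_background(avg, frame, alpha=0.95):
    """Running-average background model: blend the previous average with the
    new frame using weights alpha and (1 - alpha)."""
    return alpha * avg + (1 - alpha) * frame

# Scaling a patch by 2 leaves the normalized correlation at 1.
patch = np.array([[1.0, 2.0], [3.0, 4.0]])
print(normalized_correlation(patch, 2 * patch))  # 1.0
```

Unlike the raw SSD score, this measure is invariant to a multiplicative brightness change, which is one reason to prefer it under varying living-room lighting.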
TV control video.

Prototype limitations:
- Distance from camera: 6-10 feet.
- Field of view: 25° for tracking, 15° for the trigger gesture.
- Coupling to the television is loose.
- Two screens instead of one.
- Robustness during operation: no template adaptation to different users; background removal may need variable contrast control.

Product hardware requirements:
- Short term: camera, video digitizer, computer.
- Long term: TVs / computers / browsers will have cameras and powerful computers, so this becomes a software product.

2. Simple gesture recognition
Real-time hand gesture recognition method by orientation histograms: convert each training-set image to a signature vector, then compare the test image's signature against the training set.
Orientation measurements (bottom) are more robust to lighting changes than are pixel intensities (top).

(Figures: images, orientation images, and orientation histograms for the training set; a test image, and its distances from each of the training-set orientation histograms, categorized correctly.)

Crane movements controlled by hand gestures.
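The signature-and-compare scheme above can be sketched as follows: histogram the gradient orientations of the image, then classify a test image by its nearest training-set signature. This is a sketch of the general technique, not the exact method from the lecture; the bin count, magnitude weighting, and Euclidean distance are my assumptions.

```python
import numpy as np

def orientation_histogram(img, n_bins=8):
    """Signature vector for a grayscale image: a histogram of gradient
    orientations, weighted by gradient magnitude and normalized to sum to 1."""
    gy, gx = np.gradient(img.astype(float))
    angles = np.arctan2(gy, gx)          # orientation in [-pi, pi]
    mags = np.hypot(gx, gy)              # gradient magnitude as weight
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi),
                           weights=mags)
    total = hist.sum()
    return hist / total if total > 0 else hist

def classify(test_img, training):
    """Return the label of the training image whose orientation-histogram
    signature is closest (Euclidean distance) to the test image's."""
    sig = orientation_histogram(test_img)
    dists = {label: np.linalg.norm(sig - orientation_histogram(img))
             for label, img in training.items()}
    return min(dists, key=dists.get)

# Toy training set: one vertical-edge and one horizontal-edge image.
vert = np.zeros((10, 10)); vert[:, 5:] = 1.0
horiz = np.zeros((10, 10)); horiz[5:, :] = 1.0
training = {"vertical": vert, "horizontal": horiz}

test = np.zeros((10, 10)); test[:, 4:] = 1.0   # shifted vertical edge
print(classify(test, training))  # vertical
```

Because the signature depends on edge orientations rather than raw intensities, a uniform brightening of the test image leaves the histogram essentially unchanged, which is the robustness property the slide claims.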
Video: Janken (rock-paper-scissors) game.

3. Computer vision for computer games
Games add fun and purpose: "Get the sprite through the golden rings."
Games were selected to suit the vision interface.
Field test results from Disney's VR Aladdin: "Guests cared about the experience, not the technology."