

1. Human-Robot Interaction Elective in Artificial Intelligence
Lecture 7 – RGBD Perception
Luca Iocchi, DIAG, Sapienza University of Rome, Italy
With contributions from A. Youssef and M.T. Lazaro

Outline
• RGBD sensors and applications
• Detectors for people detection
• RGB processing / OpenCV
• Example 1: RGB face/body detection
• ROS
• Depth processing
• Example 2: RGBD face detection / depth segmentation / virtual buttons
• Conclusions

2. Depth cameras
• Color (RGB) + Depth (D) information
• Improve efficiency and robustness of image processing
• Mostly used in video games, but useful also in HCI and HRI

Stereo vision

3. Stereo triangulation
Z = b · f / (u_L − u_R)

Active RGBD cameras
• Capture color and depth
• Active infra-red light pattern
• Work with poor/no texture
• Depth computation: stereo triangulation or time of flight
• Indoor (dark) environments

4. Active infra-red pattern
http://wiki.ros.org/kinect_calibration/technical

RGBD Vision
Kinect and RGBD Images: Challenges and Applications. Luiz Velho, IMPA

5. RGBD Vision
http://robotics.ait.kyushu-u.ac.jp/~kurazume/

Point Clouds
http://pointclouds.org/documentation/tutorials/ground_based_rgbd_people_detection.php

6. Depth resolution
Krystof Litomisky, Consumer RGB-D Cameras and their Applications

Application: 3D mapping
Krystof Litomisky, Consumer RGB-D Cameras and their Applications

7. Application: skeleton tracking
SkeletalViewer, Microsoft Kinect SDK

Application: augmented reality
SkeletalViewer, Microsoft Kinect SDK

8. RGBD image processing

Efficiency and robustness
[Pipeline diagram] Image processing: depth filtering → segmentation → output.
Depth is used to select the region of interest and to remove false positives.

9. Software libraries
[Stack diagram] Linux image: your acquisition app (C++) on top of ROS + drivers + libraries (OpenCV).

Installation
• ROS (includes OpenCV): www.ros.org
• OpenNI2: https://github.com/occipital/openni2
• thin_drivers: https://bitbucket.org/ggrisetti/thin_drivers
Note: a complete image for Raspberry Pi 3 is available!

10. Introduction to OpenCV
OpenCV (Open Source Computer Vision) is a library of programming functions for real-time computer vision.
• BSD licensed: free for commercial use
• C++, C, Python and Java (Android) interfaces
• Supports Windows, Linux, Android, iOS and Mac OS
• More than 2500 optimized algorithms
http://opencv.org/
L. Iocchi - Human-Robot Interaction

Introduction to OpenCV
Modules for image processing:
• core – a compact module defining basic data structures, including the dense multi-dimensional array Mat, and basic functions used by all other modules.
• imgproc – an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
• features2d – salient feature detectors, descriptors, and descriptor matchers.
• highgui – an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.

11. Introduction to OpenCV
How to include modules:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/features2d/features2d.hpp>

Data types
Set of primitive data types the library can operate on:
• uchar: 8-bit unsigned integer
• schar: 8-bit signed integer
• ushort: 16-bit unsigned integer
• short: 16-bit signed integer
• int: 32-bit signed integer
• float: 32-bit floating-point number
• double: 64-bit floating-point number

Introduction to OpenCV
Image representation in OpenCV: cv::Mat is an n-dimensional array.
http://docs.opencv.org/modules/core/doc/basic_structures.html#mat
gray scale vs color image

12. Introduction to OpenCV
Mat (header vs data)

Mat A, C;            // creates just the header parts
A = imread(argv[1], CV_LOAD_IMAGE_COLOR); // here the matrix data is allocated
Mat B(A);            // copy constructor (copy by reference)
C = A;               // assignment operator (copy by reference)
Mat D = A.clone();   // creates a new matrix D with data copied from A
Mat E;               // creates the header for E with no data
A.copyTo(E);         // sets the data for E (copied from A)

MATLAB-style initializers

Mat E = Mat::eye(4, 4, CV_64F);
cout << "E = " << endl << " " << E << endl << endl;
Mat Z = Mat::zeros(3, 3, CV_8UC1);
cout << "Z = " << endl << " " << Z << endl << endl;
Mat O = Mat::ones(2, 2, CV_32F);
cout << "O = " << endl << " " << O << endl << endl;

13. Introduction to OpenCV
How to scan gray scale images:

cv::Mat I = ...
for( int i = 0; i < I.rows; ++i) {
  for( int j = 0; j < I.cols; ++j) {
    uchar g = I.at<uchar>(i,j);
    ...
  }
}

How to scan RGB images:

cv::Mat I = ...
for( int i = 0; i < I.rows; ++i) {
  for( int j = 0; j < I.cols; ++j) {
    uchar blue  = I.at<cv::Vec3b>(i,j)[0];
    uchar green = I.at<cv::Vec3b>(i,j)[1];
    uchar red   = I.at<cv::Vec3b>(i,j)[2];
  }
}

Full body detection in images
Histogram of Oriented Gradients (HOG)
• Introduced by Navneet Dalal and Bill Triggs in 2005 [1]
• Sliding window technique for people detection in images
• Captures shape and appearance
• HOG is a feature descriptor:
  ➢ Dense feature extraction
  ➢ Local overlapping blocks
  ➢ Trained classifier (Support Vector Machine, SVM)

14. Full body detection in images
Histogram of Oriented Gradients (HOG)
• Gradient computation
• Orientation binning
• Descriptor blocks
• Normalization

C++
void HOGDescriptor::detectMultiScale(const Mat& img, vector<Rect>& found_locations,
    double hit_threshold=0, Size win_stride=Size(), Size padding=Size(),
    double scale0=1.05, int group_threshold=2)

HOG in OpenCV

#include <opencv2/objdetect/objdetect.hpp>

HOGDescriptor hog; // standard descriptor
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());
vector<Rect> found; // where to save the detected persons
hog.detectMultiScale(img, found, 0, Size(8,8), Size(32,32), 1.05, 2);

http://mccormickml.com/2013/05/09/hog-person-detector-tutorial/

15. Face detection in images
Viola-Jones implementation in the OpenCV library (using Haar-like cascades).
OpenCV comes with a trainer as well as a detector.
OpenCV already contains many pre-trained classifiers for faces, eyes, smiles, etc.

Cascade classifier in OpenCV

C++
void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects,
    double scaleFactor=1.1, int minNeighbors=3, int flags=0,
    Size minSize=Size(), Size maxSize=Size())

#include <opencv2/objdetect/objdetect.hpp>

String face_cascade_trained = "haarcascade_frontalface_alt.xml";
CascadeClassifier face_cascade;
face_cascade.load( face_cascade_trained );
vector<Rect> faces;
face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2,
    0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );

16. Image-to-world conversion
• u = (x_u, y_u, 1): homogeneous vector of a pixel in image coordinates
• P: perspective projection matrix
• M = (X_w, Y_w, Z_w, 1): homogeneous vector of real-world coordinates
• u ∝ P · M, with intrinsic matrix [ f_x, 0, c_x ; 0, f_y, c_y ; 0, 0, 1 ]

Camera model:
/camera/depth/camera_info
/camera/rgb/camera_info

Image-to-world conversion
Using the depth camera intrinsics, each pixel (x_d, y_d) of the depth camera can be projected to metric 3D space using the following formulas:

P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)

with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera. We can then re-project each 3D point onto the color image and get its color:

P3D' = R.P3D + T
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb

with R and T the rotation and translation parameters estimated during the stereo calibration.

17. Robot Operating System (ROS)
• ROS is a middleware for efficient data exchange
• Based on publish/subscribe and event-based paradigms
• Topics are identified by a name and a type of message
• Each node can publish topics and subscribe to topics
• All nodes subscribed to a topic are notified when a node publishes data on that topic
www.ros.org

ROS publish/subscribe
In our examples, thin_drivers nodes publish data (i.e., images) on topics whenever they are captured from the devices. Application nodes subscribe to these topics and are notified when images are ready. Application nodes are implemented as callback functions activated upon arrival of data on a topic.
