Robotics

AI Slides (6e) © Lin Zuoquan@PKU 1998-2020


  1. Robotics (Chapter 13)

  2. Robotics: 13.1 Robots, 13.2 Computer vision, 13.3 Motion planning, 13.4 Controller

  3. Robots
     Robots are physical agents that perform tasks by manipulating the physical world
     Wide application: industry, agriculture, transportation, health, environments, exploration, personal services, entertainment, human augmentation, and so on
     ⇐ Robotic age ⇒ Intelligent robots

  4. Types of robots
     • Manipulators: physically anchored to their workplace, e.g., a factory assembly line, the International Space Station
     • Mobile robots: move about their environment
       – Unmanned ground vehicles (UGVs), e.g., planetary rovers (on Mars), intelligent vehicles
       – Unmanned air vehicles (UAVs), i.e., drones
       – Autonomous underwater vehicles (AUVs)
       – Autonomous flight units
     • Mobile manipulators: combine mobility with manipulation
       – Humanoid robots: mimic the human torso
     Others: prosthetic devices (e.g., artificial limbs), intelligent environments (e.g., a house equipped with sensors and effectors), multibody systems (swarms of small cooperating robots)

  5. Hardware
     A diverse set of robot hardware comes from interdisciplinary technologies
     – Processors (controllers)
     – Sensors
     – Effectors
     – Manipulators

  6. Sensors
     Passive sensors or active sensors
     – Range finders: sonar (land, underwater), laser range finders, radar (aircraft), tactile sensors, GPS
     – Imaging sensors: cameras (visual, infrared)
     – Proprioceptive sensors: shaft decoders (joints, wheels), inertial sensors, force sensors, torque sensors

  7. Manipulators
     (Figure: a robot arm with one prismatic joint P and five revolute joints R)
     The configuration of the robot is specified by 6 numbers ⇒ 6 degrees of freedom (DOF)
     6 is the minimum number required to position the end-effector arbitrarily
     For dynamical systems, add a velocity for each DOF
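
     Below is a minimal sketch (not from the slides; the 2-joint planar arm and link lengths l1, l2 are assumed for illustration) of how a configuration, here two joint angles, determines the end-effector position via forward kinematics:

         import math

         def forward_kinematics(theta1, theta2, l1=1.0, l2=0.8):
             """End-effector (x, y) of a hypothetical 2-DOF planar arm."""
             x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
             y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
             return x, y

         print(forward_kinematics(math.pi / 4, math.pi / 6))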

  8. Non-holonomic robots
     (Figure: a car with position (x, y) and heading θ)
     A car has more DOF (3) than controls (2), so it is non-holonomic; it cannot in general transition between two infinitesimally close configurations
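
     A minimal sketch (assumptions: kinematic bicycle model, Euler integration, wheelbase L; not from the slides) showing the constraint in code: three pose variables (x, y, theta) evolve under only two controls (speed v, steering angle phi):

         import math

         def step(x, y, theta, v, phi, L=2.5, dt=0.1):
             """Advance the car's pose by one Euler step; 3 DOF, 2 controls."""
             x += v * math.cos(theta) * dt
             y += v * math.sin(theta) * dt
             theta += (v / L) * math.tan(phi) * dt
             return x, y, theta

         pose = (0.0, 0.0, 0.0)
         for _ in range(50):                    # drive forward while turning left
             pose = step(*pose, v=1.0, phi=0.2)
         print(pose)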

  9. Software
     Pipeline architecture: execute multiple processes in parallel (see the sketch below)
     – sensor interface layer
     – perception layer
     – planning and control layer
     – vehicle interface layer
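
     A toy sketch of this pipeline (all function names and data are assumed for illustration; a real system runs the layers as parallel processes exchanging messages):

         def sensor_interface(raw):
             return {"scan": raw}                  # buffer timestamped raw measurements

         def perception(msg):
             return {"nearest": min(msg["scan"])}  # toy "map": distance to nearest obstacle

         def plan_and_control(world):
             brake = world["nearest"] < 1.0        # slow down when too close
             return {"throttle": 0.0 if brake else 0.3, "steer": 0.0}

         def vehicle_interface(cmd):
             print("actuate:", cmd)                # would talk to the drive-by-wire bus

         for raw in [[3.2, 4.1], [0.8, 2.5]]:      # fake laser range readings
             vehicle_interface(plan_and_control(perception(sensor_interface(raw))))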

  10. Example: a robot car
      (Figure: software architecture of an autonomous car. Sensor interface layer: laser interfaces 1-5, camera, radar, GPS position, GPS compass, IMU, and wheel-velocity interfaces. Perception layer: road finder, laser/vision/radar mappers, UKF pose estimation, surface assessment. Planning & control layer: top-level control, path planner, steering control, throttle/brake control. User interface: touch screen UI, wireless E-stop. Vehicle interface: Touareg interface, power server interface. Global services: process controller, health monitor, data logger, file system, communication channels, inter-process communication (IPC) server, time server. Data flowing between layers includes the RDDF corridor, laser/vision maps, obstacle lists, vehicle state (pose, velocity), velocity limits, and heartbeats)

  11. Computer vision
      (Figure: the vision pipeline. Low-level vision: filters, edge detection, and segmentation over an image sequence, producing features such as edges, regions, disparity, and optical flow. Shape-from-X modules: stereo, motion, contour, shading, producing a depth map. High-level vision: object recognition, tracking, and data association, producing objects, scenes, and behaviors)
      Vision requires combining multiple cues and commonsense knowledge

  12. Visual recognition
      Computer vision
      – visual recognition
        – image classification ⇐ (deep) learning
          – object detection, segmentation, image captioning, etc. ⇐ 2D → 3D
      Deep learning has become an important tool for computer vision, e.g., CNNs such as PixelCNN
      E.g., ImageNet: the large-scale visual recognition challenge

  13. Images
      (Figure: example images)

  14. Images
      (Figure: a grayscale image patch shown as its grid of pixel intensity values, e.g., 195 209 221 ...)
      I(x, y, t) is the intensity at (x, y) at time t
      CCD camera ≈ 1,000,000 pixels; human eyes ≈ 240,000,000 pixels, i.e., 0.25 terabits/sec
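
      A minimal sketch (assuming NumPy; the frame contents are random stand-ins) of treating I(x, y, t) as plain array indexing:

          import numpy as np

          frames = np.random.randint(0, 256, size=(2, 12, 12), dtype=np.uint8)  # 2 fake frames
          t, x, y = 0, 3, 7
          print(frames[t, y, x])     # intensity I(x, y, t); rows index y
          print(frames[t].mean())    # average brightness of frame t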

  15. Image classification
      CNNs give the state-of-the-art results (a minimal sketch follows)
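
      A minimal sketch, assuming PyTorch, of a tiny CNN classifier; actual state-of-the-art models (e.g., ResNets) are far deeper:

          import torch
          import torch.nn as nn

          class TinyCNN(nn.Module):
              def __init__(self, num_classes=10):
                  super().__init__()
                  self.features = nn.Sequential(
                      nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                      nn.MaxPool2d(2),                  # 32x32 -> 16x16
                      nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                      nn.MaxPool2d(2),                  # 16x16 -> 8x8
                  )
                  self.classifier = nn.Linear(32 * 8 * 8, num_classes)

              def forward(self, x):
                  x = self.features(x)
                  return self.classifier(x.flatten(1))

          logits = TinyCNN()(torch.randn(1, 3, 32, 32))   # one fake 32x32 RGB image
          print(logits.shape)                             # torch.Size([1, 10])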

  16. Perception
      Perception: the process of mapping sensor measurements into internal representations of the environment
      – sensors: noisy
      – environment: partially observable, unpredictable, dynamic
      • HMMs/DBNs can represent the transition and sensor models of a partially observable environment
      • DNNs can recognize objects in vision
      – the best internal representation is not known
      – unsupervised learning can learn sensor and motion models from data

  17. Perception
      Stimulus (percept) S, world W: S = g(W)
      E.g., g = "graphics". Can we do vision as inverse graphics, W = g⁻¹(S)?
      Problem: massive ambiguity (many different worlds can produce the same image)


  21. Localization
      Compute the current location and orientation (pose) given observations
      (Figure: a DBN with actions A_{t-2}, A_{t-1}, A_t, hidden poses X_{t-1}, X_t, X_{t+1}, and observations Z_{t-1}, Z_t, Z_{t+1})
      • Treat localization as a regression problem
      • Can be done with DNNs
      A classical filtering formulation over the same DBN is sketched below.
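
      A minimal sketch of Monte Carlo localization with a particle filter (assumptions, not from the slides: a 1-D world, one landmark at x = 5.0, Gaussian noise in the motion and sensor models):

          import math, random

          def motion(x, u):                        # sample from the motion model P(X_t | x, u)
              return x + u + random.gauss(0.0, 0.1)

          def likelihood(z, x, landmark=5.0):      # sensor model P(z | x): noisy range reading
              d = abs(landmark - x)
              return math.exp(-(z - d) ** 2 / (2 * 0.5 ** 2))

          particles = [random.uniform(0.0, 10.0) for _ in range(1000)]
          for u, z in [(1.0, 4.1), (1.0, 3.0), (1.0, 2.2)]:   # (action, observation) pairs
              particles = [motion(x, u) for x in particles]             # predict
              weights = [likelihood(z, x) for x in particles]           # weight
              particles = random.choices(particles, weights=weights,
                                         k=len(particles))              # resample
          print(sum(particles) / len(particles))   # posterior mean pose estimate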

  22. Mapping
      Localization: given the map and observed landmarks, update the pose distribution
      Mapping: given the pose and observed landmarks, update the map distribution
      SLAM (simultaneous localization and mapping): given observed landmarks, update the pose and map distributions
      Probabilistic formulation of SLAM: add the landmark locations L_1, ..., L_k to the state vector and proceed as for localization (see the sketch below)
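
      A minimal sketch (assuming NumPy; the landmark values are made up) of that augmented state vector, as used in, e.g., EKF-SLAM:

          import numpy as np

          pose = np.array([0.0, 0.0, 0.0])                   # robot x, y, theta
          landmarks = np.array([[2.0, 1.0], [4.0, -3.0]])    # L_1, L_2 positions
          state = np.concatenate([pose, landmarks.ravel()])  # joint state the filter tracks
          print(state)    # [ 0.  0.  0.  2.  1.  4. -3.]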

  23. Mapping
      (Figure: an example map built by a robot)

  24. Detection
      Edge detection in the image ⇐ discontinuities in the scene (a gradient-based detector is sketched below):
      1) depth
      2) surface orientation
      3) reflectance (surface markings)
      4) illumination (shadows, etc.)
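
      A minimal sketch (assuming NumPy and SciPy; the input is a synthetic step edge) of edge detection with the Sobel operator, which responds to exactly such intensity discontinuities:

          import numpy as np
          from scipy.ndimage import sobel

          image = np.zeros((8, 8))
          image[:, 4:] = 1.0                          # synthetic vertical step edge
          gx = sobel(image, axis=1)                   # gradient across columns
          gy = sobel(image, axis=0)                   # gradient across rows
          magnitude = np.hypot(gx, gy)
          edges = magnitude > 0.5 * magnitude.max()   # threshold into a binary edge map
          print(edges.astype(int))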

  25. Object detection
      Single object: classification + localization, e.g., single-stage object detectors: YOLO/SSD/RetinaNet
      Multiple objects: each image needs a different number of outputs
      – apply a CNN to many different crops of the image; the CNN classifies each crop as object or background (sketched below), e.g., lots of object detectors
      3D object detection: harder than 2D, e.g., requires a (simple) camera model
      Object detection + captioning = dense captioning
      Objects + relationships = scene graphs
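
      A minimal sketch of the crop-classification idea (assumptions: NumPy, a random image, and a hypothetical classify function standing in for a trained CNN):

          import numpy as np

          def classify(crop):                     # stand-in for a CNN crop classifier
              return "object" if crop.mean() > 0.5 else "background"

          image = np.random.rand(64, 64)
          detections = []
          for y in range(0, 64 - 16 + 1, 8):      # slide a 16x16 window, stride 8
              for x in range(0, 64 - 16 + 1, 8):
                  if classify(image[y:y + 16, x:x + 16]) == "object":
                      detections.append((x, y, 16, 16))
          print(len(detections), "candidate boxes")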

  26. Prior knowledge
      Shape from ...   Assumes
      motion           rigid bodies, continuous motion
      stereo           solid, contiguous, non-repeating bodies
      texture          uniform texture
      shading          uniform reflectance
      contour          minimum curvature

  27. Object recognition
      Simple idea
      – extract 3D shapes from the 2D image
      Problems
      – extracting curved surfaces from the image
      – improper segmentation, occlusion
      – unknown illumination, shadows, noise, complexity, etc.
      Approaches
      – machine learning (deep learning) methods based on image statistics

  28. Segmentation
      – Semantic segmentation: no objects, just pixels; label each pixel in the image with a category label; don't differentiate instances, only care about pixels, e.g., with a CNN (sketched below)
      – Instance segmentation: semantic segmentation + object detection → multiple objects
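
      A minimal sketch, assuming PyTorch, of semantic segmentation as per-pixel classification with a small fully convolutional network (layer sizes and the 5-class setup are made up):

          import torch
          import torch.nn as nn

          net = nn.Sequential(                       # convolutions only: resolution preserved
              nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
              nn.Conv2d(16, 5, kernel_size=1),       # 5 class scores per pixel
          )
          scores = net(torch.randn(1, 3, 32, 32))    # shape (1, 5, 32, 32)
          labels = scores.argmax(dim=1)              # one category label per pixel
          print(labels.shape)                        # torch.Size([1, 32, 32])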
