
CONCEPTS, ALGORITHMS & PRACTICAL APPLICATIONS IN 2D AND 3D COMPUTER VISION



  1. CONCEPTS, ALGORITHMS & PRACTICAL APPLICATIONS IN 2D AND 3D COMPUTER VISION
     Csaba Beleznai
     Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia Simon, Daniel Steininger, Markus Hofstätter and Andreas Kriechbaum
     Senior Scientist, Center for Vision, Automation & Control, Autonomous Systems
     AIT Austrian Institute of Technology GmbH, Vienna, Austria

  2. MOTIVATION
     ▪ Research is evolution → so is your learning process
     [Figure: RECOGNITION, SEGMENTATION and RECONSTRUCTION; axes: time, grand challenges]
     ▪ Balance: becoming a domain expert vs. being a "globalist"
     ▪ Researchers tend to favour certain paradigms
       - Learn to outline trends, look upstream
     ▪ Revisit old problems to see them in a new light
     ▪ Specialize the general & generalize the specific
     ▪ Factorize your know-how (code, topics, …) into components → sustainable, scalable

  3. VISUAL OBJECT RECOGNITION TRENDS
     [Three plots over time, 2012 to 2019]
     ▪ Accuracy, with human-level performance marked
     ▪ Computational costs, with the computational barrier for real-time use (CPU vs. dedicated computing hardware)
     ▪ Amount of image data (for training), with the image data barrier

  4. AIT AUSTRIAN INSTITUTE OF TECHNOLOGY
     Owners:
     ▪ Federal Ministry for Transport, Innovation and Technology: 50.46%
     ▪ Federation of Austrian Industries: 49.54%
     Companies: AIT Austrian Institute of Technology GmbH, Nuclear Engineering Seibersdorf GmbH, Seibersdorf Labor GmbH
     Centers: Energy; Health & Bioresources; Digital Safety & Security; Vision, Automation & Control; Mobility Systems; Low-Emission Transport; Technology Experience; Innovation Systems & Policy
     ▪ 1300+ employees
     ▪ Budget: EUR 140 million
     ▪ Business model: 40:30:30

  5. VISION, AUTOMATION & CONTROL
     ▪ High-Performance Vision: worldwide fastest vision sensor technology
     ▪ 3D Vision and Modeling: robust and flexible 3D vision technology
     ▪ Complex Dynamical Systems: advanced handling and smart production
     From Sensor To Decision

  6. AIT AUTONOMOUS & ASSISTIVE SYSTEMS
     ▪ Driver Assistance System for Trams
     ▪ Assistance Systems for Construction Machines
     ▪ Autonomous Bus
     ▪ Autonomous Local Railway
     ▪ Driverless Missions in Crisis & Disaster Management

  7. ENABLING METHODOLOGIES FOR ASSISTED OR SELF-DRIVING
     Mobile platforms: sensory signals + local context (situation) → decisions
     [Block diagram. Recoverable elements: sensor/data fusion, ego-motion computation, localization, positioning, mapping, environment elements and objects (type, location, pose), recognition, object tracking, motion analysis, state prediction, dynamic model computation, vehicle model, vehicle control, safety compliance]
     RELATED KNOW-HOW:
     ▪ Deep learning based detection & segmentation
     ▪ Localization, map building
     ▪ Sparse motion estimation, tracking
     ▪ Vision algorithms testing

  8. INTELLIGENT PERCEPTION FOR MOBILE MACHINES

  9. AUTONOMOUS OFFROAD VEHICLE

  10. Introduction: A frequently asked question (14.07.2019)

  11. Example for robust vision
      Example: Crop detection
      ▪ Radial symmetry
      ▪ Near regular structure

  12. Introduction: Motivation
      ▪ Challenges when developing Vision Systems:
      ▪ Complexity → Algorithmic, Systemic, Data
      ▪ Non-linear search for a solution
      [Figure: branch & bound research methodology, from IDEA via APPLICATION to PRODUCT; RESEARCH (MATLAB) explores Alg. A, Alg. B and Alg. C, DEVELOPMENT (C++) follows]

  13. 2D: Real-time optical flow based particle advection for object detection and tracking

  14. MOTIVATION – I. OBJECT DETECTION PIPELINES
      Spatial distribution of posterior probability:
      ▪ Score map (DPM, R-CNN, …)
      ▪ Vote map
      ▪ Occupancy map
      ▪ Back-projected similarity map
      ▪ …
      Delineated objects:
      ▪ Bounding boxes
      ▪ More complex parametric representations
      ▪ Instance segmentation
      DPM: Deformable Part Models
      R-CNN: Region-based Convolutional Neural Networks

  15. RELATED STATE-OF-THE-ART
      ▪ Clustering detections (weakly constrained structural prior)
        • Center-surround filter
        • Non-maximum suppression: Rothe et al., 2014; Neubeck & Van Gool, 2006
        • Mean Shift, CAMShift: Comaniciu & Meer, 2002; Bradski, 1998
          [Figure: MeanShift and CamShift iterations]
      ▪ Detection by voting/segmentation/learning (implicit or explicit structural prior)
        • Implicit Shape Model: Leibe et al., 2005
        • Markov Point Processes for object configurations: Verdie, 2014
        • Structured random forests: Dollar & Zitnick, 2013; Kontschieder et al., 2011
        • CNNs for non-maximum suppression: Hosang et al., 2016; Wan et al., 2015

  16. Optical flow driven advection
      Advection: transport mechanism induced by a force field
      [Figure: dense optical flow field with components $V_{x,i}$, $V_{y,i}$ between times $t_i$ and $t_{i+1}$; a particle trajectory induced by the OF field]
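
As a rough illustration of advection, the sketch below moves particles through a sequence of dense flow fields. It is a minimal NumPy sketch, not the presenter's implementation: the function names, the `(H, W, 2)` flow layout and the nearest-pixel flow lookup are all assumptions.

```python
import numpy as np

def advect_particles(points, flow):
    """Move particles one time step through a dense flow field.

    points: (N, 2) float array of (x, y) positions.
    flow:   (H, W, 2) array, flow[y, x] = (Vx, Vy) displacement in pixels.
    """
    h, w = flow.shape[:2]
    # Assumption: nearest-pixel lookup; bilinear interpolation would be smoother.
    xi = np.clip(np.rint(points[:, 0]).astype(int), 0, w - 1)
    yi = np.clip(np.rint(points[:, 1]).astype(int), 0, h - 1)
    return points + flow[yi, xi]

def advect_trajectory(points, flows):
    """Chain per-frame flow fields into trajectories of shape (T + 1, N, 2)."""
    traj = [np.asarray(points, dtype=float)]
    for flow in flows:
        traj.append(advect_particles(traj[-1], flow))
    return np.stack(traj)
```

Each row of the returned array holds all particle positions at one time step, so `traj[:, k]` is the trajectory of particle `k`.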

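  17. Particle advection with FW-BW consistency
      ▪ A simple but powerful test
      [Figure: forward and backward advection of a particle, showing a successful case and a failure case]
      Consistency check: an inequality on the forward-backward offset, where $\bar{x}$ denotes the mean offset (the remaining symbols did not survive extraction)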
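
A forward-backward test of this kind can be sketched as follows. The exact criterion on the slide is not recoverable, so this sketch uses a common stand-in: a particle is accepted when advecting it forward and then backward returns it to within `max_error` pixels of its start. The function names, the threshold and the nearest-pixel lookup are assumptions.

```python
import numpy as np

def sample_flow(flow, points):
    """Nearest-pixel lookup of flow vectors at (x, y) positions."""
    h, w = flow.shape[:2]
    xi = np.clip(np.rint(points[:, 0]).astype(int), 0, w - 1)
    yi = np.clip(np.rint(points[:, 1]).astype(int), 0, h - 1)
    return flow[yi, xi]

def fw_bw_consistent(points, flow_fw, flow_bw, max_error=1.0):
    """Return (mask, error): mask[k] is True if particle k passes the check.

    flow_fw maps frame i to frame i+1, flow_bw maps frame i+1 back to frame i.
    max_error is a hypothetical threshold in pixels.
    """
    points = np.asarray(points, dtype=float)
    forward = points + sample_flow(flow_fw, points)
    back = forward + sample_flow(flow_bw, forward)
    error = np.linalg.norm(back - points, axis=1)
    return error < max_error, error
```

Particles failing the check would typically be discarded or re-seeded before the next advection step.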

  18. Pedestrian Flow Analysis Public dataset: Grand Central Station, NYC: 720x480 pixels, 2000 particles, runs at 35 fps

  19. SHAPE-GUIDED TRACKLET GROUPING FOR COMPACT OBJECTS
      Optical flow driven particle tracklets
      Clustering directly performed in the discrete tracklet-domain
      The $i$-th tracklet: $T_i = \{ (x_t, y_t)_{t=1..N},\ \mathbf{v},\ w \}$
      $\mathbf{v}$: velocity vector
      $w$: weight (scalar)
      STEP 1: sampling
      STEP 2: weight generation from orientation similarity (w.r.t. center tracklet)
      STEP 3: local shape, scale and center estimation
      STEP 4: find nearest tracklet to mode estimate
      Repeat from STEP 1 until convergence
      Result: estimated cluster parameters + mode location
      Single parameter: $W$, the initial scale
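
The four-step loop can be sketched in a heavily simplified form. This is not the presenter's algorithm: the shape and scale estimation of STEP 3 is reduced to a weighted mean, each tracklet is represented by a single reference point, and the cosine-similarity weighting, the function name `group_tracklets` and all parameters are assumptions. Velocities are assumed to be non-zero.

```python
import numpy as np

def group_tracklets(seed, positions, velocities, scale, max_iter=20):
    """Simplified mode seeking in the tracklet domain.

    positions:  (N, 2) one reference point per tracklet.
    velocities: (N, 2) velocity vector per tracklet.
    seed:       index of the starting (center) tracklet.
    scale:      window size W around the center tracklet.
    """
    center = seed
    for _ in range(max_iter):
        # STEP 1: sample tracklets inside a W x W window around the center.
        near = np.all(np.abs(positions - positions[center]) <= scale / 2, axis=1)
        idx = np.flatnonzero(near)
        # STEP 2: weights from orientation similarity to the center tracklet.
        v, vc = velocities[idx], velocities[center]
        cos = v @ vc / (np.linalg.norm(v, axis=1) * np.linalg.norm(vc) + 1e-9)
        weights = np.clip(cos, 0.0, None)
        # STEP 3 (reduced): center estimate as the weighted mean position.
        mode = (weights[:, None] * positions[idx]).sum(axis=0) / (weights.sum() + 1e-9)
        # STEP 4: nearest tracklet to the mode estimate becomes the new center.
        nearest = idx[np.argmin(np.linalg.norm(positions[idx] - mode, axis=1))]
        if nearest == center:
            break
        center = nearest
    return center, idx[weights > 0]
```

Keeping the center on an actual tracklet, rather than on a free point, is what makes the clustering run in the discrete tracklet domain.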

  20. 3D: STEREO DEPTH INFORMATION, CHARACTERISTICS AND USAGE

  21. PASSIVE STEREO BASED DEPTH MEASUREMENT
      ▪ 3D stereo-camera system developed by AIT
      ▪ Area-based, local-optimizing, correlation-based stereo matching algorithm
      ▪ Specialized variant of the Census Transform
      ▪ Resolution: typically ~1 Mpixel
      ▪ Run-time: ~14 fps (Core-i7, multithreaded, SSE-optimized)
      ▪ Excellent "depth-quality-vs.-computational-costs" ratio
      ▪ USB 2 interface
      Advantages:
      • Depth ordering of people
      • Robustness against illumination, shadows
      • Enables scene analysis
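
For orientation, the textbook Census Transform and its Hamming-distance matching cost can be sketched as below. This is the generic form, not AIT's specialized variant, and the function names are assumptions. The `radius` must stay at 3 or below so the signature fits into 64 bits.

```python
import numpy as np

def census_transform(img, radius=1):
    """Per-pixel bit signature: one bit per neighbour, set if neighbour < center."""
    img = np.asarray(img)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            code = (code << np.uint64(1)) | (neighbour < img).astype(np.uint64)
    return code

def hamming_cost(code_a, code_b):
    """Matching cost: number of differing signature bits per pixel."""
    diff = np.bitwise_xor(code_a, code_b)
    count = np.zeros(diff.shape, dtype=np.uint8)
    while diff.any():
        count += (diff & np.uint64(1)).astype(np.uint8)
        diff = diff >> np.uint64(1)
    return count
```

Because the signature encodes only the ordering of intensities, a monotonic brightness change leaves it unchanged, which relates to the illumination robustness listed on the slide.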

  22. STEREO CAMERA CHARACTERISTICS
      Trinocular setup:
      ▪ 3 baselines possible
      ▪ 3 stereo computations with results fused into one disparity image
      [Figure: small, medium and large baseline, covering near-range to far-range]

  23. Data characteristics
      [Figure panels: intensity image; disparity image; plot of $d$ against $y$ for a planar surface in 3D space]
      $(x, y)$: image coordinates, $d$: disparity, $d(x, y)$: disparity image
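
A short derivation shows why a planar surface appears as a linear ramp in the disparity image. It assumes a rectified pinhole stereo model. The symbols $f$ (focal length in pixels), $B$ (baseline), $(c_x, c_y)$ (principal point), $Z$ (depth) and the plane parameters $(n_X, n_Y, n_Z)$, $h \neq 0$ are introduced here for illustration and do not appear on the slide.

```latex
% Assumed rectified pinhole stereo model, camera-frame point (X, Y, Z).
\begin{align}
  d &= \frac{f B}{Z}, \qquad
  X = \frac{(x - c_x)\, Z}{f}, \qquad
  Y = \frac{(y - c_y)\, Z}{f} \\
  % Plane with unit normal (n_X, n_Y, n_Z) at distance h from the camera centre:
  n_X X + n_Y Y + n_Z Z &= h \\
  \Rightarrow\quad d(x, y) &= \frac{B}{h}\,\bigl( n_X (x - c_x) + n_Y (y - c_y) + n_Z f \bigr)
\end{align}
```

Under these assumptions $d(x, y)$ is linear in the image coordinates, so a ground plane viewed by a camera without roll shows up as a straight line when $d$ is plotted against $y$.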

  24. 2.5D vs. 3D algorithmic approaches
      ▪ 3D approach: computed top view of the 3D point cloud
      ▪ 2.5D approach
      [Figure: stereo setup above the ground plane (world), with height (world) axis; noisy vs. correct measurement]

  25. LEFT ITEM DETECTION
      Additional knowledge (compared to existing video analytics solutions):
      • Stationary object (geometry introduced to a scene)
      • Object geometric properties (volume, size)
      • Spatial location (on the ground)

  26. METHODOLOGY
      Processing intensity and depth data:
      ▪ Input images
      ▪ INTENSITY: background model → change detection → ortho-transform
      ▪ DEPTH: stereo disparity → ground plane estimation → ortho-map generation → object detection and validation in the ortho-map
      ▪ INTENSITY + DEPTH: combination of proposals → validation → final candidates
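
Ortho-map generation can be sketched as back-projecting the disparity image to 3D points and accumulating them in a top-view grid. The function names, the pinhole parameters (`f`, `baseline`, `cx`, `cy`) and the choice of a maximum-height map are assumptions. The transform from camera to ground-plane coordinates, which the pipeline obtains from ground plane estimation, is left to the caller.

```python
import numpy as np

def disparity_to_points(disp, f, baseline, cx, cy):
    """Back-project valid disparities (d > 0) to camera-frame 3D points."""
    ys, xs = np.nonzero(disp > 0)
    z = f * baseline / disp[ys, xs]
    x = (xs - cx) * z / f
    y = (ys - cy) * z / f
    return np.column_stack([x, y, z])

def ortho_height_map(points_world, cell, x_range, y_range):
    """Top-view map holding the maximum height per ground cell.

    points_world: (N, 3) points with (x, y) on the ground plane and z = height.
    Cells without points, and points below the ground plane, yield 0.
    """
    nx = int(np.ceil((x_range[1] - x_range[0]) / cell))
    ny = int(np.ceil((y_range[1] - y_range[0]) / cell))
    ix = np.floor((points_world[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points_world[:, 1] - y_range[0]) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    height = np.zeros((ny, nx))
    np.maximum.at(height, (iy[inside], ix[inside]), points_world[inside, 2])
    return height
```

In such a map a left item would appear as a compact region of non-zero height on the ground, which is where volume and size criteria can be applied.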

  27. Left Item Detection – Demos

  28. Clustering in discrete two-dimensional distributions

  29. Object detection as clustering

  30. A Frequently Occurring Task
      Analysis of discrete two-dimensional distributions
      [Figure: example distributions matched against a LEARNED CODEBOOK]

  31. EXAMPLES
      ▪ Description of the Binary-Shape-driven 2D clustering
        • Shape learning
        • Shape clustering, delineation
      ▪ Results
        • Occupancy map clustering
        • Text line delineation
        • Object delineation by shape-guided tracklet grouping

  32. TASK DEFINITION
      Intermediate probabilistic representations (2D distributions) → local grouping, using prior, structure-specific knowledge → generate consistent object hypotheses
      Challenge:
      ▪ arbitrarily shaped distributions
      ▪ multiple nearby modes
      ▪ noise, clutter
      Definitions:
      mode = location of maximum density, computed using a kernel $K$
      density estimation of variable $x$: $f(x) = \sum_a K(a - x)\, w(a)$
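
Reading the density definition as $f(x) = \sum_a K(a - x)\, w(a)$, a brute-force evaluation on a pixel grid can be sketched as below. The isotropic Gaussian kernel, its width `sigma` and the function names are assumptions. The slide does not specify the kernel.

```python
import numpy as np

def kernel_density(weights, sigma):
    """Evaluate f(x) = sum_a K(a - x) w(a) at every pixel x.

    weights: (H, W) array, w(a) at each grid location a.
    K is taken to be an isotropic Gaussian of width sigma (an assumption).
    """
    h, w = weights.shape
    ys, xs = np.mgrid[0:h, 0:w]
    density = np.zeros((h, w))
    for ay, ax in zip(*np.nonzero(weights)):
        k = np.exp(-((ys - ay) ** 2 + (xs - ax) ** 2) / (2.0 * sigma ** 2))
        density += weights[ay, ax] * k
    return density

def find_mode(density):
    """Global mode: (row, column) of the maximum density."""
    return np.unravel_index(np.argmax(density), density.shape)
```

This direct evaluation costs one kernel pass per non-zero sample, which is the cost that fast mode-seeking schemes aim to avoid.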

  33. Shape learning – Case: Compact clusters
      1. Binary mask from manual annotation or from synthetic data
      2. Sampling using an analysis window discretized into an $n_i \times n_i$ grid
      3. Building a codebook of binary shapes with a coarse-to-fine spatial resolution
      [Figure: spatial resolution of local structure; mode-centered samples and off-the-mode samples; resulting codebook]
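
The sampling and codebook steps can be sketched as follows: each square binary window is discretized into an $n \times n$ grid, and the distinct grids seen at each resolution form the codebook. The function names, the `fill` threshold and the chosen resolutions are assumptions, and each window is assumed to be at least $n$ pixels wide.

```python
import numpy as np

def binary_shape(window, n, fill=0.5):
    """Discretize a square binary window into an n x n grid of 0/1 cells.

    A cell is set when at least `fill` of its pixels are set (assumed rule).
    """
    size = window.shape[0]
    edges = np.linspace(0, size, n + 1).astype(int)
    grid = np.zeros((n, n), dtype=np.uint8)
    for r in range(n):
        for c in range(n):
            cell = window[edges[r]:edges[r + 1], edges[c]:edges[c + 1]]
            grid[r, c] = cell.mean() >= fill
    return grid

def build_codebook(windows, resolutions=(2, 4, 8)):
    """Collect the distinct binary shapes seen at each grid resolution."""
    codebook = {}
    for n in resolutions:
        shapes = {binary_shape(w, n).tobytes() for w in windows}
        codebook[n] = [np.frombuffer(s, dtype=np.uint8).reshape(n, n)
                       for s in sorted(shapes)]
    return codebook
```

Storing one shape list per resolution mirrors the coarse-to-fine organisation: coarse grids give few, generic entries, finer grids give more specific ones.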

  34. Example Codebook – Case: Compact clusters

  35. Shape learning – Case: Line structures
      Binary mask from manually annotated text lines
      [Figure: spatial resolution of local structure at low, mid and high levels; resulting codebook]

  36. Shape delineation – I.
      Step 1: Fast Mode Seeking
      ▪ Three integral images: [expressions missing in the source]
      ▪ Mode location: [expression missing in the source]
      Step 2: Local density analysis (cases: COMPACT CLUSTERS, LINE STRUCTURES)
      ▪ Density measure for each resolution level for the binary structure
      ▪ Enumerating all binary shapes at each resolution level → finding best matching entry
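
One plausible reading of Step 1, stated here as an assumption because the slide's formulas are missing, is a mean-shift iteration with a uniform box kernel. In that reading the three integral images hold the sums of $w$, $x \cdot w$ and $y \cdot w$, so the weighted mean inside any window costs a constant number of lookups. The function names and the window half-size `half` are hypothetical.

```python
import numpy as np

def integral(img):
    """Integral image with a zero row and column prepended."""
    return np.pad(np.cumsum(np.cumsum(img, axis=0), axis=1), ((1, 0), (1, 0)))

def box_sum(ii, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 in O(1)."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

def seek_mode(weights, start, half, max_iter=50):
    """Shift a (2*half+1)-sized window to its weighted mean until it stops moving."""
    h, w = weights.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ii_w = integral(weights)
    ii_xw = integral(xs * weights)
    ii_yw = integral(ys * weights)
    y, x = start
    for _ in range(max_iter):
        y0, y1 = max(y - half, 0), min(y + half + 1, h)
        x0, x1 = max(x - half, 0), min(x + half + 1, w)
        mass = box_sum(ii_w, y0, x0, y1, x1)
        if mass <= 0:
            break  # empty window: nothing to move towards
        ny = int(round(box_sum(ii_yw, y0, x0, y1, x1) / mass))
        nx = int(round(box_sum(ii_xw, y0, x0, y1, x1) / mass))
        if (ny, nx) == (y, x):
            break
        y, x = ny, nx
    return y, x
```

The integral images are built once per map, after which every mode-seeking run from any start position reuses them.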
