CONCEPTS, ALGORITHMS & PRACTICAL APPLICATIONS IN 2D AND 3D COMPUTER VISION
Csaba Beleznai
Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia Simon, Daniel Steininger, Markus Hofstätter and Andreas Kriechbaum
Senior Scientist
Center for Vision, Automation & Control, Autonomous Systems
AIT Austrian Institute of Technology GmbH, Vienna, Austria
MOTIVATION
▪ Research is evolution → so is your learning process
[Figure: RECOGNITION, SEGMENTATION, RECONSTRUCTION; axes: TIME, GRAND CHALLENGES]
▪ Balance: becoming a domain expert vs. being a "globalist"
▪ Researchers tend to favour certain paradigms
  - Learn to outline trends, look upstream
▪ Revisit old problems to see them in a new light
▪ Specialize the general & generalize the specific
▪ Factorize your know-how (code, topics, …) into components → sustainable, scalable
VISUAL OBJECT RECOGNITION TRENDS
[Figure: three trend plots over time, 2012 to 2019]
▪ Accuracy: human-level performance
▪ Computational costs: CPU vs. dedicated computational hardware (for real-time); COMPUTATIONAL BARRIER
▪ Amount of image data (for training): IMAGE DATA BARRIER
AIT AUSTRIAN INSTITUTE OF TECHNOLOGY
▪ Owners: Federal Ministry for Transport, Innovation and Technology (50.46%); Federation of Austrian Industries (49.54%)
▪ Subsidiaries: Nuclear Engineering Seibersdorf GmbH; Seibersdorf Labor GmbH
▪ Centers: Energy; Health & Bioresources; Digital Safety & Security; Vision, Automation & Control; Low-Emission Transport; Technology Experience; Innovation Systems & Policy; Mobility Systems
▪ 1300+ employees
▪ Budget: 140 million €
▪ Business model: 40:30:30
VISION, AUTOMATION & CONTROL
▪ Research areas: High-Performance Vision Systems; 3D Vision and Modeling; Complex Dynamical Systems
▪ Robust and flexible 3D vision technology
▪ Worldwide fastest vision sensor technology
▪ Advanced handling and smart production
From Sensor To Decision
AIT AUTONOMOUS & ASSISTIVE SYSTEMS
▪ Driver Assistance System for Trams
▪ Assistance Systems for Construction Machines
▪ Autonomous Bus
▪ Autonomous Local Railway
▪ Driverless Missions in Crisis & Disaster Management
ENABLING METHODOLOGIES FOR ASSISTED OR SELF-DRIVING
Mobile platforms: sensory signals + local context (situation) → decisions
[Diagram: processing blocks including sensor/data fusion, ego-motion computation, localization, positioning, mapping, environment recognition (elements, objects: type, location, pose), motion analysis, object tracking, state prediction, probability-based behavior, vehicle model, dynamic model computation, vehicle control, safety compliance]
RELATED KNOW-HOW:
▪ Deep learning based detection & segmentation
▪ Localization, map building
▪ Sparse motion estimation, tracking
▪ Vision algorithms testing
INTELLIGENT PERCEPTION FOR MOBILE MACHINES
AUTONOMOUS OFFROAD VEHICLE
Introduction: A frequently asked question (14.07.2019)
Example of robust vision
Example: Crop detection
▪ Radial symmetry
▪ Near-regular structure
Introduction: Motivation
▪ Challenges when developing vision systems:
  ▪ Complexity: algorithmic, systemic, data
  ▪ Non-linear search for a solution
[Diagram: IDEA → APPLICATION → PRODUCT; RESEARCH (MATLAB) and DEVELOPMENT (C++); candidate algorithms Alg. A, Alg. B, Alg. C explored with a branch & bound research methodology]
2D
Real-time optical flow based particle advection for object detection and tracking
MOTIVATION – I. OBJECT DETECTION PIPELINES
Spatial distribution of posterior probability:
▪ Score map (DPM, R-CNN, …)
▪ Vote map
▪ Occupancy map
▪ Back-projected similarity map
▪ …
Delineated objects:
▪ Bounding boxes
▪ More complex parametric representations
▪ Instance segmentation
DPM: Deformable Part Models
R-CNN: Region-based Convolutional Neural Networks
RELATED STATE-OF-THE-ART
▪ Clustering detections (weakly constrained structural prior)
  - Center-surround filter
  - Non-maximum suppression: Rothe et al., 2014; Neubeck & Van Gool, 2006
  - Mean Shift, CAMShift: Comaniciu & Meer, 2002; Bradski, 1998
  [Figure: MeanShift and CamShift iterations]
▪ Detection by voting/segmentation/learning (implicit or explicit structural prior)
  - Implicit Shape Model: Leibe et al., 2005
  - Markov Point Processes for object configurations: Verdie, 2014
  - Structured random forests: Dollar & Zitnick, 2013; Kontschieder et al., 2011
  - CNNs for non-maximum suppression: Hosang et al., 2016; Wan et al., 2015
Optical flow driven advection
▪ Advection: transport mechanism induced by a force field
▪ Dense optical flow field $(V_{x,i}, V_{y,i})$ between frames $t_i$ and $t_{i+1}$
▪ A particle trajectory is induced by the optical flow (OF) field
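A minimal sketch of the advection step, assuming the dense flow between two frames is given as two grids of per-pixel displacements. The function names (`bilinear`, `advect`, `trajectory`), the bilinear sampling and the single Euler step per frame are illustrative assumptions, not details from the slides.

```python
import math

def bilinear(field, x, y):
    """Sample a 2D grid (list of rows) at a real-valued (x, y), clamped to the border."""
    h, w = len(field), len(field[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * field[y0][x0] + ax * field[y0][x1]
    bottom = (1 - ax) * field[y1][x0] + ax * field[y1][x1]
    return (1 - ay) * top + ay * bottom

def advect(particle, vx, vy):
    """Move one particle by the flow sampled at its current position (one Euler step)."""
    x, y = particle
    return (x + bilinear(vx, x, y), y + bilinear(vy, x, y))

def trajectory(start, flows):
    """Chain advection steps over a sequence of (vx, vy) flow fields, one per frame pair."""
    path = [start]
    for vx, vy in flows:
        path.append(advect(path[-1], vx, vy))
    return path
```

Calling `trajectory` with one flow field per consecutive frame pair yields the particle path; a set of such paths is what the later slides call tracklets.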
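Particle advection with FW-BW consistency
▪ A simple but powerful test
[Figure: forward and backward advection of a particle, showing a successful case and a failure case]
▪ Consistency check: offset after forward and backward advection < mean offset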
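The forward-backward test can be sketched as follows. The nearest-neighbour flow lookup, the names `fb_error` and `consistent_particles`, and the `factor` multiplier on the mean offset are assumptions made for illustration; the slide only indicates that the offset is compared against a mean offset.

```python
import math

def sample(flow, x, y):
    """Nearest-neighbour lookup in a flow field stored as rows of (vx, vy) tuples."""
    h, w = len(flow), len(flow[0])
    col = min(max(int(round(x)), 0), w - 1)
    row = min(max(int(round(y)), 0), h - 1)
    return flow[row][col]

def fb_error(p, flow_fw, flow_bw):
    """Distance between p and its position after a forward and a backward step."""
    vx, vy = sample(flow_fw, p[0], p[1])
    q = (p[0] + vx, p[1] + vy)            # forward: frame i -> i+1
    bx, by = sample(flow_bw, q[0], q[1])
    r = (q[0] + bx, q[1] + by)            # backward: frame i+1 -> i
    return math.hypot(r[0] - p[0], r[1] - p[1])

def consistent_particles(particles, flow_fw, flow_bw, factor=1.0):
    """Keep particles whose forward-backward offset is at most factor * mean offset."""
    errors = [fb_error(p, flow_fw, flow_bw) for p in particles]
    mean_err = sum(errors) / len(errors)
    return [p for p, e in zip(particles, errors) if e <= factor * mean_err]
```

Particles that fail the check would typically be dropped or re-seeded, since their flow vectors are likely corrupted by occlusion or poor texture.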
Pedestrian Flow Analysis
▪ Public dataset: Grand Central Station, NYC
▪ 720×480 pixels, 2000 particles, runs at 35 fps
SHAPE-GUIDED TRACKLET GROUPING FOR COMPACT OBJECTS
▪ Optical flow driven particle tracklets
▪ Clustering is performed directly in the discrete tracklet domain
▪ The i-th tracklet: $T_i = \{ (x_t, y_t)_{t=1..N},\ \mathbf{v},\ w \}$, where $\mathbf{v}$ is the velocity vector and $w$ is a scalar weight
▪ STEP 1: sampling
▪ STEP 2: weight generation from orientation similarity (w.r.t. center tracklet)
▪ STEP 3: local shape, scale and center estimation
▪ STEP 4: find nearest tracklet to mode estimate
▪ Repeat from STEP 1 until convergence
▪ Result: estimated cluster parameters + mode location
▪ Single parameter: W, the initial scale
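One possible reading of the four steps is sketched below. The slide gives no formulas, so the concrete choices are assumptions: a square sampling window of half-size `W`, a weight equal to the clamped cosine between velocity vectors, and a weighted mean as the center estimate. Shape and scale estimation from STEP 3 is omitted, and all names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Tracklet:
    points: list      # [(x_t, y_t), ...] for t = 1..N
    v: tuple          # velocity vector (vx, vy)
    w: float = 1.0    # scalar weight

    @property
    def pos(self):
        return self.points[-1]  # most recent position

def orientation_similarity(a, b):
    """Cosine of the angle between two velocity vectors, clamped to [0, 1]."""
    na, nb = math.hypot(*a), math.hypot(*b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return max(0.0, (a[0] * b[0] + a[1] * b[1]) / (na * nb))

def group_step(tracklets, center, W):
    cx, cy = center.pos
    # STEP 1: sample tracklets inside the window around the center tracklet
    window = [t for t in tracklets
              if abs(t.pos[0] - cx) <= W and abs(t.pos[1] - cy) <= W]
    # STEP 2: weights from orientation similarity w.r.t. the center tracklet
    weights = [t.w * orientation_similarity(t.v, center.v) for t in window]
    total = sum(weights)
    if total == 0.0:
        return center
    # STEP 3 (center only): weighted mean position as the mode estimate
    mx = sum(wt * t.pos[0] for wt, t in zip(weights, window)) / total
    my = sum(wt * t.pos[1] for wt, t in zip(weights, window)) / total
    # STEP 4: nearest tracklet (with non-zero weight) to the mode estimate
    candidates = [t for wt, t in zip(weights, window) if wt > 0.0]
    return min(candidates, key=lambda t: math.hypot(t.pos[0] - mx, t.pos[1] - my))

def seek_mode(tracklets, start, W, max_iter=50):
    """Repeat the steps until the center tracklet no longer changes."""
    center = start
    for _ in range(max_iter):
        nxt = group_step(tracklets, center, W)
        if nxt is center:
            break
        center = nxt
    return center
```

Keeping the iteration in the tracklet domain means the mode is always an actual tracklet, which matches the slide's statement that clustering is performed directly on the discrete tracklets.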
3D
STEREO DEPTH INFORMATION: CHARACTERISTICS AND USAGE
PASSIVE STEREO BASED DEPTH MEASUREMENT
▪ 3D stereo-camera system developed by AIT
▪ Area-based, locally optimizing, correlation-based stereo matching algorithm
▪ Specialized variant of the Census Transform
▪ Resolution: typically ~1 Mpixel
▪ Run-time: ~14 fps (Core i7, multithreaded, SSE-optimized)
▪ Excellent "depth quality vs. computational cost" ratio
▪ USB 2 interface
Advantages:
• Depth ordering of people
• Robustness against illumination, shadows
• Enables scene analysis
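For orientation, a plain 3×3 census transform with winner-takes-all matching along one scanline is sketched below. This is the textbook form, not the specialized AIT variant mentioned above; the function names and the single-pixel cost (no aggregation window) are simplifying assumptions.

```python
def census_3x3(img, row, col):
    """8-bit census signature: one bit per neighbour that is darker than the centre."""
    centre = img[row][col]
    bits = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            bits = (bits << 1) | (1 if img[row + dr][col + dc] < centre else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two census signatures."""
    return bin(a ^ b).count("1")

def best_disparity(left, right, row, col, max_disp):
    """Disparity with the lowest Hamming cost (local, winner-takes-all)."""
    ref = census_3x3(left, row, col)
    costs = []
    for d in range(max_disp + 1):
        if col - d < 1:          # keep the 3x3 window inside the right image
            break
        costs.append(hamming(ref, census_3x3(right, row, col - d)))
    return costs.index(min(costs))
```

Because the signature encodes only intensity ordering, it is unaffected by monotonic brightness changes, which is one common explanation for the robustness against illumination listed on the slide.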
STEREO CAMERA CHARACTERISTICS
Trinocular setup:
▪ 3 baselines possible
▪ 3 stereo computations with results fused into one disparity image
[Figure: small, medium and large baseline; near-range to far-range]
Data characteristics
[Figure: intensity image and disparity image; (x, y) image coordinates, d disparity, d(x, y)]
▪ Planar surface in 3D space
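The link between a disparity value d(x, y) and a 3D point is not spelled out on the slide. A minimal sketch under the standard rectified pinhole stereo model (Z = f·B/d) follows; the focal length, baseline and principal point are hypothetical parameters, not values of the AIT sensor.

```python
def disparity_to_depth(d, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair; None for invalid disparity."""
    if d <= 0:
        return None
    return focal_px * baseline_m / d

def backproject(x, y, d, focal_px, baseline_m, cx, cy):
    """Map image coordinates (x, y) with disparity d to a 3D point (X, Y, Z) in metres."""
    z = disparity_to_depth(d, focal_px, baseline_m)
    if z is None:
        return None
    return ((x - cx) * z / focal_px, (y - cy) * z / focal_px, z)
```

Under this model a planar surface in 3D space maps to a disparity that varies linearly with x and y, which is what makes plane fitting directly in the disparity image practical.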
2.5D vs. 3D algorithmic approaches
▪ 2.5D approach
▪ 3D approach
[Figure: stereo setup above the ground plane (world) with a height (world) axis; computed top view of the 3D point cloud; noisy vs. correct measurements]
LEFT ITEM DETECTION
Additional knowledge (compared to existing video analytics solutions):
• Stationary object (geometry introduced to a scene)
• Object geometric properties (volume, size)
• Spatial location (on the ground)
METHODOLOGY
Processing intensity and depth data
[Diagram: input images feed an INTENSITY branch and a DEPTH branch; processing blocks: background model, change detection, stereo disparity, ground plane estimation, ortho-transform, ortho-map generation, object detection and validation in the ortho-map, combination of proposals, validation, final candidates]
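As a rough illustration of ortho-map generation, the sketch below bins ground-aligned 3D points into a top-view grid and keeps the maximum height per cell. The coordinate convention (x, height, z), the max-height aggregation and all parameter names are assumptions; the slides do not specify how the ortho-map is built.

```python
import math

def ortho_map(points, cell, x_range, z_range, min_height=0.0):
    """Top-view height map: max height above ground per cell.

    points are (x, height, z) tuples in ground-aligned world coordinates (metres).
    """
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((z_range[1] - z_range[0]) / cell)
    grid = [[0.0] * cols for _ in range(rows)]
    for x, height, z in points:
        c = math.floor((x - x_range[0]) / cell)
        r = math.floor((z - z_range[0]) / cell)
        # ignore points outside the mapped area or not above the ground plane
        if 0 <= r < rows and 0 <= c < cols and height > min_height:
            grid[r][c] = max(grid[r][c], height)
    return grid
```

A left item would then show up as a compact, stationary region of non-zero height on the ground, matching the geometric cues listed on the previous slide.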
Left Item Detection – Demos
Clustering in discrete two-dimensional distributions
Object detection as clustering
A Frequently Occurring Task
Analysis of discrete two-dimensional distributions
[Figure: example distributions and a LEARNED CODEBOOK]
EXAMPLES
▪ Description of the Binary-Shape-driven 2D clustering
  • Shape learning
  • Shape clustering, delineation
▪ Results
  • Occupancy map clustering
  • Text line delineation
  • Object delineation by shape-guided tracklet grouping
TASK DEFINITION
▪ Intermediate probabilistic representations (2D distributions) → local grouping → generate consistent object hypotheses
▪ Local grouping uses prior, structure-specific knowledge
Challenge:
▪ arbitrarily shaped distributions
▪ multiple nearby modes
▪ noise, clutter
Definitions:
▪ mode = location of maximum density
▪ density estimate of variable x, computed using a kernel K: $f(x) = \sum_a K(a - x)\, w(a)$
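Reading the density definition as f(x) = Σ_a K(a − x) w(a), a brute-force evaluation on a small grid looks like the sketch below. The Gaussian kernel, its `sigma` and the exhaustive search for the mode are illustrative choices; the slide leaves the kernel unspecified.

```python
import math

def gaussian_kernel(dx, dy, sigma):
    """Unnormalized isotropic Gaussian K evaluated at offset (dx, dy)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def density(w, x, y, sigma=1.0):
    """f(x) = sum over grid cells a of K(a - x) * w(a)."""
    total = 0.0
    for ay, row in enumerate(w):
        for ax, weight in enumerate(row):
            if weight:
                total += gaussian_kernel(ax - x, ay - y, sigma) * weight
    return total

def mode(w, sigma=1.0):
    """Grid location of maximum density, found by exhaustive evaluation."""
    cells = ((x, y) for y in range(len(w)) for x in range(len(w[0])))
    return max(cells, key=lambda c: density(w, c[0], c[1], sigma))
```

Exhaustive evaluation costs one kernel sum per grid cell, which motivates faster mode-seeking schemes for real-time use.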
Shape learning – Case: Compact clusters
1. Binary mask from manual annotation or from synthetic data
2. Sampling using an analysis window discretized into an $n_i \times n_i$ grid
3. Building a codebook of binary shapes with a coarse-to-fine spatial resolution
[Figure: spatial resolution of local structure; mode-centered samples and off-the-mode samples; resulting codebook]
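The sampling and lookup steps can be sketched as follows: an analysis window is reduced to an n × n binary pattern, and a pattern is matched to its closest codebook entry. The fill-ratio threshold, the Hamming distance and the function names are assumptions for illustration; the slides do not state how codebook entries are compared.

```python
def binary_shape(mask, cx, cy, half, n, fill=0.5):
    """Discretize the window of half-size `half` centred at (cx, cy) into n*n bits.

    A grid cell is set when at least `fill` of its pixels are set in the mask;
    pixels outside the mask count as empty.
    """
    h, w = len(mask), len(mask[0])
    shape = []
    for gy in range(n):
        for gx in range(n):
            y0 = cy - half + (2 * half * gy) // n
            y1 = cy - half + (2 * half * (gy + 1)) // n
            x0 = cx - half + (2 * half * gx) // n
            x1 = cx - half + (2 * half * (gx + 1)) // n
            on = sum(mask[y][x]
                     for y in range(max(y0, 0), min(y1, h))
                     for x in range(max(x0, 0), min(x1, w)))
            area = (y1 - y0) * (x1 - x0)
            shape.append(1 if area > 0 and on >= fill * area else 0)
    return tuple(shape)

def best_match(shape, codebook):
    """Index of the codebook entry with the smallest Hamming distance to `shape`."""
    return min(range(len(codebook)),
               key=lambda i: sum(a != b for a, b in zip(shape, codebook[i])))
```

Calling `binary_shape` with increasing `n` (for example 2, then 4) gives one pattern per resolution level, mirroring the coarse-to-fine structure of the codebook.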
Example Codebook – Case: Compact clusters
Shape learning – Case: Line structures
▪ Binary mask from manually annotated text lines
[Figure: spatial resolution of local structure (low, mid, high); resulting codebook]
Shape delineation – I.
Step 1: Fast Mode Seeking
▪ Three integral images [equations not preserved]
▪ Mode location [equation not preserved]
[Figure: examples for COMPACT CLUSTERS and LINE STRUCTURES]
Step 2: Local density analysis
▪ Density measure for each resolution level for the binary structure
▪ Enumerating all binary shapes at each resolution level → finding the best matching entry
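The equations for Step 1 are not preserved in this text. One common way to obtain fast mode seeking from three integral images is a mean-shift iteration with a uniform box kernel, using integral images of w, x·w and y·w. The sketch below follows that assumption and is not confirmed as the authors' formulation; all names are hypothetical.

```python
def integral_image(grid):
    """Summed-area table with one extra zero row and column."""
    h, w = len(grid), len(grid[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        run = 0.0
        for x in range(w):
            run += grid[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + run
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum over x0 <= x < x1 and y0 <= y < y1, clamped to the image."""
    h, w = len(ii) - 1, len(ii[0]) - 1
    x0, x1 = max(0, x0), min(w, x1)
    y0, y1 = max(0, y0), min(h, y1)
    if x0 >= x1 or y0 >= y1:
        return 0.0
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

def fast_mode_seek(weights, start, half, max_iter=100):
    """Mean shift with a box kernel; each step costs three constant-time box sums."""
    h, w = len(weights), len(weights[0])
    ii_w = integral_image(weights)
    ii_xw = integral_image([[x * weights[y][x] for x in range(w)] for y in range(h)])
    ii_yw = integral_image([[y * weights[y][x] for x in range(w)] for y in range(h)])
    cx, cy = start
    for _ in range(max_iter):
        box = (cx - half, cy - half, cx + half + 1, cy + half + 1)
        mass = box_sum(ii_w, *box)
        if mass == 0.0:
            break                      # empty window: nothing to shift towards
        nx = int(round(box_sum(ii_xw, *box) / mass))
        ny = int(round(box_sum(ii_yw, *box) / mass))
        if (nx, ny) == (cx, cy):
            break                      # converged
        cx, cy = nx, ny
    return cx, cy
```

The appeal of this construction is that the cost per iteration does not depend on the window size, so the same code serves every scale of the analysis window.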