Autonomous Driving on Benchmarks Xiaodi Hou
TWO DECADES OF BENCHMARKING
Two decades of benchmarking • MNIST – 1998 – Character recognition – 60,000 images • Inspired Convolutional Neural Net
Two decades of benchmarking • PASCAL-VOC – 2005 – Object detection & classification – 3787 images • Inspired Deformable Part-based Model
Two decades of benchmarking • ImageNet – 2010 – Object classification – 1,000,000 images • Inspired deep learning
LIMITATIONS OF BENCHMARKS
Upper bounds of benchmarks • Objective tasks – Measuring physical reality – Bounded by measurement accuracy – Stereo/Optical flow/Face recognition • Intermediate tasks • Subjective tasks – Measuring human cognition – Bounded by subject agreement – Saliency/Memorability/Image captioning
Imperfect benchmarks • Marriage market in China • Red or Blue? – Tall, rich, and handsome • 80% of girls are forced to choose among – tall poor ugly guy – short rich ugly guy – short poor handsome guy • Dimensionality reduction – Guaranteed information loss! – A projection of 𝑺 𝒐 → 𝑺
Signs of a fading benchmark • Saturated competition – Labeled Faces in the Wild (0.9978 ± 0.0007) • Weak transferability – Middlebury Optical Flow → KITTI Optical Flow • Poor inter-subject consistency – Image captioning and BLEU scores • A man throwing a frisbee in a park. • A man holding a frisbee in his hand. • A man standing in the grass with a frisbee.
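To see why n-gram metrics struggle with captions like the three above, here is a minimal sketch that scores each caption against the others using clipped (modified) n-gram precision, the core ingredient of BLEU. It is not the full BLEU score (no brevity penalty, no geometric mean over n-gram orders), and the tokenizer and the function names `tokenize`, `ngrams`, and `clipped_precision` are illustrative assumptions.

```python
import re
from collections import Counter
from itertools import permutations


def tokenize(text):
    # Assumption: lowercase alphabetic tokens only, punctuation dropped.
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    # Count every contiguous n-gram in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n=2):
    """Fraction of candidate n-grams that also occur in the reference,
    with each n-gram's count clipped to its count in the reference."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


captions = [
    "A man throwing a frisbee in a park.",
    "A man holding a frisbee in his hand.",
    "A man standing in the grass with a frisbee.",
]

if __name__ == "__main__":
    # Treat each caption in turn as the candidate and another as the reference.
    for cand, ref in permutations(captions, 2):
        print(f"{clipped_precision(cand, ref):.2f}  {cand!r} vs {ref!r}")
```

All three captions are plausible descriptions of the same image, yet under this tokenization the first scores only 3/7 bigram precision against the second. When equally valid answers disagree this much, the metric's ceiling is set by subject agreement rather than by model quality.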
BENCHMARKS AND AUTONOMOUS DRIVING
Vision-based autonomous driving benchmarks • Are we ready? • KITTI & CityScapes – Detection – Tracking – Stereo/Flow – SLAM – Semantic segmentation • 100% traditional vision challenges
Not yet…
Challenge 1: Data distribution • Academia – Average performance • Silicon Valley startup – Demo oriented – Best case performance • Real products – Murphy’s law – Worst case performance
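The three viewpoints above correspond to three different summaries of the same per-scenario results. A minimal sketch, assuming made-up scenario names and scores that are purely illustrative and not taken from any benchmark:

```python
from statistics import mean

# Hypothetical per-scenario accuracy of one perception model.
# These numbers are invented for illustration only.
scores = {
    "clear_day": 0.98,
    "night": 0.91,
    "heavy_rain": 0.74,
    "glare": 0.62,
}


def summarize(per_scenario):
    """Return the three summaries discussed on the slide."""
    values = list(per_scenario.values())
    return {
        "average": mean(values),  # what an academic leaderboard reports
        "best": max(values),      # what a demo shows
        "worst": min(values),     # what a shipped product must survive
    }


if __name__ == "__main__":
    for name, value in summarize(scores).items():
        print(f"{name}: {value:.3f}")
```

A model can look strong on the average while its worst case, the number that matters for a real product, stays far behind.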
Challenge 2: Ground-truth representation • Bbox – Almost no bbox in real world! – Missing hidden variables (distance & velocity) • Semantic segmentation – “pixel classification” – How to assemble all the pixels? • Stixels – Representing the world using matchstick – Distance and 3D shape – Missing the notion of whole objects
Challenge 3: Structured prior • What’s wrong with end-to-end learning?
Challenge 3: Structured prior • Two types of priors: – Implicit prior • Data driven (e.g. images) • Good for deep learning models – Explicit prior • Rule driven (e.g. cars cannot fly) • Good for probabilistic models • The road ahead – An image based problem with strong explicit priors
TUSIMPLE CHALLENGES! WORKSHOP@CVPR 2017
TuSimple Challenge 1: Lane challenge
TuSimple Challenge 1: Lane challenge • Deep learning for lane? – Parametrization of pixels • Strong structure priors – ~ 3.75m lane width – Parallel lines – (almost) flat road surface • Over-representing corner cases – 20% hard cases (heavy occlusion/strong light condition change/bad markings) are unlikely to occur if sampled uniformly
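Over-representing corner cases amounts to stratified sampling: draw hard and easy clips from separate pools so that the hard share is fixed in advance instead of being left to chance. A minimal sketch, assuming the clips have already been tagged easy or hard; the function name `sample_with_hard_fraction` and the pool sizes are hypothetical and do not describe how the actual dataset was built.

```python
import random


def sample_with_hard_fraction(easy, hard, total, hard_fraction=0.2, seed=0):
    """Pick `total` clips so that roughly `hard_fraction` of them are hard."""
    rng = random.Random(seed)
    n_hard = round(total * hard_fraction)
    n_easy = total - n_hard
    if n_hard > len(hard) or n_easy > len(easy):
        raise ValueError("not enough clips in one of the pools")
    # Sample without replacement from each pool, then mix the two groups.
    picked = rng.sample(hard, n_hard) + rng.sample(easy, n_easy)
    rng.shuffle(picked)
    return picked


if __name__ == "__main__":
    # Hypothetical raw footage: hard cases are only 2% of what was recorded.
    easy_pool = [("easy", i) for i in range(980)]
    hard_pool = [("hard", i) for i in range(20)]
    subset = sample_with_hard_fraction(easy_pool, hard_pool, total=50)
    n_hard = sum(1 for kind, _ in subset if kind == "hard")
    print(f"{n_hard} of {len(subset)} sampled clips are hard")
```

With uniform sampling from the same pools, a 50-clip subset would contain about one hard clip on average; fixing the fraction makes the benchmark stress the cases where lane detectors actually fail.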
TuSimple Challenge 2: Velocity estimation • Representing the world with cam + LiDAR
TuSimple Challenge 2: Velocity estimation • Object-level representation for motion planning – Stereo map? – SLAM? – Estimation based on bbox size? • LiDAR vs Camera – No LiDAR solution for 200m perception
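The "estimation based on bbox size" option can be sketched with the pinhole camera model: an object of known physical height H that appears h pixels tall under a focal length of f pixels is at distance roughly f·H/h, and differencing two such distances over time gives a relative velocity along the viewing direction. The focal length, the assumed vehicle height, and the function names below are illustrative assumptions, not values from the challenge.

```python
def distance_from_bbox(bbox_height_px, focal_length_px, object_height_m=1.5):
    """Pinhole estimate of distance: f * H / h.

    Assumes the object's true height is known (1.5 m is an arbitrary
    stand-in for a passenger car) and the box tightly fits the object.
    """
    if bbox_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return focal_length_px * object_height_m / bbox_height_px


def relative_velocity(h_prev_px, h_curr_px, dt_s, focal_length_px,
                      object_height_m=1.5):
    """Finite-difference range rate in m/s; positive means moving away."""
    if dt_s <= 0:
        raise ValueError("time step must be positive")
    d_prev = distance_from_bbox(h_prev_px, focal_length_px, object_height_m)
    d_curr = distance_from_bbox(h_curr_px, focal_length_px, object_height_m)
    return (d_curr - d_prev) / dt_s


if __name__ == "__main__":
    f_px = 1000.0  # hypothetical focal length in pixels
    print(distance_from_bbox(30.0, f_px))             # box 30 px tall
    print(relative_velocity(30.0, 25.0, 0.5, f_px))   # box shrinks in 0.5 s
```

Under these assumed values a car at 200 m is only 7.5 px tall, so a one-pixel error in the box height shifts the distance estimate by tens of meters. That sensitivity is one reason the challenge expects more than a per-frame bbox heuristic.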
TuSimple challenges • Video clip based – We expect non-trivial temporal aggregation! • Confidence based – Each entry has a “confidence” field – We evaluate the most confident 80% of entries • Run-time – Must report single GPU runtime speed – Slow algorithms (< 3 fps) will not be included in the leaderboard
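The confidence and run-time rules can be sketched as follows. This is not the official evaluation script: the per-entry `error` field, the mean-error metric, the rounding of the 80% cutoff, and the function names are all assumptions made for illustration.

```python
def evaluate_top_confident(entries, keep_fraction=0.8):
    """Mean error over the most confident `keep_fraction` of entries.

    Each entry is a dict with a 'confidence' field (as on the slide) and
    an 'error' field (assumed here; the real metric may differ).
    """
    if not entries:
        raise ValueError("no entries to evaluate")
    ranked = sorted(entries, key=lambda e: e["confidence"], reverse=True)
    # Assumption: round the cutoff down, but always keep at least one entry.
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:n_keep]
    return sum(e["error"] for e in kept) / len(kept)


def eligible_for_leaderboard(fps, min_fps=3.0):
    """Slow algorithms (< 3 fps on a single GPU) are left off the leaderboard."""
    return fps >= min_fps


if __name__ == "__main__":
    submission = [
        {"confidence": 0.9, "error": 1.0},
        {"confidence": 0.8, "error": 2.0},
        {"confidence": 0.7, "error": 3.0},
        {"confidence": 0.6, "error": 4.0},
        {"confidence": 0.1, "error": 100.0},  # the model knows it is unsure here
    ]
    print(evaluate_top_confident(submission))
    print(eligible_for_leaderboard(fps=2.5))
```

Scoring only the most confident entries rewards a model for knowing when it is unsure: the low-confidence outlier above is excluded instead of dominating the mean.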
Available now!! HTTP://BENCHMARK.TUSIMPLE.AI
Xiaodi Hou