

  1. Distracted Driver Detection
     Can computer vision spot distracted drivers?
     By: Cesar Hiersemann

  2. Image understanding is hard!
     • "Easy for humans, hard for computers"
     • Relevant XKCD (posted in 2014): http://xkcd.com/1425/

  3. Outline
     • Problem introduction
     • Theory
       – Neural networks
       – ConvNets
       – Deep pre-trained networks, with example
     • My approach
     • Challenges
     • Results

  4. Distracted Drivers competition [1]
     • Kaggle: data science competitions
     • Dataset:
       – Over 100 000 images (>4 GB)
       – 100 persons performing 10 different actions (next slides)
       – Labelled training set with ~20K images, test set with ~80K
     • Task: label the test set with probabilities for each class
     • Evaluation by multi-class logloss (sketch below):

       L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij})

     [1]: https://www.kaggle.com/c/state-farm-distracted-driver-detection
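Here y_ij is 1 if image i belongs to class j (0 otherwise) and p_ij is the submitted probability for that class. A minimal NumPy sketch of the metric, assuming Kaggle's usual clipping of predictions away from exact 0 and 1:

```python
import numpy as np

def multiclass_logloss(y, p, eps=1e-15):
    """y: one-hot true labels (N x M); p: predicted probabilities (N x M)."""
    p = np.clip(p, eps, 1 - eps)   # keep log() finite, as Kaggle's scorer does
    return -np.mean(np.sum(y * np.log(p), axis=1))

# Tiny example: 2 images, 3 classes
y = np.array([[1, 0, 0], [0, 1, 0]])
p = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(multiclass_logloss(y, p))   # ~0.29
```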

  5. Action classes
     • C0: Driving safely
     • C1: Texting (right)
     • C2: Talking (right)
     • C3: Texting (left)

  6. Action classes cont.
     • C4: Talking (left)
     • C5: Operating radio
     • C6: Drinking
     • C7: Reaching back

  7. Action classes cont.
     • C8: Hair and makeup
     • C9: Talking to passenger

  8. Neural networks
     • One node with sigmoid activation = logistic regression (sketch below)
     • Many nodes/layers → learn complex input/output relations with cheap operations
     • Demo [2]
     [2]: TensorFlow Playground: http://playground.tensorflow.org/
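To make the first bullet concrete, a minimal sketch (the weights and input are made up): a single node computing sigmoid(w·x + b) is exactly binary logistic regression.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # sigmoid(w . x + b): identical to binary logistic regression
    return sigmoid(np.dot(w, x) + b)

w = np.array([1.5, -2.0])   # made-up weights for a 2-feature input
b = 0.3
print(neuron(np.array([0.8, 0.1]), w, b))  # P(positive class | x)
```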

  9. ConvNets
     • Convolution ("faltning" in Swedish) appears in:
       – Image analysis
       – Signal processing (Fourier/Laplace transforms)
     • Filters on images, e.g.:
       – Gaussian blur
       – Sharpening
       – Edge detection
     • ConvNets include convolutional layers (sketch below)
     [Figure: sharpening filter]
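A short sketch of the "filters on images" idea, using SciPy's convolve2d with a standard 3x3 sharpening kernel (the image here is random stand-in data):

```python
import numpy as np
from scipy.signal import convolve2d

# Standard 3x3 sharpening kernel
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])

image = np.random.rand(8, 8)   # stand-in for a grayscale image
sharpened = convolve2d(image, sharpen, mode='same', boundary='symm')

# A convolutional layer computes the same operation, but learns
# the kernel values from data instead of fixing them by hand.
```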

  10. Deep ConvNet, VGG16 [3]
      • 16 weight layers: 13 convolutional + 3 fully connected ("normal") layers
      • >138 million parameters
      • 2-3 weeks to train on the ImageNet database (1.3 million images from 1000 classes)
      [Figure: VGG16 architecture]
      [3]: VGG-16 network: http://arxiv.org/abs/1409.1556

  11. VGG16 Demo
      • Giant panda image from Hong Kong Zoo
      • VGG16 output: 99.9999% confidence in class 388: giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca
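The slides don't say which framework was used; as a sketch, Keras ships this exact pre-trained VGG16, so the panda demo can be reproduced roughly like this (the image filename is hypothetical):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import (
    VGG16, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = VGG16(weights='imagenet')   # pre-trained on ImageNet

img = image.load_img('panda.jpg', target_size=(224, 224))  # hypothetical file
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 (class, name, probability)
```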

  12. Back to the drivers!
      • Use pre-trained VGG16 to extract feature vectors from the images (sketch below)
      • Use the first fully connected layer after the convolutions, which produces a 4096-dimensional vector
      • Every image takes 0.5 s to process → ~20 h on a laptop
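A sketch of this feature-extraction step, again assuming Keras, where the first fully connected layer after the convolutions is named 'fc1' (the author's actual code may differ):

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model

vgg = VGG16(weights='imagenet')
# 'fc1' is Keras' name for the first fully connected layer (4096 units)
extractor = Model(inputs=vgg.input, outputs=vgg.get_layer('fc1').output)

# features = extractor.predict(preprocessed_images)  # shape (n_images, 4096)
```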

  13. Will it work?
      • Separability of classes
      • Mean output over the different classes seemed to show good variability → good chance of separation
      • Promising!
      [Figure: max activations for class 0 and class 9]

  14. Classification challenges
      • Many similar images taken within short timeframes → prone to overfitting
      • Separate persons in train and test set
      • Network learned person-specifics → bad results on test!
      [Figure: two similar images from C0: safe driving]

  15. Labelled cross-validation
      • To obtain accurate test evaluations, cross-validation is required
      • 26 different persons in the train set
      • Split my training set into 5 folds, each with 5 persons held out from training (sketch below)
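A sketch of this person-level split using scikit-learn's GroupKFold (all data arrays here are stand-ins): each fold holds out whole drivers rather than individual images.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.random.rand(100, 4096)                    # stand-in feature matrix
y = np.random.randint(0, 10, size=100)           # stand-in labels c0-c9
driver_ids = np.random.randint(0, 26, size=100)  # which person is in each image

gkf = GroupKFold(n_splits=5)
for train_idx, val_idx in gkf.split(X, y, groups=driver_ids):
    # No driver appears on both sides of the split, mimicking
    # the competition's separation of persons in train and test
    assert set(driver_ids[train_idx]).isdisjoint(driver_ids[val_idx])
```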

  16. Classification
      • Now I had:
        – a train matrix: 22424 x 4096
        – a test matrix: 79726 x 4096
      • Many approaches to classification:
        – Support vector machine
        – Logistic regression
        – Random forest
        – Decision trees
        – Gradient boosting
      • SVM and logistic regression produced the best results (implemented in scikit-learn; sketch below)
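Since the slide names scikit-learn, a minimal sketch of the two best performers, evaluated with the competition's logloss (the matrices here are small stand-ins for the 22424 x 4096 train data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.svm import SVC

# Small stand-in data; the real matrices were 22424 x 4096 (train)
X_train, y_train = np.random.rand(200, 50), np.random.randint(0, 10, 200)
X_val,   y_val   = np.random.rand(50, 50),  np.random.randint(0, 10, 50)

for clf in (LogisticRegression(max_iter=1000), SVC(probability=True)):
    clf.fit(X_train, y_train)
    p = clf.predict_proba(X_val)   # class probabilities, as Kaggle requires
    print(type(clf).__name__, log_loss(y_val, p, labels=clf.classes_))
```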

  17. Training
      • Used the entire 4096-dimensional feature vector for every image (testing took time!)
      • Regularization:
        – Prevents overfitting by limiting the size of the weights
        – An additional hyperparameter to optimize
      • Finding the right hyperparameters using cross-validation (sketch below)
      [Figure: train (blue) and validation (red) accuracy (top) and logloss (bottom)]
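A sketch of tuning the regularization strength via grouped cross-validation; GridSearchCV and the neg_log_loss scorer are standard scikit-learn, while X, y, and driver_ids refer to the stand-in names from the earlier CV sketch:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, GroupKFold

# Smaller C = stronger regularization in scikit-learn
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={'C': [0.001, 0.01, 0.1, 1, 10]},
    scoring='neg_log_loss',
    cv=GroupKFold(n_splits=5),
)
# search.fit(X, y, groups=driver_ids)   # groups as in the CV sketch above
# print(search.best_params_, -search.best_score_)
```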

  18. Improvements
      • 60-65% accuracy, 1.10 logloss → ~250 on the current leaderboard
      • Wanted fewer features per image
        – Reduces training time → more time to optimize hyperparameters
      • Finding the "right" features for my specific task will greatly reduce overfitting

  19. Dimensionality reduction
      • Which features were the most important?
      • Removed features that coded for person-specifics (one possible sketch below)
      • Ended up with an 887-dimensional feature vector → much faster training/testing and easier on memory
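The slides don't spell out how the 887 features were selected; one plausible sketch of "removing features that code for person-specifics" is to fit a model that predicts the driver and drop the features it relies on most (all data here is made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(200, 4096)               # stand-in VGG16 features
driver_ids = np.random.randint(0, 26, 200)  # stand-in person labels

# How strongly does each feature help identify the *person*?
person_clf = LogisticRegression(max_iter=1000).fit(X, driver_ids)
person_score = np.abs(person_clf.coef_).max(axis=0)

keep = np.argsort(person_score)[:887]   # keep the 887 least person-specific
X_reduced = X[:, keep]                  # shape (200, 887): faster training
```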

  20. Final Results
      • Over 80% accuracy and <0.60 logloss on cross-validation!
      • Sadly nowhere close to the <0.2 logloss at the top of the leaderboard :(

  21. Thanks!
      • Dennis Medved
      • Pierre Nugues
      • Magnus Oskarsson
      • Have a great summer!
