
Breakthroughs in Face Recognition Capability via Deep Learning and GPUs



  1. Breakthroughs in Face Recognition Capability via Deep Learning and GPUs. Prof. Neil M. Robertson, Dr. Steven Lu, Dr. Guosheng Hu, Dr. Sankha Mukherjee, Dr. Rolf Baxter, Dr. Yang Hua, Dr. Yuan Yang, Dr. Elyor Qodirov, Soumya Ghosh

  2. Face Recognition that really works
     • Face recognition in unconstrained scenes is hard
     • Changes in pose and lighting make matching faces from different sources very difficult (e.g. passport to CCTV)
     • Recognition, especially at a large scale, is even harder
     • Anyvision technology matches millions of identities across the range of appearances

  3. Pipeline: Face Detection -> Face Alignment -> Discriminative Feature Extraction -> Tracking -> DB Search. 40x faster on GPU compared to CPU alone.

  4. Face Detection in SD -> 4k
     Detection has a huge impact on system performance and speed.
     ▶ High recall and precision are needed in real situations: pose, image quality, illumination
     ▶ Good methods are not fast enough
       ▶ Off-the-shelf (OTS) DL methods (Faster R-CNN … YOLO) are all fully convolutional networks
       ▶ They cannot save any computation via early rejection
     ▶ Fast methods are not accurate enough
       ▶ Traditional methods, e.g. Haar, cascade method, sliding window
     ▶ Cascade structure is essential for speed
       ▶ The image is large
       ▶ Rejecting false positives at an early stage reduces computation at later stages

  5. Face Detection
     ▶ Reuse the feature map to generate multi-scale proposals (scale1, scale2)
     ▶ Reuse the feature map to filter out false positive detections at an early stage (det -> stage1 -> stage2 -> out)
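The early-rejection idea on slides 4 and 5 can be illustrated with a toy cascade. This is only a sketch under stated assumptions: the stage functions, thresholds, costs and the single "faceness" value per window are all hypothetical, since the slides do not say what each stage computes. It shows only why rejecting candidates early reduces the work done by later, more expensive stages.

```python
from typing import Callable, List, Sequence, Tuple

# Each stage is (score_fn, threshold, cost_per_candidate).
# All three are hypothetical stand-ins, not the presenters' actual stages.
Stage = Tuple[Callable[[float], float], float, int]


def run_cascade(candidates: Sequence[float],
                stages: List[Stage]) -> Tuple[List[float], int]:
    """Return the candidates that survive every stage and the total cost spent."""
    survivors = list(candidates)
    total_cost = 0
    for score_fn, threshold, cost in stages:
        # Only candidates that survived earlier stages are scored here.
        total_cost += cost * len(survivors)
        survivors = [c for c in survivors if score_fn(c) >= threshold]
    return survivors, total_cost


# Toy data: each candidate window is reduced to one made-up "faceness" value.
windows = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80, 0.90, 0.95]
stages = [
    (lambda x: x, 0.30, 1),    # cheap stage, loose threshold
    (lambda x: x, 0.70, 10),   # mid-cost stage
    (lambda x: x, 0.85, 100),  # expensive stage, strict threshold
]

faces, cascade_cost = run_cascade(windows, stages)

# Cost of running the expensive stage alone on every window, for comparison.
flat_cost = 100 * len(windows)

print(faces, cascade_cost, flat_cost)
```

In this toy setting the cascade spends 358 cost units against 800 for the flat approach, because only three of the eight windows ever reach the expensive stage.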

  6. 8 ms on Quadro/Tesla (at HD)

  7. Data, algorithms and computation must be treated holistically

  8. Data collection and cleanup
     ● Use a small number of clean annotations from images and videos covering a wide range of pose, lighting and resolution
     ● Dynamic clustering and ranking to clean up millions of images and videos using the latest best trained model
     ● Image attributes and probabilistic graphical models used to quantify weaknesses of the features from the net
     ● Invariant to pose, lighting, facial hair, hairstyle, glasses
     ● Male vs. female distinct
     ● Virtuous cycle of data cleanup using the current best net and the graphical model
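The ranking step of the cleanup loop can be sketched as follows. This is a simplified stand-in, not the presenters' dynamic clustering: it assumes each image already has an embedding from the current best model, ranks the images under one identity label by cosine similarity to that label's centroid, and keeps those above a threshold. The function names, the 2-D embeddings and the threshold are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def centroid(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def rank_by_centroid(embeddings: Dict[str, List[float]]) -> List[Tuple[float, str]]:
    """Rank images of one identity, most similar to the centroid first."""
    c = centroid(list(embeddings.values()))
    return sorted(((cosine(v, c), name) for name, v in embeddings.items()), reverse=True)


def clean(embeddings: Dict[str, List[float]], min_similarity: float) -> List[str]:
    """Keep only images whose similarity to the centroid reaches the threshold."""
    return [name for sim, name in rank_by_centroid(embeddings) if sim >= min_similarity]


# Made-up 2-D embeddings for images sharing one identity label;
# "noise" plays the role of a mislabelled image.
images = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.95, 0.05],
    "noise": [0.0, 1.0],
}

kept = clean(images, min_similarity=0.8)
print(kept)
```

Retraining on the kept images and re-embedding with the improved model would give one turn of the virtuous cycle the slide describes.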

  9. 3D Data Augmentation
     ● Decompose non-linear (non-convex) cost functions into linear (convex) ones
     ● Closed-form solutions
     ● Perspective camera
     ● Phong illumination model
     ● Contour landmarks
     ● Efficient and accurate

  10. 3D Data Augmentation

  11. 3D Data Augmentation

  12. Generative approaches to missing data
      • Generator and Discriminator autoencoders
      • The loss function is the difference of the encoding loss distribution between the original image and the generated image
      • Loss = F(|A(i_real) - A(i_gen)|), where A = encoding loss of the discriminator, i_real = real image, i_gen = generated image
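The loss on slide 12 can be written out as a small sketch. The slide does not define F or the exact form of the encoding loss A, so this example assumes A is the mean absolute reconstruction error of the discriminator autoencoder and F is the identity. The toy autoencoder, the images (flat lists of pixel values) and the function names are hypothetical.

```python
from typing import Callable, List

Image = List[float]


def encoding_loss(autoencoder: Callable[[Image], Image], image: Image) -> float:
    """A(i): mean absolute reconstruction error (an assumed form of the encoding loss)."""
    reconstruction = autoencoder(image)
    return sum(abs(p - q) for p, q in zip(image, reconstruction)) / len(image)


def generative_loss(autoencoder: Callable[[Image], Image],
                    i_real: Image,
                    i_gen: Image,
                    F: Callable[[float], float] = lambda d: d) -> float:
    """Loss = F(|A(i_real) - A(i_gen)|); F defaults to the identity (an assumption)."""
    return F(abs(encoding_loss(autoencoder, i_real) - encoding_loss(autoencoder, i_gen)))


def toy_autoencoder(image: Image) -> Image:
    """Hypothetical discriminator: reconstructs pixels by clamping them to [0, 1]."""
    return [min(1.0, max(0.0, p)) for p in image]


i_real = [0.2, 0.5, 0.9, 1.0]  # reconstructed exactly, so A(i_real) = 0
i_gen = [0.2, 0.5, 1.3, 1.5]   # two out-of-range pixels are reconstructed poorly

loss = generative_loss(toy_autoencoder, i_real, i_gen)
print(loss)
```

The loss falls to zero when the discriminator reconstructs generated images as well as it reconstructs real ones.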

  13. Generative Approach: Face Enhancement (figure panels: Input, Ground Truth, Generated)

  14. Attribute feature fusion

  15. Tensor Fusion: Gated Two-Stream Networks. Optimisation: Tucker Decomposition
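One reason to apply a Tucker decomposition to a fusion tensor is parameter count. The slide gives no dimensions, so the sizes below are purely hypothetical. The sketch compares a full three-way tensor of shape d1 x d2 x d3 with its Tucker form: a small core of shape r1 x r2 x r3 plus one factor matrix of shape d_k x r_k per mode.

```python
import math
from typing import Sequence


def full_tensor_params(dims: Sequence[int]) -> int:
    """Number of entries in a dense tensor with the given mode sizes."""
    return math.prod(dims)


def tucker_params(dims: Sequence[int], ranks: Sequence[int]) -> int:
    """Entries in a Tucker factorisation: core tensor plus one factor matrix per mode."""
    core = math.prod(ranks)
    factors = sum(d * r for d, r in zip(dims, ranks))
    return core + factors


# Hypothetical sizes: two 256-d feature streams fused into a 128-d output.
dims = (256, 256, 128)
ranks = (16, 16, 16)

full = full_tensor_params(dims)
compact = tucker_params(dims, ranks)
print(full, compact, round(full / compact, 1))
```

With these made-up sizes the dense tensor has 8,388,608 entries and the Tucker form has 14,336.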

  16. Feature Invariance – Data and Algorithmic Harmony (figure panels: Blurry (blue) vs. Not Blurry (red); Glasses (blue) vs. No Glasses (red); Goatee; Moustache; Facial Hair (blue) vs. No Facial Hair (red))

  17. Our Nets Contain Attributes (figure panels: Female (red) vs. Male (blue); Young (blue) vs. Not Young (red))

  18. DB Search
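The slide gives no detail on how the database search is done. As a common baseline only, and not necessarily the presenters' method, extracted face features can be compared with stored identities by cosine similarity and the top matches returned. The gallery, the probe vector and the function names below are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Tuple


def normalize(v: List[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def search(gallery: Dict[str, List[float]],
           probe: List[float],
           k: int = 1) -> List[Tuple[float, str]]:
    """Brute-force top-k search: returns (cosine similarity, identity), best first."""
    q = normalize(probe)
    scored = (
        (sum(a * b for a, b in zip(normalize(vec), q)), identity)
        for identity, vec in gallery.items()
    )
    return heapq.nlargest(k, scored)


# Made-up 3-D features standing in for the network's face embeddings.
gallery = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [0.6, 0.8, 0.0],
}
probe = [0.9, 0.1, 0.0]

top2 = [identity for _, identity in search(gallery, probe, k=2)]
print(top2)
```

Every probe is scored against every gallery entry, so the work grows linearly with gallery size, and each score is an independent dot product that can be computed in parallel.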

  19. Using GPU

  20. Real World Testimony
      ● We are deployed right now all over the world
      ● The system runs in real time on 5 megapixel video streams
      ● We run 10+ cameras on a single commercial GPU; a workstation can support multiple GPUs … scale up
      ● From a current deployment in high-profile, high-security locations:
        ● 120K people pass through the system every 3 days, all in real time
        ● 50 people in the watchlist
        ● 99.9% recognition accuracy
        ● <1 false positive per day
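The deployment figures above imply a bound on the per-comparison false match rate. The sketch below works through the arithmetic under one simplifying assumption that is not stated on the slide: each person passing through is compared once against every watchlist entry. A real system compares many frames per person, so the true number of comparisons would be higher.

```python
# Figures taken from the slide.
people_per_3_days = 120_000
watchlist_size = 50
max_false_positives_per_day = 1  # the slide states "<1 false positive per day"

# Assumption: one comparison per person per watchlist entry.
people_per_day = people_per_3_days // 3
comparisons_per_day = people_per_day * watchlist_size

# Upper bound on the false match rate per comparison implied by the figures.
false_match_rate_bound = max_false_positives_per_day / comparisons_per_day

print(people_per_day, comparisons_per_day, false_match_rate_bound)
```

Under that assumption, 40,000 people per day against 50 watchlist entries gives 2,000,000 comparisons per day, so fewer than one false positive per day corresponds to a false match rate below 5 in 10 million comparisons.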

  21. Pose Invariant Recognition: NIST rank #1 on IJB-A

  22. Results: Pose

  23. Illumination and Expression

  24. NIR Cross-Modality

  25. Labeled Faces in the Wild

  26. “The numbers make it tactical”

  27. We are hiring in the UK: algorithms engineers, machine learning specialists, hardware experts (GPU/FPGA). Contact: info@anyvision.co.uk
