Breakthroughs in Face Recognition Capability via Deep Learning and GPUs
Prof. Neil M. Robertson, Dr. Steven Lu, Dr. Guosheng Hu, Dr. Sankha Mukherjee, Dr. Rolf Baxter, Dr. Yang Hua, Dr. Yuan Yang, Dr. Elyor Qodirov, Soumya Ghosh
Face Recognition that really works
• Face recognition in unconstrained scenes is hard
• Changes in pose and lighting make matching faces from different sources very difficult (e.g. passport to CCTV)
• Recognition, especially at a large scale, is even harder
• Anyvision technology matches millions of identities across the range of appearances
Pipeline
Face Detection -> Face Alignment -> Discriminative Feature Extraction -> Tracking -> DB Search
40x faster on GPU compared to CPU alone
Face Detection in SD -> 4k
Huge impact on the system performance and speed.
▶ High recall and precision in real situations - pose, image quality, illumination
▶ Good methods are not fast enough
  ▶ Off-the-shelf (OTS) DL methods (Faster RCNN … YOLO) are all fully convolutional networks
  ▶ they cannot save any computation via early rejection
▶ Fast methods are not accurate enough
  ▶ Traditional methods, e.g. Haar, cascade method, sliding window
▶ Cascade structure is essential for speed
  ▶ image is large
  ▶ reject false positives at an early stage
  ▶ reduces computation at later stages
Face Detection
Reuse feature map to generate multi-scale proposals (scale1, scale2)
Reuse feature map to filter out false positive detections at an early stage (det->stage1->stage2->out)
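The early-rejection idea can be illustrated with a minimal sketch. It is not the detector described on the slide: the random `features` matrix stands in for a feature map computed once, and the stage definitions (how many feature dimensions each stage reads, and the thresholds) are hypothetical. It only shows how each stage scores the survivors of the previous one, so later, more expensive stages see fewer candidates.

```python
import numpy as np

def run_cascade(features, stages):
    """Filter candidates stage by stage, reusing one shared feature matrix.

    features: (N, D) array, one row per candidate, computed once.
    stages:   list of (n_dims, threshold); a stage scores a candidate by
              summing its first n_dims features (toy scoring, an assumption).
    Returns the indices of surviving candidates and the number of
    candidates each stage had to score.
    """
    alive = np.arange(len(features))
    evaluated = []
    for n_dims, threshold in stages:
        evaluated.append(len(alive))
        scores = features[alive, :n_dims].sum(axis=1)
        alive = alive[scores >= threshold]  # early rejection
    return alive, evaluated

rng = np.random.default_rng(0)
features = rng.normal(size=(10_000, 16))  # stand-in for shared features

# Cheap stage first, most expensive stage last (hypothetical settings).
stages = [(1, 0.0), (4, 0.0), (16, 0.0)]
alive, evaluated = run_cascade(features, stages)
print("candidates scored per stage:", evaluated)
print("final detections:", len(alive))
```

Because the features are computed once and only indexed by the surviving candidates, the cost of the later stages shrinks with every rejection.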
8 ms on Quadro/Tesla (at HD)
Data, algorithms and computation must be treated holistically
Data collection and cleanup
● Use a small number of clean annotations from images and videos covering a wide range of pose, lighting and resolution
● Dynamic clustering and ranking to clean up millions of images and videos using the latest best trained model
● Image attributes and probabilistic graphical models used to quantify weaknesses of the features from the net
● Invariant to pose, lighting, facial hair, hairstyle, glasses
● Male vs. Female distinct
● Virtuous cycle of data cleanup using the current best net and the graphical model
3D Data Augmentation
● Decompose non-linear (non-convex) cost functions into linear (convex) ones
● Closed-form solutions
● Perspective camera
● Phong illumination model
● Contour landmarks
● Efficient and accurate
Generative approaches to missing data
• Generator and Discriminator autoencoders
• Loss function is the difference of the encoding loss distribution between the original image and the generated image
• Loss = F(|A(i_real) - A(i_gen)|), where A = encoding loss of the discriminator, i_real = real image, i_gen = generated image
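The loss above can be written out as a small sketch. The slide does not define F or the exact form of A, so two assumptions are made here: A is taken to be the mean absolute reconstruction error of the discriminator autoencoder, and F defaults to the identity. The `toy_autoencoder` function is a hypothetical stand-in for a trained network.

```python
import numpy as np

def encoding_loss(autoencoder, image):
    """A(i): mean absolute reconstruction error (assumed form of A)."""
    return float(np.mean(np.abs(image - autoencoder(image))))

def generative_loss(autoencoder, i_real, i_gen, F=lambda d: d):
    """Loss = F(|A(i_real) - A(i_gen)|); F is the identity by assumption."""
    a_real = encoding_loss(autoencoder, i_real)
    a_gen = encoding_loss(autoencoder, i_gen)
    return F(abs(a_real - a_gen))

def toy_autoencoder(x):
    # Hypothetical discriminator: reconstructs 90% of the input intensity.
    return 0.9 * x

i_real = np.ones((4, 4))
i_gen = 0.5 * np.ones((4, 4))
loss = generative_loss(toy_autoencoder, i_real, i_gen)
print(loss)
```

The loss is zero when the discriminator reconstructs real and generated images equally well, which is the matching condition the slide describes.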
Generative Approach - Face Enhancement
[Figure columns: Input, Ground Truth, Generated]
Attribute feature fusion
Tensor Fusion- Gated Two Stream Networks Optimisation Tucker Decomposition
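The slide names Tucker decomposition without further detail. As a reminder of what the factorisation looks like, here is a minimal sketch using truncated higher-order SVD, one standard way to compute a Tucker approximation. It is not necessarily the optimisation used in the fusion network; the tensor shape and ranks are arbitrary illustrative choices.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricise the tensor so that `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_dot(tensor, matrix, mode):
    """Multiply the tensor by a matrix along the given mode."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor, ranks):
    """Tucker approximation: a small core tensor plus one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each mode unfolding.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand the core back to the original tensor shape."""
    tensor = core
    for mode, u in enumerate(factors):
        tensor = mode_dot(tensor, u, mode)
    return tensor

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 5, 6))  # stand-in for a fusion weight tensor

core, factors = hosvd(weights, ranks=(2, 3, 3))
approx = reconstruct(core, factors)
print("core shape:", core.shape, "approximation shape:", approx.shape)
```

Storing the core and the factor matrices instead of the full tensor reduces the parameter count when the ranks are small, which is the usual motivation for using a Tucker form in fusion layers.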
Feature Invariance - Data and Algorithmic Harmony
[Figure panels: Blurry (Blue) vs. Not Blurry (Red); Glasses (Blue) vs. No Glasses (Red); Goatee; Moustache; Facial Hair (Blue) vs. No Facial Hair (Red)]
Our Nets Contain Attributes
[Figure panels: Female (Red) vs. Male (Blue); Young (Blue) vs. Not Young (Red)]
DB Search
Using GPU
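The deck gives no detail of the search itself. A common baseline, shown here purely as an assumption, is brute-force cosine similarity between query features and a gallery of enrolled features. The feature dimension, gallery size and noise level below are hypothetical, and NumPy on the CPU stands in for a GPU array library.

```python
import numpy as np

def normalise(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def search(gallery, queries, k=1):
    """Return indices and similarities of the k best gallery matches per query."""
    sims = normalise(queries) @ normalise(gallery).T  # one dense matrix product
    top = np.argsort(-sims, axis=1)[:, :k]
    return top, np.take_along_axis(sims, top, axis=1)

rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 128))  # hypothetical enrolled features

# Queries are noisy copies of gallery entries 3 and 42.
queries = gallery[[3, 42]] + 0.01 * rng.normal(size=(2, 128))
top, scores = search(gallery, queries, k=1)
print(top.ravel(), scores.ravel())
```

The similarity computation is a single dense matrix product, which is the kind of operation that typically benefits from GPU acceleration.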
Real World Testimony
● We are deployed right now all over the world
● The system runs in real time on 5-megapixel video streams
● We run 10+ cameras on a single commercial GPU
  A workstation can support multiple GPUs … scale up
● From a current deployment in high-profile, high-security locations:
  120K people pass through the system every 3 days, all in real time
  50 people in the watchlist
  99.9% recognition accuracy
  <1 false positive per day
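The deployment figures imply a very low false-positive rate, which the following back-of-the-envelope calculation makes explicit. It assumes, for illustration only, that each person is matched once against every watchlist entry; in practice a person appears in many video frames, so the real number of comparisons is higher.

```python
people_per_period = 120_000      # from the slide
period_days = 3                  # from the slide
watchlist_size = 50              # from the slide
max_false_positives_per_day = 1  # slide states fewer than 1 per day

people_per_day = people_per_period / period_days
# Assumption: one comparison per person per watchlist entry.
comparisons_per_day = people_per_day * watchlist_size

fp_rate_per_person = max_false_positives_per_day / people_per_day
fp_rate_per_comparison = max_false_positives_per_day / comparisons_per_day

print(f"people per day:        {people_per_day:,.0f}")
print(f"comparisons per day:   {comparisons_per_day:,.0f}")
print(f"FP rate per person     < {fp_rate_per_person:.1e}")
print(f"FP rate per comparison < {fp_rate_per_comparison:.1e}")
```

Under that assumption, fewer than 1 false positive per day corresponds to fewer than 1 false alarm in 40,000 people, or fewer than 1 in 2,000,000 comparisons.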
Pose Invariant Recognition NIST rank #1 IJB-A
Results- Pose
Illumination and Expression
NIR Cross-Modality
Labeled Faces in the Wild
“The numbers make it tactical”
We are hiring in the UK
Algorithms engineers
Machine learning specialists
Hardware experts - GPU/FPGA
info@anyvision.co.uk