
TensorRT Optimizations for Embedded Facial Recognition



  1. TensorRT Optimizations for Embedded Facial Recognition
     Alexey Kadeishvili, CTO, Vocord

  2. Vocord Company: Main Facts
     ■ Developer of video surveillance and video analytics systems since 1999
     ■ Deep expertise in facial recognition
     ■ Top-rated in NIST and MegaFace face recognition tests
     ■ NVIDIA Metropolis program member
     Our customers and partners
     www.vocord.com

  3. Notable figures
     ■ 250+ projects for the public and private sectors
     ■ 140 million faces in the enrollment database of a single project
     ■ 200,000 cameras managed by VOCORD video analysis software
     ■ 350,000 API requests per month to the VOCORD FaceMatica cloud
     ■ Geography: Europe, Middle East, SE Asia, East Asia, Latin America, Oceania

  4. Face recognition products
     ■ VOCORD FaceControl: “Faces in the crowd” FR system
     ■ VOCORD FaceMatica: face recognition engine in a cloud
     ■ Face Recognition SDK: face recognition engine SDK
     ■ VOCORD NetCam nano: new generation face recognition camera
     ■ VOCORD NanoFace: NVIDIA Jetson-based embedded face recognition solution
     ■ VOCORD FaceControl 3D: free flow 3D facial recognition
     All products support NVIDIA GPUs

  5. Main Factors Impacting Facial Recognition
     ■ Inbound image quality
     ■ Enrollment DB quality: something beyond control
     ■ Recognition engine: already works as in the Marvel movies

  6. VOCORD Facial Recognition Engine
     ■ Top in the MegaFace FaceScrub Open Challenge, 2015-2018, with accuracy 91.76%
     ■ Top in the NIST Face Recognition Vendor Test, 2016-2018: TPR at FPR 10^-4 = 98.7%, TPR at FPR 10^-6 = 96.6%
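
The TPR-at-FPR figures on this slide follow the usual verification-metric definition: choose the score threshold at which only the stated fraction of impostor comparisons is accepted, then report the share of genuine comparisons that pass it. A minimal Python sketch under that assumption follows; the function name and the toy scores are illustrative and come neither from the talk nor from NIST tooling.

```python
def tpr_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Return (tpr, threshold) at the lowest threshold whose FPR <= target_fpr.

    Scores are similarities: higher means "more likely the same person".
    A comparison is accepted when score > threshold.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # Number of impostor comparisons we are allowed to accept.
    allowed = int(target_fpr * len(impostors))
    if allowed >= len(impostors):
        threshold = float("-inf")  # every comparison may be accepted
    else:
        # With a strict ">" test, at most `allowed` impostors score above this.
        threshold = impostors[allowed]
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return accepted / len(genuine_scores), threshold


# Toy data (hypothetical): 4 genuine and 10 impostor comparisons.
genuine = [0.91, 0.85, 0.78, 0.40]
impostor = [0.10, 0.20, 0.30, 0.35, 0.45, 0.50, 0.55, 0.60, 0.65, 0.80]
tpr, thr = tpr_at_fpr(genuine, impostor, target_fpr=0.1)
```

On the toy data one impostor in ten may pass, so the threshold lands at 0.65 and three of the four genuine pairs are accepted (TPR 0.75). Measuring an FPR as low as 10^-6 requires well over a million impostor comparisons, which is why such figures come from large benchmark sets.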

  7. Cross Nation Invariance
     Source: NIST Face Recognition Vendor Test, 2018

  8. Pose Invariance
     [Chart: FRR (0 to 0.25) vs. FAR (1.E-07 to 1.E00), one curve per pose group, enrollment DB <30˚]
     ■ Group 1: <10˚
     ■ Group 2: 10 ÷ 30˚
     ■ Group 3: 30 ÷ 45˚
     ■ Group 4: 45 ÷ 60˚
     ■ Group 5: >60˚
     ■ Additional curve: >60˚ with enrollment DB >60˚

  9. Image Resolution Impact
     [Chart: true identification rate at FAR = 10^-4 (0.7 to 1.0) vs. pixels between eyes, L (12 to 72)]
     ■ Recommended minimum: L = 24 pix
     ■ Optimal resolution: L = 48 pix
     L is the distance between the eyes, in pixels

  10. How to improve recognition?
      ■ Inbound image quality: the quality of acquired face images is the point of growth
      ■ Enrollment DB quality: something beyond control
      ■ Recognition engine: already works as in the Marvel movies

  11. Different types of test datasets
      Source: NIST FRVT Report 2017 10 03

  12. “Controlled” dataset
      Panels: Algorithm A, Algorithm B
      Source: NIST FRVT Report 2017 10 03

  13. “Uncontrolled” dataset
      Panels: Algorithm A, Algorithm B
      Source: NIST FRVT Report 2017 10 03

  14. Controlled vs. Uncontrolled (FRR log scale)
      [Chart: FRR vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment]

  15. Controlled vs. Uncontrolled (linear scale)
      [Chart: FRR (0.1 to 0.7) vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment]

  16. Hit the bottom: Images from IP camera

  17. The Advantages of Edge Video Analysis
      ■ Face recognition onboard
      ■ No compression artifacts: the image is taken directly from the sensor
      ■ Dynamic Region of Interest for every intelligent algorithm
      ■ Algorithm adjustment for a particular camera setup
      Pictured: VOCORD NetCam.AI edge video analytics camera

  18. Video Enhancement Onboard
      Dynamic ROI enhances the quality of the image in the face area
      Sample images: backlight, no enhancement; 12-bit image with static ROI; 12-bit image with dynamic ROI
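
The slide does not say how the camera enhances the face area, so the following is only a stand-in to illustrate the idea of a dynamic ROI: a linear stretch from 12 bits to 8 bits whose black and white levels are measured inside the region of interest instead of over the whole frame. The function name, the ROI tuple layout and the toy frame are assumptions made for this sketch.

```python
def stretch_to_8bit(frame, roi):
    """Map a 12-bit grayscale frame (list of rows, values 0..4095) to 8 bits.

    Black and white levels are taken from the pixels inside `roi`, given as
    (top, left, bottom, right) with bottom/right exclusive. The ROI therefore
    receives the full 8-bit range, while pixels outside it may clip.
    """
    top, left, bottom, right = roi
    roi_pixels = [v for row in frame[top:bottom] for v in row[left:right]]
    lo, hi = min(roi_pixels), max(roi_pixels)
    span = max(hi - lo, 1)  # avoid division by zero on a flat ROI
    return [
        [min(255, max(0, (v - lo) * 255 // span)) for v in row]
        for row in frame
    ]


# Toy backlit scene (hypothetical): bright background, dark face in the middle.
frame = [
    [4000, 4000, 4000, 4000],
    [4000, 100, 200, 4000],
    [4000, 300, 150, 4000],
]
static = stretch_to_8bit(frame, (0, 0, 3, 4))   # levels from the whole frame
dynamic = stretch_to_8bit(frame, (1, 1, 3, 3))  # levels from the face area
```

With levels taken from the whole frame, the face pixels are squeezed into output values 0 to 13. With levels taken from the face area they span 0 to 255, at the price of a clipped background.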

  19. VOCORD NetCam.AI HW Features
      ■ High quality sensor
      ■ Automated lens control
      ■ NVIDIA Jetson TX1 GPU

  20. VOCORD NetCam.AI Tech Specs
      Camera specs
      ■ Resolution: 3 ÷ 5 Mpix
      ■ Temperature range: -25 °C to +50 °C
      ■ Ingress protection: IP67
      ■ Dimensions: 20x71x150 mm
      ■ Power consumption: 15 W
      Built-in facial recognition engine specs
      ■ Min face resolution for face recognition: 12 pixels between the eyes
      ■ Number of faces detected in one frame: up to 25
      ■ Latency of biometric template extraction: up to 150 ms per face
      ■ Face recognition performance: up to 32 faces/s
      ■ Inference framework: TensorRT

  21. Performance on Different Platforms
      [Bar chart: NVIDIA Jetson TX1, Intel Movidius and Qualcomm Snapdragon 820 on “Shallow”, “Medium” and “Deep” CNNs; vertical axis 0 to 35; bar labels: 32, 19, 12, 9, 6, 4, 2.2, 1.4, 0.9]

  22. Higher FPS Improves Accuracy
      [Chart: FRR (0 to 0.15) vs. FAR (1.E-07 to 1.E-02)]
      ■ Single face: “Deep” CNN, “Medium” CNN, “Shallow” CNN
      ■ Track (multiple faces): “Deep” CNN, “Medium” CNN, “Shallow” CNN
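
One way to see why a track of several faces can beat a single face: if a person is accepted as soon as any frame in the track matches, the track is rejected only when every frame is rejected. The Python sketch below works this out under two loudly stated assumptions that the slide does not confirm: the "any frame matches" fusion rule, and statistically independent frames. Real consecutive frames are correlated, so actual gains are smaller.

```python
def track_frr(single_frame_frr, frames_in_track):
    """False reject rate of a track when ANY matching frame accepts it.

    Assumes independent frames: the track fails only if all frames fail.
    """
    return single_frame_frr ** frames_in_track


def track_far(single_frame_far, frames_in_track):
    """False accept rate of the same rule at an unchanged threshold.

    Every extra frame is another chance for a false match, so FAR grows.
    """
    return 1.0 - (1.0 - single_frame_far) ** frames_in_track


# Hypothetical operating point: 10% FRR and 1e-4 FAR per frame, 3 frames per track.
frr3 = track_frr(0.10, 3)   # about 0.001
far3 = track_far(1e-4, 3)   # about 3e-4
```

The trade-off is visible in the two results: FRR falls by two orders of magnitude while FAR roughly triples, so a fair comparison re-tunes the threshold to hold FAR fixed. A faster network processes more frames of each passing face, which is the link between FPS and accuracy drawn on this slide.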

  23. TensorRT vs. MXNet Performance
      [Bar chart: FPS (0 to 35) for MXNet and TensorRT on “Shallow” CNN, “Medium” CNN and “Very” CNN; bar labels: 32, 19, 18, 12, 10, 6]
      Platform: NVIDIA Jetson TX1

  24. WHAT’S THE PROFIT?

  25. Face recognition systems architectures
      “Traditional” server architecture approach with regular IP cameras
      ■ Data center with many expensive rack servers (95% of processing is here)
      vs.
      Edge analytics system with VOCORD NetCam.AI cameras
      ■ NetCam.AI cameras (95% of processing is here)
      ■ One archive server
      Network links shown: LAN, Wi-Fi; LAN

  26. Cost-Efficiency: 100 Highly Loaded Cameras
      “Traditional” server architecture with IP cameras
      ■ Cameras: USD 500 x 100 = USD 50,000
      ■ Detection: 2 servers, 4xCPU 32 cores each: USD 60,000
      ■ Template extraction: 4 servers, 2 GPU Tesla P40 each: USD 120,000
      ■ Server for matching and archive: USD 10,000
      ■ CAPEX: USD 240,000
      ■ Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
      ■ OPEX: USD 30,000 per year
      vs.
      Edge computing with VOCORD NetCam.AI
      ■ Cameras: USD 2,000 x 100 = USD 200,000
      ■ Server for matching and archive: USD 10,000
      ■ CAPEX: USD 210,000
      ■ Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
      ■ OPEX: USD 2,000 per year
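
The slide's arithmetic can be checked with a short Python sketch. The line items are read so that each column sums to its stated CAPEX (USD 240,000 for the server architecture, USD 210,000 for the edge setup), and the USD 30,000 per year OPEX is assumed to belong to the 7-8 kW server setup. The dictionary layout and the multi-year total are additions for illustration, not part of the talk.

```python
# Line items in USD as read from the slide; column assignment is an assumption.
TRADITIONAL = {
    "capex_items": {
        "cameras": 500 * 100,
        "detection_servers": 60_000,
        "template_extraction_servers": 120_000,
        "matching_and_archive_server": 10_000,
    },
    "opex_per_year": 30_000,
}
EDGE = {
    "capex_items": {
        "cameras": 2_000 * 100,
        "matching_and_archive_server": 10_000,
    },
    "opex_per_year": 2_000,
}


def capex(system):
    """Sum of the one-off purchase costs."""
    return sum(system["capex_items"].values())


def total_cost(system, years):
    """CAPEX plus OPEX accumulated over `years` of operation."""
    return capex(system) + years * system["opex_per_year"]


# Three-year total cost of ownership for both architectures.
three_year = {
    "traditional": total_cost(TRADITIONAL, 3),  # 240,000 + 3 * 30,000 = 330,000
    "edge": total_cost(EDGE, 3),                # 210,000 + 3 * 2,000 = 216,000
}
```

Under these readings the edge setup starts USD 30,000 cheaper, and the gap widens by USD 28,000 with every year of operation.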

  27. WHAT’S NEXT?
      • Uploading various video analytics algorithms
      • Highly customized algorithms
      • Interacting cameras as a part of IoT
      • 3D vision

  28. Open Platform: Easy Algorithm Uploading
      ■ Facial recognition
      ■ Behavioral analysis
      ■ License plate recognition
      ■ Vehicle types
      ■ Emergency cases
      ■ Lost and found objects

  29. Camera-Dependent Algorithm Customization
      Step 1. The camera collects images and uploads them to the server
      Step 2. The neural network is retrained on the server using the new images
      Step 3. The customized, light-weight neural network is uploaded back to the camera

  30. Customization to restricted data
      [Two charts: FRR (0 to 0.04) vs. FAR (log scale from 1.E-07) for a “Deep” and a “Shallow” neural network, one chart for unrestricted data and one for restricted data]
      ■ Deeper DNNs provide better performance on unrestricted data
      ■ On restricted data the difference between deep and shallow networks is negligible

  31. Intercamera Tracking
      Cameras: NetCam.AI #1, NetCam.AI #2
      Labels shown on the tracked person: face, bag, jeans

  32. Obtaining 3D Models
      ■ Building a 3D object from synchronous snapshots from multiple cameras
      ■ Feature preprocessing for conjugate points search

  33. Thank you for your attention! Questions?
      E-mail: sales@vocord.com
      Website: www.vocord.com
