Deep Unconstrained Gaze Estimation with Synthetic Data



  1. DEEP UNCONSTRAINED GAZE ESTIMATION WITH SYNTHETIC DATA Shalini De Mello, Rajeev Ranjan, Jan Kautz

  2. NVIDIA AI CO-PILOT

  3. APPLICATIONS: INTERFACE DESIGN, AR/VR, ACCESSIBILITY

  4. TRADITIONAL GAZE TRACKERS [Diagram labels: fovea, cornea, sclera, pupil, C_E, C_c, Θ, Φ, light source, point of regard, X-axis, Y-axis, Z-axis]

  5. TRADITIONAL GAZE TRACKER

  6. REMOTE, CHEAP, UNCONSTRAINED GAZE TRACKING

  7. CHALLENGES Unconstrained gaze tracking: RESOLUTION

  8. CHALLENGES Unconstrained gaze tracking: RESOLUTION, LIGHTING

  9. CHALLENGES Unconstrained gaze tracking: RESOLUTION, LIGHTING, SUBJECT VARIABILITY

  10. CHALLENGES Unconstrained gaze tracking: RESOLUTION, LIGHTING, SUBJECT VARIABILITY, HEAD ROTATION

  11. APPEARANCE-BASED GAZE ESTIMATION* Gaze tracking (*Zhang et al., IEEE CVPR 2015; Krafka et al., IEEE CVPR 2016.)

  12. LABELED DATA COLLECTION Gaze tracking

  13. LABELED DATA COLLECTION Gaze tracking: Indoors only, Occlusion, Cumbersome

  14. LABELED DATA COLLECTION Gaze tracking

  15. LABELED DATA COLLECTION Gaze tracking: Indoors only, Limited head rotations and gazes

  16. GPU TO THE RESCUE Gaze tracking: Deep Learning, Computer Graphics

  17. GPU TO THE RESCUE Gaze tracking: Deep Learning, Computer Graphics

  18. SYNTHETIC DATA

  19. SYNTHETIC IMAGES Head models

  20. COMPUTER GRAPHICS EYE MODEL* INSERT BLENDER IMAGE (*Wood et al., IEEE ICCV 2015.)

  21. 1 MILLION SYNTHETIC IMAGES

  22. GAZE CNN ARCHITECTURE Zhang et al., 2015 (5 layers) [Diagram labels: gaze pitch, gaze yaw, head pitch, head yaw]

  23. SYNTHETIC IMAGES Results on MPII data

      | AUTHOR | DATA | ERROR (°) |
      | --- | --- | --- |
      | Wood et al., 2015 | UT Multiview 1M | 9.68 |
      | Wood et al., 2016 | UnityEyes 1M | 9.95 |
      | Wood et al., 2015 | SynthesEyes 12K | 8.94 |
      | Ours | SynthesEyes 1M | 7.74 |

  24. EYE POINTS CNN ARCHITECTURE Trained with 1M synthetic data [Diagram labels: x_1, y_1, …, x_n, y_n]

  25. EYE FIDUCIAL POINTS Results on MPII gaze data

  26. GAZE ESTIMATION NETWORK

  27. GAZE CNN ARCHITECTURE Render for CNN (Su et al., 2015)

  28. GAZE CNN ARCHITECTURE Zhang et al., 2015 (5 layers) [Diagram labels: gaze pitch, gaze yaw, head pitch, head yaw]

  29. GAZE CNN ARCHITECTURE Ours (8 layers), Render for CNN [Diagram labels: gaze pitch, gaze yaw, head pitch, head yaw]

  30. GAZE CNN ARCHITECTURE Results on 1M synthetic data

      | NETWORK | INITIALIZATION | ERROR (°) |
      | --- | --- | --- |
      | LeNet | Random | 5.57 |
      | AlexNet | ImageNet (object recognition) | 5.03 |
      | ResNet-50 | ImageNet (object recognition) | 5.07 |
      | Ours | Render for CNN (viewpoint estimation) | 4.4 |

  31. GAZE CNN ARCHITECTURE Inputs and outputs: Render for CNN [Diagram labels: gaze pitch, gaze yaw, head pitch, head yaw]

  32. GAZE CNN ARCHITECTURE Results on 1M synthetic data

      | INPUT | OUTPUT | ERROR (°) |
      | --- | --- | --- |
      | Eye | Eye-in-head | 5.05 |
      | Eye | Gaze | 5.66 |
      | Eye, head pose | Eye-in-head | 4.4 |
      | Eye, head pose | Gaze | 4.4 |

  33. HEAD ROTATION Eye appearance: Zero head yaw, Negative head yaw, Positive head yaw

  34. HEAD ROTATION Gaze distribution (1M synthetic data) [Plot: gaze pitch against gaze yaw, with numeric labels 1 to 7]

  35. HEAD ROTATION Gaze distribution (45k MPII data) [Plots: gaze pitch against gaze yaw, and pose pitch against pose yaw, with numeric labels 1 to 5; axis ticks from -1 to 1]

  36. HEAD ROTATION Head pose separation [Diagram labels: Render for CNN; head pitch, head yaw; cluster 1 … cluster n, each with gaze pitch, gaze yaw]

  37. HEAD ROTATION Results on 1M synthetic data

      | INPUT | CNN | ERROR (°) |
      | --- | --- | --- |
      | Eye | single fc7-8 | 5.66 |
      | Eye | branched fc7-8 | 5.18 |
      | Eye, head pose | single fc7-8 | 4.4 |
      | Eye, head pose | branched fc7-8 | 4.26 |

  38. GAZE CNN ARCHITECTURE Results on 1M synthetic data [Plot: error (°) against number of head pose clusters (1 to 7) for single and branched networks; error axis ticks from 3.5 to 5; inset axes: head pitch, head yaw]

  39. CNN ARCHITECTURE Skip connections [Diagram labels: Render for CNN; head pitch, head yaw; +; cluster 1 … cluster n, each with gaze pitch, gaze yaw]

  40. GAZE CNN ARCHITECTURE Results on 1M synthetic data

      | INPUT | CNN | ERROR (°) |
      | --- | --- | --- |
      | Eye, head pose | single fc7-8 | 4.4 |
      | Eye, head pose | branched fc7-8 | 4.26 |
      | Eye, head pose | branched fc7-8, skip connections | 4.15 |

  41. REAL DATA Columbia [Diagram labels: Render for CNN; head pitch, head yaw; +; cluster 1 … cluster n, each with gaze pitch, gaze yaw]

  42. GAZE ERROR Columbia [Bar chart: error (°) for the series All and No Glasses; methods: Wood et al., 2015, Our CNN, Our CNN with synthetic data; data labels: 7.54, 6.68, 6.26, 5.65, 5.58]

  43. REAL DATA MPII gaze [Diagram labels: Render for CNN; head pitch, head yaw; +; cluster 1 … cluster n, each with gaze pitch, gaze yaw]

  44. GAZE ERROR MPII gaze [Bar chart: error (°); methods: Zhang et al., 2015, Our CNN, Our CNN with synthetic data; data labels: 6.3, 5.85, 5.58]

  45. CONCLUSION

  46. CHALLENGES Unconstrained gaze tracking: RESOLUTION, LIGHTING, SUBJECT VARIABILITY, HEAD ROTATION

  47. GPU TO THE RESCUE Unconstrained gaze tracking: Deep Learning, Computer Graphics
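
The results slides report gaze error in degrees, and the networks output gaze pitch and gaze yaw, but the transcript does not define the error metric. The following is a minimal sketch, assuming the error is the mean angle between predicted and ground-truth 3D gaze directions. The pitch/yaw-to-vector convention and the function names (`gaze_vector`, `angular_error_deg`, `mean_angular_error_deg`) are illustrative assumptions, not taken from the slides.

```python
import math


def gaze_vector(pitch, yaw):
    """Unit 3D gaze direction from pitch and yaw, both in radians.

    Assumed convention (not stated on the slides): a gaze of (0, 0)
    points along -z, positive pitch tilts toward -y, positive yaw toward -x.
    """
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(pred, target):
    """Angle in degrees between two (pitch, yaw) gaze estimates."""
    a = gaze_vector(*pred)
    b = gaze_vector(*target)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # clamp to guard against rounding error
    return math.degrees(math.acos(dot))


def mean_angular_error_deg(preds, targets):
    """Mean angular error in degrees over paired (pitch, yaw) estimates."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)
```

Because the error is an angle between unit vectors, it does not depend on which sign convention is chosen, as long as predictions and ground truth use the same one.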
