EVA²: Exploiting Temporal Redundancy in Live Computer Vision
Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson — PowerPoint PPT presentation


  1. EVA²: Exploiting Temporal Redundancy in Live Computer Vision Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson International Symposium on Computer Architecture (ISCA) Tuesday June 5, 2018

  2. Convolutional Neural Networks (CNNs) 2


  4. Embedded Vision Accelerators — FPGA research: Zhang et al., Suda et al., Qiu et al., Farabet et al., many more. ASIC research: ShiDianNao, Eyeriss, EIE, SCNN, many more. Industry adoption. 4

  5. Temporal Redundancy — Input change per frame: Frame 0 High; Frames 1–3 Low. 5

  6. Temporal Redundancy — Input change: Frame 0 High, Frames 1–3 Low; yet the cost to process every frame is High. 6

  7. Temporal Redundancy — Goal: keep the cost High only for Frame 0 and make it Low for Frames 1–3, matching their low input change. 7

  8. Talk Overview Background Algorithm Hardware Evaluation Conclusion 8


  10. Common Structure in CNNs Image Classification Object Detection Semantic Segmentation Image Captioning 10

  11. Common Structure in CNNs — Each frame runs a CNN prefix (high energy), producing intermediate activations, followed by a CNN suffix (low energy); Frame 0 and Frame 1 each pay the full cost. #MakeRyanGoslingTheNewLenna 11

  12. Common Structure in CNNs — Frame 0 becomes a "key frame" and Frame 1 a "predicted frame": the predicted frame's intermediate activations are approximated (≈) from the key frame's activations via motion. #MakeRyanGoslingTheNewLenna 12

  13. Common Structure in CNNs — With motion-derived activations, the predicted frame skips the high-energy prefix and runs only the low-energy suffix. #MakeRyanGoslingTheNewLenna 13
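The prefix/suffix split above can be sketched schematically (a hypothetical illustration; `layers` stands in for any sequential CNN, not the talk's actual models):

```python
def run(layers, x):
    """Apply a sequence of layers to an input."""
    for layer in layers:
        x = layer(x)
    return x

def split_cnn(layers, split_at):
    """Split a sequential model into a high-energy prefix and a
    low-energy suffix at a chosen layer index."""
    prefix = lambda x: run(layers[:split_at], x)
    suffix = lambda acts: run(layers[split_at:], acts)
    return prefix, suffix

# Toy stand-in layers; a real CNN would have conv/pool layers here.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
prefix, suffix = split_cnn(layers, 2)
acts = prefix(5)  # intermediate activations (the quantity AMC reuses)
assert suffix(acts) == run(layers, 5)
```

Reusing `acts` across frames is the whole point: the predicted frame only needs the suffix.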

  14. Talk Overview Background Algorithm Hardware Evaluation Conclusion 14

  15. Activation Motion Compensation (AMC) — Timeline: at time t, the key frame runs the full CNN prefix and suffix, and the intermediate activations are stored. At time t+k, the predicted frame runs motion estimation (producing a motion vector field) and motion compensation (producing predicted activations), then only the CNN suffix. 15

  16. Activation Motion Compensation (AMC) — Same pipeline, annotated with costs: the CNN prefix costs ~10¹¹ MACs, while motion compensation costs ~10⁷ adds. 16
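The slide's order-of-magnitude counts imply roughly a 10,000× gap between computing the prefix and compensating its stored output (a back-of-envelope check using the slide's rough figures, not measured numbers):

```python
prefix_macs = 1e11  # ~10^11 multiply-accumulates for the CNN prefix (key frame)
mc_adds = 1e7       # ~10^7 adds for motion compensation (predicted frame)
ratio = prefix_macs / mc_adds
print(f"~{ratio:.0e}x fewer arithmetic ops on predicted frames")
```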

  17. AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames? 17


  22. Motion Estimation — We need to estimate the motion of activations by using pixels: motion estimation is performed on pixels, while motion compensation is performed on activations (between the CNN prefix and suffix). 22

  23. Pixels to Activations — Input image → 3x3 conv (64) → intermediate activations → 3x3 conv (64) → intermediate activations. 23

  24. Pixels to Activations: Receptive Fields — Input image (C=3) → 3x3 conv → intermediate activations (C=64, w=h=8) → 3x3 conv → intermediate activations (C=64). 24

  25. Pixels to Activations: Receptive Fields — Each activation depends on a 5x5 "receptive field" of input pixels. • Estimate motion of activations by estimating motion of receptive fields 25
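The 5x5 figure follows from stacking two 3x3 convolutions; a small helper (a generic formula, not from the talk) reproduces it:

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input pixels) of stacked conv layers.
    The field grows by (k - 1) * jump per layer, where jump is the
    cumulative stride seen so far."""
    strides = strides or [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

print(receptive_field([3, 3]))  # two stacked 3x3 convs -> 5
```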

  26. Receptive Field Block Motion Estimation (RFBME) — The key frame and the predicted frame are divided into receptive-field-sized blocks. 26

  27. Receptive Field Block Motion Estimation (RFBME) — Blocks are indexed 0–3 along each axis in both the key frame and the predicted frame. 27

  28. Receptive Field Block Motion Estimation (RFBME) — Each predicted-frame block is matched against candidate positions in the key frame. 28

  29. AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames? 29

  30. Motion Compensation — Example: a vector X = 2.5, Y = 2.5 maps stored activations (C=64) to predicted activations (C=64). • Subtract the vector to index into the stored activations • Interpolate when necessary 30
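The subtract-and-interpolate step can be sketched for a single channel (an illustrative implementation with edge clamping; the hardware applies this across all channels):

```python
import numpy as np

def compensate(stored, vx, vy):
    """Predict activations by shifting stored activations by a motion
    vector (vx, vy), bilinearly interpolating at fractional offsets.
    Out-of-range indices are clamped to the border."""
    H, W = stored.shape
    ys = np.clip(np.arange(H) - vy, 0, H - 1)  # subtract the vector...
    xs = np.clip(np.arange(W) - vx, 0, W - 1)  # ...to index stored activations
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = (1 - wx) * stored[y0][:, x0] + wx * stored[y0][:, x1]
    bot = (1 - wx) * stored[y1][:, x0] + wx * stored[y1][:, x1]
    return (1 - wy) * top + wy * bot
```

An integer vector reduces to a plain shift; a fractional one (like X = 2.5) blends the four nearest stored activations.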

  31. AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames? 31

  32. When to Compute Key Frame? • System needs a new key frame when motion estimation fails: • De-occlusion • New objects • Rotation/scaling • Lighting changes 32

  33. When to Compute Key Frame? • System needs a new key frame when motion estimation fails: de-occlusion, new objects, rotation/scaling, lighting changes • So, compute a key frame when RFBME error exceeds a set threshold. Flowchart: input frame → motion estimation → (error > threshold? yes: CNN prefix as key frame; no: motion compensation) → CNN suffix → vision result 33
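The decision logic on this slide amounts to a small control loop (a schematic sketch; `prefix`, `suffix`, `estimate`, and `compensate` are hypothetical stand-ins for the real components):

```python
def process_frame(frame, state, threshold, prefix, suffix, estimate, compensate):
    """Run AMC for one frame: fall back to a full key-frame computation
    only when motion-estimation error exceeds the threshold."""
    if state["key"] is not None:
        vectors, error = estimate(state["key"], frame)
        if error <= threshold:
            # Predicted frame: warp the stored activations, run only the suffix.
            return suffix(compensate(state["acts"], vectors))
    # Key frame: run the full CNN and cache the frame and its activations.
    state["key"] = frame
    state["acts"] = prefix(frame)
    return suffix(state["acts"])

# Toy stand-ins: numbers play the role of frames and activations.
state = {"key": None, "acts": None}
prefix = lambda f: f * 2
suffix = lambda a: a + 1
estimate = lambda key, f: (None, abs(f - key))  # "error" = frame difference
compensate = lambda acts, vectors: acts
process_frame(10, state, 3, prefix, suffix, estimate, compensate)  # key frame
process_frame(11, state, 3, prefix, suffix, estimate, compensate)  # predicted
process_frame(20, state, 3, prefix, suffix, estimate, compensate)  # new key frame
```

Raising the threshold trades accuracy for energy by making predicted frames more common, which is the adjustable knob mentioned in the results.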

  34. Talk Overview Background Algorithm Hardware Evaluation Conclusion 34

  35. Embedded Vision Accelerator — A global buffer feeds Eyeriss (conv; runs the CNN prefix) and EIE (fully connected; runs the CNN suffix). Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks"; S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, "EIE: Efficient inference engine on compressed deep neural network." 35

  36. Embedded Vision Accelerator Accelerator (EVA²) — EVA² sits alongside Eyeriss (conv) and EIE (fully connected) on the global buffer, adding motion estimation and motion compensation units between the CNN prefix and suffix. 36

  37. Embedded Vision Accelerator Accelerator (EVA²) Frame 0 37

  38. Embedded Vision Accelerator Accelerator (EVA²) Frame 0: Key frame 38

  39. Embedded Vision Accelerator Accelerator (EVA²) Frame 1 Motion Estimation 39

  40. Embedded Vision Accelerator Accelerator (EVA²) Frame 1: Predicted frame Motion Estimation Motion Compensation • EVA² leverages sparse techniques to save 80–87% storage and computation 40

  41. Talk Overview Background Algorithm Hardware Evaluation Conclusion 41

  42. Evaluation Details • Train/validation datasets: YouTube Bounding Box (object detection & classification) • Evaluated networks: AlexNet; Faster R-CNN with VGGM and VGG16 • Hardware baseline: Eyeriss & EIE performance scaled from papers • EVA² implementation: written in RTL, synthesized with 65nm TSMC 42

  43. EVA² Area Overhead — Total 65nm area: 74mm²; EVA² takes up only 3.3% 43

  44. EVA² Energy Savings — Bar chart: normalized energy on Eyeriss + EIE for the original ("orig") networks AlexNet, Faster16, and FasterM, each running input frame → CNN prefix → CNN suffix → vision result. 44

  45. EVA² Energy Savings — The chart adds predicted-frame ("pred") bars: motion estimation, motion compensation, and the CNN suffix only. 45

  46. EVA² Energy Savings — The chart adds average ("avg") bars combining key and predicted frames under the adaptive error-threshold key-frame policy. 46

  47. High Level EVA² Results
  Network | Vision Task | Keyframe % | Accuracy Degradation | Average Latency Savings | Average Energy Savings
  AlexNet | Classification | 11% | 0.8% top-1 | 86.9% | 87.5%
  Faster R-CNN VGG16 | Detection | 36% | 0.7% mAP | 61.7% | 61.9%
  Faster R-CNN VGGM | Detection | 37% | 0.6% mAP | 54.1% | 54.7%
  • EVA² enables 54–87% savings while incurring <1% accuracy degradation • Adaptive key frame choice metric can be adjusted 47

  48. Talk Overview Background Algorithm Hardware Evaluation Conclusion 48

  49. Conclusion • Temporal redundancy is an entirely new dimension for optimization • AMC & EVA² improve efficiency and are highly general • Applicable to many different… • CNN applications (classification, detection, segmentation, etc.) • Hardware architectures (CPU, GPU, ASIC, etc.) • Motion estimation/compensation algorithms 49

  50. EVA²: Exploiting Temporal Redundancy in Live Computer Vision Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson International Symposium on Computer Architecture (ISCA) Tuesday June 5, 2018

  51. Backup Slides 51
