EVA²: Exploiting Temporal Redundancy in Live Computer Vision
Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson
International Symposium on Computer Architecture (ISCA), Tuesday June 5, 2018
Convolutional Neural Networks (CNNs)
Embedded Vision Accelerators
FPGA Research: Zhang et al., Suda et al., Qiu et al., Farabet et al., many more…
ASIC Research: ShiDianNao, Eyeriss, EIE, SCNN, many more…
Industry Adoption
Temporal Redundancy
Frames: Frame 0, Frame 1, Frame 2, Frame 3
Input Change: Low, High, Low, Low
Cost to Process: High, High, High, High
Cost to Process (when redundancy is exploited): Low, Low, Low
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Common Structure in CNNs: Image Classification, Object Detection, Semantic Segmentation, Image Captioning
Common Structure in CNNs
[Diagram: Frame 0 and Frame 1 each pass through a CNN prefix (high energy), which produces intermediate activations, and then a CNN suffix (low energy).]
[Diagram: with Frame 0 as the "key frame" and Frame 1 as the "predicted frame", the motion between the two input frames is approximately (≈) the motion between their intermediate activations. The predicted frame can then skip the high-energy CNN prefix and run only the low-energy CNN suffix.]
#MakeRyanGoslingTheNewLenna
Activation Motion Compensation (AMC)
[Diagram, with time running downward and columns for input frame, vision computation, and vision result.
Time t: the key frame passes through the CNN prefix, its activations are stored, and the CNN suffix produces the vision result. This row is annotated ~10^11 MACs.
Time t+k: the predicted frame goes through motion estimation, which yields a motion vector field. Motion compensation applies that field to the stored activations to produce predicted activations, which feed the CNN suffix. This row is annotated ~10^7 adds.]
AMC Design Decisions
• How to perform motion estimation?
• How to perform motion compensation?
• Which frames are key frames?
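Motion Estimation
• We need to estimate the motion of activations by using pixels…
[Diagram: the key frame runs the CNN prefix and CNN suffix; the predicted frame runs motion estimation, motion compensation, and the CNN suffix. Motion estimation is performed on pixels; motion compensation is performed on activations.]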
Pixels to Activations: Receptive Fields
[Diagram: an input image passes through a 3x3 convolution to 64-channel intermediate activations, then through a second 3x3 convolution to another set of 64-channel intermediate activations. Labels: C=3 for the input, C=64 for each activation volume, w=h=8. A 5x5 "receptive field" of input pixels is highlighted.]
• Estimate motion of activations by estimating motion of receptive fields
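The 5x5 figure follows from the usual receptive-field recurrence for stacked convolutions. The sketch below is illustrative only: the function name `receptive_field` and the (kernel, stride) layer encoding are assumptions made for this example, not something shown in the talk.

```python
def receptive_field(layers):
    """Return (size, jump) for a stack of conv layers.

    `layers` is a list of (kernel_size, stride) pairs in input-to-output order.
    `size` is the receptive field width in input pixels; `jump` is the distance
    in input pixels between the receptive fields of adjacent output values.
    """
    size, jump = 1, 1
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each layer widens the field by (k - 1) jumps
        jump *= stride               # strides compound across layers
    return size, jump


# Two 3x3 convolutions with stride 1, as on the slide.
print(receptive_field([(3, 1), (3, 1)]))  # (5, 1)
```

The `jump` value matters for motion estimation because it gives the spacing, in pixels, between neighboring receptive fields.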
Receptive Field Block Motion Estimation (RFBME)
[Animation: receptive-field blocks, numbered 0 to 3, are compared between the key frame and the predicted frame.]
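To make the idea of matching receptive-field blocks concrete, here is a minimal sketch using plain exhaustive block matching with a sum-of-absolute-differences error. It is not the RFBME implementation from the talk, which the slides do not spell out; the function names, the `size`, `stride`, and `radius` parameters, and the grayscale list-of-lists frame format are all assumptions. The vector convention (content at `p - v` in the key frame appears at `p` in the new frame) is chosen so that the vector can later be subtracted to index stored data.

```python
def block_error(key, new, top, left, size, vy, vx):
    """Sum of absolute differences between the block of `new` at (top, left)
    and the block of `key` it would have come from, at (top - vy, left - vx)."""
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(new[y][x] - key[y - vy][x - vx])
    return total


def estimate_motion(key, new, size, stride, radius):
    """Return {(top, left): (error, vy, vx)} for each receptive-field block.

    `size` and `stride` describe the receptive fields in pixels and `radius`
    is the search range; all three are hypothetical parameters of this sketch.
    """
    height, width = len(new), len(new[0])
    field = {}
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            best = None
            for vy in range(-radius, radius + 1):
                for vx in range(-radius, radius + 1):
                    src_top, src_left = top - vy, left - vx
                    if not (0 <= src_top <= height - size
                            and 0 <= src_left <= width - size):
                        continue  # candidate block falls outside the key frame
                    error = block_error(key, new, top, left, size, vy, vx)
                    if best is None or error < best[0]:
                        best = (error, vy, vx)
            field[(top, left)] = best
    return field


# Toy example: the new frame is the key frame shifted one pixel to the right.
key = [[y * 10 + x * x for x in range(8)] for y in range(8)]
new = [[key[y][x - 1] if x > 0 else 0 for x in range(8)] for y in range(8)]
field = estimate_motion(key, new, size=3, stride=2, radius=1)
print(field[(2, 2)])  # (0, 0, 1): zero error with vector vy=0, vx=1
```

The per-block error returned alongside each vector is the kind of quantity a system could compare against a threshold to decide whether the match is trustworthy.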
Motion Compensation
[Diagram: stored activations (C=64) are shifted by the vector X = 2.5, Y = 2.5 to form predicted activations (C=64).]
• Subtract the vector to index into the stored activations
• Interpolate when necessary
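The two bullets above can be sketched as follows, assuming bilinear interpolation for fractional vectors and clamping at the borders. Both choices are assumptions made for illustration, as are the names `sample_bilinear` and `compensate` and the nested-list `[channel][y][x]` layout; the slide only says to subtract the vector and interpolate when necessary.

```python
import math


def sample_bilinear(channel, y, x):
    """Bilinearly interpolate a 2D activation map at fractional (y, x).

    Coordinates are clamped to the map, which is an assumption of this sketch.
    """
    height, width = len(channel), len(channel[0])
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    fy, fx = y - y0, x - x0
    top = channel[y0][x0] * (1 - fx) + channel[y0][x1] * fx
    bottom = channel[y1][x0] * (1 - fx) + channel[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def compensate(stored, vector_field):
    """Predict activations by reading `stored` at (position - vector).

    stored:       [channel][y][x] activations saved from the key frame
    vector_field: [y][x] -> (vy, vx), in activation-grid units
    The same vector is applied to every channel.
    """
    return [
        [
            [sample_bilinear(channel, y - vy, x - vx)
             for x, (vy, vx) in enumerate(row)]
            for y, row in enumerate(vector_field)
        ]
        for channel in stored
    ]


# One 4x4 channel holding a linear ramp, and the slide's vector of 2.5 in X and Y.
stored = [[[float(y * 4 + x) for x in range(4)] for y in range(4)]]
vectors = [[(2.5, 2.5)] * 4 for _ in range(4)]
predicted = compensate(stored, vectors)
print(predicted[0][3][3])  # 2.5, read from stored position (0.5, 0.5)
```

With an integer vector the interpolation weights collapse to 0 and 1, so the operation reduces to a plain shifted copy.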
When to Compute Key Frame?
• System needs a new key frame when motion estimation fails:
  • De-occlusion
  • New objects
  • Rotation/scaling
  • Lighting changes
• So, compute key frame when RFBME error exceeds set threshold
[Flowchart: the input frame and the key frame feed motion estimation. If error > threshold (yes), run the CNN prefix; if not (no), run motion compensation. Either path feeds the CNN suffix, which produces the vision result.]
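The flowchart can be read as the control loop below. This is a sketch only: `run_amc` and its callable parameters (`estimate_motion`, `cnn_prefix`, `cnn_suffix`, `compensate`) are hypothetical placeholders for whatever the real system provides, and treating the first frame as a key frame is an assumption.

```python
def run_amc(frames, threshold, estimate_motion, cnn_prefix, cnn_suffix, compensate):
    """Yield (kind, result) per frame, where kind is "key" or "predicted".

    estimate_motion(key_frame, frame) -> (error, vectors)
    cnn_prefix(frame)                 -> activations
    compensate(stored, vectors)       -> predicted activations
    cnn_suffix(activations)           -> vision result
    """
    key_frame = None
    stored = None
    for frame in frames:
        needs_key = key_frame is None  # assumption: the first frame is always a key frame
        if not needs_key:
            error, vectors = estimate_motion(key_frame, frame)
            needs_key = error > threshold
        if needs_key:
            key_frame = frame
            stored = cnn_prefix(frame)       # expensive path, refreshes stored activations
            activations = stored
        else:
            activations = compensate(stored, vectors)  # cheap path
        yield ("key" if needs_key else "predicted"), cnn_suffix(activations)


# Toy stand-ins: frames are integers and "motion" is their difference.
results = list(run_amc(
    frames=[0, 1, 2, 10, 11],
    threshold=2,
    estimate_motion=lambda key, frame: (abs(frame - key), frame - key),
    cnn_prefix=lambda frame: frame * 10,
    cnn_suffix=lambda activations: activations,
    compensate=lambda stored, vectors: stored + vectors * 10,
))
print([kind for kind, _ in results])
# ['key', 'predicted', 'predicted', 'key', 'predicted']
```

Raising `threshold` trades accuracy for fewer key frames, and lowering it does the opposite.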
Embedded Vision Accelerator
[Diagram: a global buffer serves Eyeriss (Conv), which runs the CNN prefix, and EIE (Full Connect), which runs the CNN suffix.]
Embedded Vision Accelerator Accelerator (EVA²)
[Diagram: EVA² sits alongside Eyeriss and EIE on the global buffer and performs motion estimation and motion compensation.]
Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze, “Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks,”
S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, “EIE: Efficient inference engine on compressed deep neural network,”
Embedded Vision Accelerator Accelerator (EVA²)
Frame 0: Key frame
Frame 1: Predicted frame (motion estimation, then motion compensation)
• EVA² leverages sparse techniques to save 80-87% storage and computation
Evaluation Details
Train/Validation Datasets: YouTube Bounding Box (Object Detection & Classification)
Evaluated Networks: AlexNet, Faster R-CNN with VGGM and VGG16
Hardware Baseline: Eyeriss & EIE performance scaled from papers
EVA² Implementation: Written in RTL, synthesized with 65nm TSMC
EVA² Area Overhead
Total 65nm area: 74 mm²
EVA² takes up only 3.3%
EVA² Energy Savings
[Bar chart: normalized energy (0 to 1) for AlexNet, Faster16, and FasterM, with each bar split into Eyeriss, EIE, and EVA² contributions. The bars are built up in three steps: the original network (orig), predicted frames (pred), and the average (avg). The matching part of the flowchart is shown beside each step: input frame, CNN prefix, CNN suffix, vision result for orig; input frame and key frame, motion estimation, motion compensation, CNN suffix, vision result for pred; the full flowchart with the "Error > Thresh?" decision for avg.]
High Level EVA² Results
Network              Vision Task     Keyframe %  Accuracy Degradation  Average Latency Savings  Average Energy Savings
AlexNet              Classification  11%         0.8% top-1            86.9%                    87.5%
Faster R-CNN VGG16   Detection       36%         0.7% mAP              61.7%                    61.9%
Faster R-CNN VGGM    Detection       37%         0.6% mAP              54.1%                    54.7%
• EVA² enables 54-87% savings while incurring <1% accuracy degradation
• Adaptive key frame choice metric can be adjusted
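The link between the key frame percentage and the average savings can be illustrated with a simple weighted average. This is a back-of-the-envelope sketch, not the talk's evaluation method: the function `average_cost` is hypothetical, it assumes a key frame costs exactly the normalized baseline, and the predicted-frame cost of 0.05 is a made-up placeholder rather than a measured value.

```python
def average_cost(key_fraction, key_cost, predicted_cost):
    """Expected per-frame cost when `key_fraction` of frames are key frames."""
    return key_fraction * key_cost + (1.0 - key_fraction) * predicted_cost


# 11% key frames (the AlexNet row), baseline-normalized key frame cost of 1.0,
# and a HYPOTHETICAL predicted-frame cost of 0.05.
cost = average_cost(key_fraction=0.11, key_cost=1.0, predicted_cost=0.05)
print(f"average normalized cost: {cost:.4f}, savings: {1.0 - cost:.1%}")
```

Under this model the average cost can never drop below the key frame fraction, which is one way to see why the networks with 36-37% key frames show smaller savings than AlexNet at 11%.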
Conclusion
• Temporal redundancy is an entirely new dimension for optimization
• AMC & EVA² improve efficiency and are highly general
• Applicable to many different…
  • CNN applications (classification, detection, segmentation, etc.)
  • Hardware architectures (CPU, GPU, ASIC, etc.)
  • Motion estimation/compensation algorithms