Institute of Software, Chinese Academy of Sciences
RVTensor: A light-weight neural network inference framework based on the RISC-V architecture
Pengpeng Hou, Jiageng Yu, Yuxia Miao, Yang Tai, Yanjun Wu, Chen Zhao
*Corresponding author: Jiageng Yu jiageng08@iscas.ac.cn
Introduction
§ The RISC-V ISA is developing rapidly
  v Open-source ISA
§ RISC-V is suitable for IoT scenarios
  v Basic instruction set + extended instruction sets
  v IoT scenarios are fragmented
[Figure: Basic instruction set with extensions extended1, extended2, extended3, extended4]
Introduction
§ Popular inference frameworks
  v For servers: TensorFlow, MXNet, Caffe
  v For smartphones: TensorFlow Lite, NCNN, MNN
Introduction
§ Few inference systems exist for RISC-V + IoT
  v Architectural limitations
    F SIMD feature
  v IoT hardware resource limitations
    F Chip performance is weak
    F Memory capacity is small
Security surveillance camera price statistics
Price       90~150   150~775   > 775
User rate   34%      37%       29%
Introduction
§ RVTensor: RISC-V Tensor
  v An inference system for RISC-V + IoT scenarios
  v Few third-party library dependencies
    F Only libhd5.so
  v Low hardware resource requirements
  v Based on the SERVE.r platform
Overview of RVTensor architecture
§ RVTensor Platform Overview
  v Four modules
    F Model analysis
    F Op operators
    F Calculation graph construction
    F Calculation graph execution
Overview of RVTensor architecture
§ RVTensor Platform Overview
  v Model analysis
    F Parses model files such as .pb and extracts information such as the operators used and the weight data.
Overview of RVTensor architecture
§ RVTensor Platform Overview
  v Op operators
    F Contains the implementation of each operator, including conv, add, active, pooling, fc and other operations.
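To make the operator list concrete, here is a minimal reference sketch of conv, add, an activation (ReLU), pooling and fc on plain Python lists. All function names and signatures are hypothetical and chosen only for illustration; they are not RVTensor's API, and a real implementation would work on multi-channel tensors with optimized loops.

```python
# Illustrative reference operators on plain Python lists.
# Every name here is hypothetical; none of this is RVTensor's actual code.

def relu(x):
    """Activation: replace negative values with zero."""
    return [max(0.0, v) for v in x]

def add(a, b):
    """Element-wise addition of two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("shape mismatch")
    return [u + v for u, v in zip(a, b)]

def conv2d(image, kernel):
    """Single-channel 2D convolution, no padding, stride 1
    (cross-correlation, as in most neural network frameworks)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            out[i][j] = acc
    return out

def max_pool2x2(image):
    """2x2 max pooling with stride 2 (height and width assumed even)."""
    return [
        [max(image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1])
         for j in range(0, len(image[0]), 2)]
        for i in range(0, len(image), 2)
    ]

def fc(x, weights, bias):
    """Fully connected layer: y[k] = sum_i weights[k][i] * x[i] + bias[k]."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]
```

For example, `conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])` sums each element with its lower-right neighbour.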
Overview of RVTensor architecture
§ RVTensor Platform Overview
  v Calculation graph construction
    F Builds a calculation graph from the output of the model analysis module and the operators in the op operator module.
Overview of RVTensor architecture
§ RVTensor Platform Overview
  v Calculation graph execution
    F Produces the inference result from the input data (such as image data) and the calculation graph.
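The following sketch shows one way graph construction and graph execution could fit together. It assumes that model analysis yields a list of layer descriptions (name, op, inputs); that record layout, the `OPS` table and both function names are hypothetical and are not taken from RVTensor.

```python
# Hypothetical sketch: build a calculation graph from parsed layer
# descriptions, then execute it on input data. Not RVTensor's actual code.

# Operator table (stand-in for the op operator module).
OPS = {
    "relu": lambda x: [max(0.0, v) for v in x],
    "add": lambda a, b: [u + v for u, v in zip(a, b)],
}

def build_graph(layers):
    """Order the layers so that every node comes after its inputs."""
    by_name = {layer["name"]: layer for layer in layers}
    ordered, visiting, done = [], set(), set()

    def visit(name):
        if name in done or name not in by_name:  # graph inputs are not nodes
            return
        if name in visiting:
            raise ValueError("cycle in calculation graph")
        visiting.add(name)
        for dep in by_name[name]["inputs"]:
            visit(dep)
        visiting.discard(name)
        done.add(name)
        ordered.append(by_name[name])

    for layer in layers:
        visit(layer["name"])
    return ordered

def execute_graph(graph, feeds, output):
    """Run the nodes in order and return the value of the named output."""
    values = dict(feeds)
    for node in graph:
        args = [values[name] for name in node["inputs"]]
        values[node["name"]] = OPS[node["op"]](*args)
    return values[output]

# A small residual-style model, deliberately listed out of order.
layers = [
    {"name": "out", "op": "relu", "inputs": ["sum"]},
    {"name": "sum", "op": "add", "inputs": ["image", "act"]},
    {"name": "act", "op": "relu", "inputs": ["image"]},
]
graph = build_graph(layers)
result = execute_graph(graph, {"image": [-1.0, 2.0]}, "out")
```

Separating the two steps means the ordering work is done once per model, while execution can be repeated for each input image.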
Optimization
§ Reducing dependencies on third-party libraries
  v Multi-thread library: Pthread
    F Provides many APIs
    F RVTensor uses only a few of them
Optimization
§ Improving memory utilization
  v Memory reuse: all ops share one global memory block while running
    F Global memory block = MAX{op's memory requirement}
    F A branch phase is treated as an atomic operation
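The sizing rule can be illustrated with a short sketch. The byte counts, the `GlobalBlock` class and the helper names are hypothetical. Treating a branch as atomic is modelled here by summing the requirements of the buffers inside the branch; that is one possible reading of the slide, not a confirmed description of RVTensor.

```python
# Hypothetical sketch of memory reuse through one shared global block.
# Sizes and names are illustrative assumptions, not RVTensor's values.

def branch_requirement(parts):
    """Assumption: a branch counts as one op whose buffers are all live
    at the same time, so its requirement is the sum of its parts."""
    return sum(parts)

def plan_global_block(requirements):
    """Global memory block = MAX{op's memory requirement} (in bytes)."""
    return max(requirements.values())

class GlobalBlock:
    """One buffer allocated up front and handed to each op in turn."""

    def __init__(self, size):
        self.buffer = bytearray(size)

    def view(self, nbytes):
        """Return a writable view of the first nbytes of the shared buffer."""
        if nbytes > len(self.buffer):
            raise MemoryError("op needs more than the global block")
        return memoryview(self.buffer)[:nbytes]

requirements = {
    "conv1": 4096,
    "pool1": 2048,
    "residual_branch": branch_requirement([3072, 3072]),
    "fc": 512,
}
block_size = plan_global_block(requirements)   # size of the shared block
naive_size = sum(requirements.values())        # one private buffer per op

block = GlobalBlock(block_size)
conv_mem = block.view(requirements["conv1"])
conv_mem[0] = 7                                # conv1 writes to the block
fc_mem = block.view(requirements["fc"])        # fc later reuses the same bytes
```

With these illustrative numbers the shared block needs 6144 bytes, against 12800 bytes if every op kept a private buffer.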
Evaluation
§ Platform: SERVE.r
§ Neural network: ResNet20
§ Dataset: CIFAR-10
Evaluation
§ Accuracy
  v RVTensor and Keras produce the same results (Keras runs on an x86 platform)
§ Performance
  v The average time to process one image is 13.51 seconds
§ Executable file size
  v The RVTensor executable is 193 KB
Future work
§ Memory optimization
  v Because memory is limited, data has to be swapped in and out of memory
§ Sparse convolution
  v The ReLU op produces many zeros in the data, which makes convolution inefficient
§ Model pruning
  v Compress the model parameters through pruning techniques to make the model better suited to IoT scenarios
§ V instruction set adaptation
  v Re-implement the op operators on the V instruction set to improve efficiency
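To illustrate why zeros produced by ReLU matter for convolution, the sketch below compares a dense 1D convolution with a variant that skips zero inputs. It is only an illustration of the general idea of sparse convolution, not the method planned for RVTensor, and all names and values are hypothetical.

```python
# Hypothetical sketch: zero-skipping 1D convolution after ReLU.
# Illustrates the general idea only; not RVTensor's planned implementation.

def relu(x):
    return [max(0.0, v) for v in x]

def conv1d_dense(x, k):
    """No padding, stride 1: out[o] = sum_j x[o + j] * k[j]."""
    n_out = len(x) - len(k) + 1
    out, mults = [0.0] * n_out, 0
    for o in range(n_out):
        for j in range(len(k)):
            out[o] += x[o + j] * k[j]
            mults += 1
    return out, mults

def conv1d_skip_zeros(x, k):
    """Same result, but each zero input is skipped entirely."""
    n_out = len(x) - len(k) + 1
    out, mults = [0.0] * n_out, 0
    for i, v in enumerate(x):
        if v == 0.0:
            continue                     # a zero contributes nothing
        for j in range(len(k)):
            o = i - j                    # output position fed by x[i] * k[j]
            if 0 <= o < n_out:
                out[o] += v * k[j]
                mults += 1
    return out, mults

activations = relu([-1.0, 2.0, -3.0, 0.5, -2.0, 4.0])  # half become zero
kernel = [1.0, -1.0, 2.0]
dense_out, dense_mults = conv1d_dense(activations, kernel)
sparse_out, sparse_mults = conv1d_skip_zeros(activations, kernel)
```

Both variants return the same output here, while the zero-skipping one performs half as many multiplications on this input.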
Thanks!
*Corresponding author: Jiageng Yu jiageng08@iscas.ac.cn