
Deep Learning on Mobile Phones: A Practitioner's Guide
Anirudh Koul, Siddha Ganju, Meher Kasam


1. Common Questions
"Do I need to ship a new app update with every model improvement?"
• Shipping an app update is a decent amount of overhead, plus ~2 days of review wait time
• Solution: Check for model updates, then download and compile the model on device
• Easier solution: Use a framework for model management, e.g.:
  • Google ML Kit
  • Fritz
  • Numericcal

2. Common Questions
"Why does my app not recognize objects at the top/bottom of the screen?"
• Solution: Check the cropping used; by default, it's a center crop ☺
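For intuition, here is a minimal Pillow sketch (illustrative only, not any framework's actual preprocessing code) of how an aspect-fill center crop discards the top and bottom of a portrait camera frame:

import PIL.Image

def center_crop_square(image):
    # Default preprocessing in many vision pipelines: crop the largest
    # centered square, then resize it to the model's input resolution.
    width, height = image.size
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return image.crop((left, top, left + side, top + side))

frame = PIL.Image.new("RGB", (1080, 1920))  # portrait phone frame
square = center_crop_square(frame)          # top and bottom 420 px never reach the model
print(square.size)                          # (1080, 1080)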

3. Building a DL App in 1 Week

4. Learning to play the accordion: 3 months

5. Already know the piano? Then learning the accordion takes 1 week, not 3 months: fine-tune existing skills.

6. I Got a Dataset, Now What?
Step 1: Find a pre-trained model
Step 2: Fine-tune the pre-trained model
Step 3: Run it using existing frameworks
"Don't be a hero" - Andrej Karpathy
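A minimal Keras sketch of steps 1 and 2, assuming an ImageNet-pretrained MobileNet and a hypothetical 5-class target task; swap in your own data and class count:

import tensorflow as tf

# Step 1: start from a pre-trained model (ImageNet weights, classifier head removed).
base = tf.keras.applications.MobileNet(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze the pre-trained features

# Step 2: fine-tune by training a small new head on your dataset
# (optionally unfreeze the top of `base` afterwards for deeper fine-tuning).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 = number of classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)    # your data goes here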

7. How to find pretrained models for my task?
Model Zoo https://modelzoo.co - 300+ models
Papers with Code https://paperswithcode.com/sota

8. AlexNet, 2012 (simplified): an n-dimensional feature representation [Krizhevsky, Sutskever, Hinton '12]
Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Ng, "Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks"

9. Deciding how to fine-tune

Size of New Dataset | Similarity to Original Dataset | What to do
Large               | High                           | Fine-tune.
Small               | High                           | Don't fine-tune; it will overfit. Train a linear classifier on CNN features.
Small               | Low                            | Train a classifier from activations in lower layers; higher layers are specific to the original dataset.
Large               | Low                            | Train the CNN from scratch.

http://blog.revolutionanalytics.com/2016/08/deep-learning-part-2.html


13. Could you train your own classifier... without coding?
• Microsoft CustomVision.ai
  • Unique: training in under a minute, custom object detection (100x speedup)
• Google AutoML
  • Unique: full CNN training, crowdsourced workers
• IBM Watson Visual Recognition
• Baidu EZDL
  • Unique: custom sound recognition

14. Custom Vision Service (customvision.ai): drag-and-drop training
Tip: Upload 30 photos per class to make a prototype model; upload 200 photos per class for a more robust production model.
The more distinct the shape/type of the object, the fewer images required.

15. Custom Vision Service (customvision.ai): drag-and-drop training
Tip: Use the Fatkun browser extension to download images from a search engine, or use the Bing Image Search API to programmatically download photos with the proper rights.
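A hedged sketch of the programmatic route against the documented Bing Image Search v7 endpoint; the key, query, and license filter are placeholders to adapt to your own account and rights requirements:

import requests

SUBSCRIPTION_KEY = "YOUR_BING_KEY"  # placeholder
ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"

params = {"q": "golden retriever", "license": "Public", "count": 50}
headers = {"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY}
results = requests.get(ENDPOINT, headers=headers, params=params).json()

# Download each hit into a local training folder.
for i, image in enumerate(results.get("value", [])):
    data = requests.get(image["contentUrl"], timeout=10).content
    with open(f"train_{i:04d}.jpg", "wb") as f:
        f.write(data)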

16. CoreML exporter from customvision.ai
A 5-minute shortcut to training, fine-tuning, and getting a model ready in CoreML format, all through a drag-and-drop interface.

17. Building a Crowdsourced Data Collector in 1 Month

18. Barcode recognition from Seeing AI
Aim: Help blind users identify products using barcodes
Issue: Blind users don't know where the barcode is
Live (on device): Guide the user in finding a barcode with audio cues
With server: Decode the barcode to identify the product
Tech: MPSCNN running on the mobile GPU + a barcode library
Metrics: 40 FPS (~25 ms) on iPhone 7

19. Currency recognition from Seeing AI
Aim: Identify currency
Live (on device): Identify the denomination of paper currency instantly
With server: -
Tech: Task-specific CNN running on the mobile GPU
Metrics: 40 FPS (~25 ms) on iPhone 7

20. Training Data Collection App
Request volunteers to take photos of objects in non-obvious settings.
Send the photos to the cloud and train the model nightly.
A newsletter shows the best photos from volunteers; let them compete for fame.

21. Daily challenge: collected by volunteers

22. Daily challenge: collected by volunteers

23. Building a Production DL App in 3 Months

24. What you want vs. what you can afford: $200,000 vs. $2,000
https://www.flickr.com/photos/kenjonbro/9075514760/ and http://www.newcars.com/land-rover/range-rover-sport/2016

25. Revolution of Depth
AlexNet, 8 layers (ILSVRC 2012). [Figure: AlexNet layer diagram, from the 11x11 conv input stage to the fc-1000 output.]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, "Deep Residual Learning for Image Recognition", 2015

26. Revolution of Depth
AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); GoogLeNet, 22 layers (ILSVRC 2014). [Figure: side-by-side layer diagrams of the three architectures.]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, "Deep Residual Learning for Image Recognition", 2015

27. Revolution of Depth
AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); ResNet, 152 layers (ILSVRC 2015), "ultra deep". [Figure: side-by-side layer diagrams; the ResNet column dwarfs the others.]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, "Deep Residual Learning for Image Recognition", 2015

28. Revolution of Depth
ResNet, 152 layers. [Figure: the opening of the ResNet-152 diagram: a 7x7 conv followed by stacked 1x1-3x3-1x1 bottleneck blocks.]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, "Deep Residual Learning for Image Recognition", 2015

29. Revolution of Depth vs. Classification Accuracy
ImageNet classification top-5 error (%):
ILSVRC'10: 28.2 (shallow)
ILSVRC'11: 25.8 (shallow)
ILSVRC'12: 16.4 (AlexNet, 8 layers)
ILSVRC'13: 11.7 (8 layers)
ILSVRC'14: 7.3 (VGG, 19 layers)
ILSVRC'14: 6.7 (GoogLeNet, 22 layers)
ILSVRC'15: 3.6 (ResNet, 152 layers)
ILSVRC'16: 2.9 (ensemble of ResNet, Inception, and Wide Residual Network)
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun, "Deep Residual Learning for Image Recognition", 2015

30. Accuracy vs. Operations per Image Inference
[Figure: models plotted by accuracy against operations per inference; blob size is proportional to the number of parameters, from 240 MB up to 552 MB. The top-left region, high accuracy at low compute, is what we want.]
Alfredo Canziani, Adam Paszke, Eugenio Culurciello, "An Analysis of Deep Neural Network Models for Practical Applications", 2016

31. Your Budget: Smartphone Floating-Point Operations per Second (2015)
http://pages.experts-exchange.com/processing-power-compared/

32. iPhone X is more powerful than a MacBook Pro
https://thenextweb.com/apple/2017/09/12/apples-new-iphone-x-already-destroying-android-devices-g/

33. Strategies to get maximum efficiency from your CNN
Before training:
• Pick an efficient architecture for your task
• Design efficient layers
After training:
• Pruning
• Quantization
• Network binarization

34. CoreML Benchmark: pick a DNN for your mobile architecture
Execution times are in ms per image.

Model        | Top-1 Accuracy | Size (MB) | Million Mult-Adds | iPhone 5S (2013) | iPhone 6 (2014) | iPhone 6S/SE (2015) | iPhone 7 (2016) | iPhone 8/X (2017)
VGG 16       | 71             | 553       | 15300             | 7408             | 4556            | 235                 | 181             | 146
Inception v3 | 78             | 95        | 5000              | 727              | 637             | 114                 | 90              | 78
ResNet 50    | 75             | 103       | 3900              | 538              | 557             | 77                  | 74              | 71
MobileNet    | 71             | 17        | 569               | 129              | 109             | 44                  | 35              | 33
SqueezeNet   | 57             | 5         | 800               | 75               | 78              | 36                  | 30              | 29

Huge improvement in GPU hardware in 2015.

35. MobileNet family
Splits each convolution into a 3x3 depthwise convolution and a 1x1 pointwise convolution.
Tune with two parameters: the width multiplier and the resolution multiplier.
Andrew G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications", 2017
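A small Keras sketch of the factorization; the 256-filter width and padding are arbitrary choices for illustration:

import tensorflow as tf

# A standard 3x3 convolution...
standard = tf.keras.layers.Conv2D(256, (3, 3), padding="same")

# ...vs. its depthwise-separable factorization (the MobileNet building block):
# roughly 8-9x fewer multiply-adds for a 3x3 kernel.
separable = tf.keras.Sequential([
    tf.keras.layers.DepthwiseConv2D((3, 3), padding="same"),  # one 3x3 filter per channel
    tf.keras.layers.Conv2D(256, (1, 1)),                      # 1x1 pointwise channel mixing
])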

36. Efficient Classification Architectures
MobileNetV2 is the current favorite.
https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html

37. Efficient Detection Architectures
Jonathan Huang et al., "Speed/accuracy trade-offs for modern convolutional object detectors", 2017

38. Efficient Detection Architectures
Jonathan Huang et al., "Speed/accuracy trade-offs for modern convolutional object detectors", 2017

39. Efficient Segmentation Architectures
ICNet (Image Cascade Network)

40. Tricks while designing your own network
• Dilated convolutions: great for segmentation, or when the target object occupies a large area of the image
• Replace NxN convolutions with Nx1 followed by 1xN
• Depthwise separable convolutions (e.g., MobileNet)
• Inverted residual blocks (e.g., MobileNetV2)
• Replace large filters with multiple small filters: a 5x5 is slower than a 3x3 followed by a 3x3

41. Design considerations for custom architectures: small filters
Replace large 5x5 and 7x7 convolutions with stacks of 3x3 convolutions.
Replace NxN convolutions with a stack of 1xN and Nx1.
Fewer parameters ☺ Less compute ☺ More non-linearity ☺
Three layers of 3x3 convolutions >> one layer of 7x7 convolution. Better, Faster, Stronger.
Andrej Karpathy, CS-231n Notes, Lecture 11
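A minimal Keras sketch of both substitutions; the 64-filter width is an arbitrary choice for illustration:

import tensorflow as tf

# A 7x7 receptive field from three stacked 3x3 convs:
# 3 * (3*3) = 27 weights per input/output channel pair instead of 7*7 = 49,
# with a non-linearity after every layer.
stacked_3x3 = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
])

# Spatial factorization of a 7x7 into 7x1 followed by 1x7:
# 7 + 7 = 14 weights per channel pair instead of 49.
factorized_7x7 = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (7, 1), padding="same", activation="relu"),
    tf.keras.layers.Conv2D(64, (1, 7), padding="same", activation="relu"),
])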

42. Selective training to keep networks shallow
Idea: Limit data augmentation to how your network will actually be used.
Example: For a selfie app, there is no benefit in rotating training images beyond ±45 degrees; the phone rotates the frame anyway (the approach followed by Word Lens / Google Translate).
Example: Add blur if analyzing mobile phone frames.
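For example, a hedged Keras augmentation setup for the selfie case; the parameter values are illustrative, not tuned:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Constrain augmentation to conditions the deployed app will actually see:
# selfies arrive roughly upright because the OS rotates the frame.
datagen = ImageDataGenerator(
    rotation_range=45,            # +/-45 degrees, not 180
    brightness_range=(0.5, 1.2),  # indoor/outdoor lighting variation
    zoom_range=0.1,
)
# datagen.flow_from_directory("selfies/", target_size=(224, 224))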

43. Pruning
Aim: Remove all connections with absolute weights below a threshold.
Song Han, Jeff Pool, John Tran, William J. Dally, "Learning both Weights and Connections for Efficient Neural Networks", 2015
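The core idea in a few lines of NumPy; a toy sketch, not the paper's full iterative prune-and-retrain loop:

import numpy as np

def prune_by_magnitude(weights, threshold):
    # Zero out every connection whose absolute weight is below the threshold.
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

w = np.random.randn(512, 512) * 0.05
pruned, mask = prune_by_magnitude(w, threshold=0.05)
print(f"sparsity: {1 - mask.mean():.1%}")  # fraction of connections removed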

44. Observation: Most parameters live in the fully connected layers: 96% of all parameters in AlexNet (240 MB) and 90% in VGG-16 (552 MB).

45. Pruning gets the quickest model compression without accuracy loss.
The first layer, which directly interacts with the image, is sensitive and cannot be pruned much without hurting accuracy.
[Figure: per-layer pruning results for AlexNet (240 MB) and VGG-16 (552 MB).]

46. Prune in Keras (Before)

import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

47. Prune in Keras (After)

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Magnitude-based pruning wrapper from the TF Model Optimization toolkit
# (the slide's prune.Prune shorthand).
prune = tfmot.sparsity.keras.prune_low_magnitude

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    prune(tf.keras.layers.Dense(512, activation=tf.nn.relu)),
    tf.keras.layers.Dropout(0.2),
    prune(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5,
          callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])  # required while pruning
model.evaluate(x_test, y_test)

48. Weight Sharing
Idea: Cluster weights with similar values together and store them in a dictionary.
Variants: codebook quantization, Huffman coding, HashedNets.
Cons: Needs a special inference engine; doesn't work for most applications.
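A toy sketch of the codebook idea using k-means (scikit-learn assumed; real pipelines also fine-tune the shared values and store packed indices):

import numpy as np
from sklearn.cluster import KMeans

# Quantize a weight matrix to 16 shared values: store 4-bit indices plus a
# 16-entry codebook instead of one 32-bit float per weight.
weights = np.random.randn(512, 512).astype(np.float32)

kmeans = KMeans(n_clusters=16, n_init=10).fit(weights.reshape(-1, 1))
codebook = kmeans.cluster_centers_.ravel()        # 16 shared weight values
indices = kmeans.labels_.reshape(weights.shape)   # 4-bit index per weight

reconstructed = codebook[indices]                 # dictionary lookup at inference
print(np.abs(weights - reconstructed).mean())     # quantization error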

49. Filter Pruning: ThiNet
Idea: Discard a whole filter if it is not important to the predictions.
Advantages:
• No change in architecture, other than thinning the filters per layer
• Can be further compressed with other methods
Just like feature selection, select which filters to discard. Possible greedy criteria:
• Absolute weight sum of the entire filter closest to 0
• Average percentage of zeros in the outputs
• ThiNet: collect statistics on the output of the next layer
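A toy NumPy sketch of the first criterion, ranking filters by the L1 norm of their weights; ThiNet itself ranks filters using next-layer statistics instead:

import numpy as np

kernel = np.random.randn(3, 3, 64, 128)             # HWIO layout: 128 output filters

l1_per_filter = np.abs(kernel).sum(axis=(0, 1, 2))  # one score per output filter
keep = np.argsort(l1_per_filter)[-96:]              # drop the 32 weakest filters

thinned = kernel[:, :, :, sorted(keep)]             # the layer is now "thinner"
print(thinned.shape)                                # (3, 3, 64, 96)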

50. Quantization
Reduce precision from 32 bits to 16 bits or fewer. Use stochastic rounding for best results.
In practice:
• Ristretto + Caffe: automatic network quantization that finds a balance between compression rate and accuracy
• Apple Metal Performance Shaders automatically quantize to 16 bits
• TensorFlow has 8-bit quantization support
• gemmlowp: low-precision matrix multiplication library
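As a concrete example of TensorFlow's 8-bit support, a minimal TensorFlow Lite conversion sketch (assumes TF 2.x; Optimize.DEFAULT quantizes the weights):

import tensorflow as tf

# Convert a trained Keras model to TensorFlow Lite with default quantization.
model = tf.keras.applications.MobileNet(weights="imagenet")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables 8-bit weight quantization
tflite_model = converter.convert()

with open("mobilenet_quant.tflite", "wb") as f:
    f.write(tflite_model)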

51. Quantizing CNNs in Practice: reducing CoreML models to half size

import coremltools

# Load a model, lower its weight precision to fp16, then save the smaller model.
model_spec = coremltools.utils.load_spec('model.mlmodel')
model_fp16_spec = coremltools.utils.convert_neural_network_spec_weights_to_fp16(model_spec)
coremltools.utils.save_spec(model_fp16_spec, 'modelFP16.mlmodel')

52. Quantizing CNNs in Practice: reducing CoreML models to an even smaller size
Choose the number of bits and the quantization mode.

import coremltools
from coremltools.models.neural_network.quantization_utils import quantize_weights, compare_models

model = coremltools.models.MLModel('model.mlmodel')
quantized_model = quantize_weights(model, 8, 'linear')    # bits: 1, 2, 4, or 8
quantized_model.save('quantizedModel.mlmodel')
compare_models(model, quantized_model, './sample_data/')  # check accuracy on sample images

Quantization modes: "linear", "linear_lut", "kmeans_lut", "custom_lut" (lut = lookup table).

53. Binary Weighted Networks
Idea: Reduce the weights to -1 and +1.
Speedup: The convolution operation can be approximated by only summation and subtraction.
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks"
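The weight approximation in a few lines of NumPy; a toy sketch, since the real method computes one scaling factor per filter and keeps the binarization in the training loop:

import numpy as np

# XNOR-Net-style weight binarization: W is approximated by alpha * sign(W),
# where alpha is the mean absolute weight.
W = np.random.randn(3, 3, 64)

alpha = np.abs(W).mean()          # scaling factor
B = np.sign(W)                    # binary weights in {-1, +1}

approx = alpha * B
print(np.abs(W - approx).mean())  # approximation error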


56. XNOR-Net
Idea: Reduce both weights and inputs to -1 and +1.
Speedup: The convolution operation can be approximated by XNOR and bit-count operations.
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks"
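A toy NumPy sketch of why this works: the dot product of two {-1, +1} vectors is recoverable from XNOR plus a bit count:

import numpy as np

a = np.sign(np.random.randn(64)).astype(np.int8)
b = np.sign(np.random.randn(64)).astype(np.int8)

# Encode -1 -> 0 and +1 -> 1; XNOR = NOT(XOR) marks matching positions.
a_bits, b_bits = (a > 0), (b > 0)
matches = np.count_nonzero(~(a_bits ^ b_bits))

# dot(a, b) = matches - mismatches = 2 * matches - n
dot_via_xnor = 2 * matches - len(a)
print(dot_via_xnor == int(a @ b))  # True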
