Prototyping Vision-Based Classifiers in Constrained Environments
Ted Hromadka (1) and Cameron Hunt (2)
(1) Integrity Applications Incorporated ℠, (2) SOFWERX (DEFENSEWERX, Inc.)
Presented at GTC 2018
Integrity Applications Incorporated • 15020 Conference Center Drive, Chantilly, VA 20151 • (703) 378-8672 • www.integrity-apps.com
UNCLASSIFIED
Company Overview =
• Capabilities
  – image processing / computer vision applications for US government customers
• Number of Employees
  – around 700, most with MS/PhD
• Main locations
  – Chantilly, VA
  – Dayton, OH
  – Carlsbad, CA
  – Kihei, HI
[Map of IAI office locations, work locations, and future work locations. Labels: Seattle, New England, Ann Arbor, Valley Forge, Dayton, Denver, St. Louis, DC Area, Colorado Springs, Albuquerque, Las Cruces, So. CA Area, IAI HQ (Chantilly), LA, Ft. Belvoir, El Segundo, Charlottesville, Carlsbad, Dahlgren, San Diego, PAX River]
SOFWERX =
• SOFWERX performs collaboration, ideation, and facilitation with the best minds of Industry, Academia, and Government. SOFWERX can also conduct rapid prototyping and rapid proofs of concept from ideation discovery.
• Run by DEFENSEWERX (formerly the Doolittle Institute)
• Located in Tampa, FL
Background – requirement to track usage of tank ammunition =
• Commanders asked for an automated means of tracking and reporting the firing of the Abrams main gun
  – Location
  – Timestamp
  – Type of ammunition used
• Various other means of tracking the ammunition were unacceptable due to wear and tear, etc.
• Computer vision solution
Context =
• Loader (1) pulls 120 mm round from cabinet (5) and loads it into main breech (3)
[Figure: numbered diagram of the tank interior. Source: unattributed on multiple websites, appears to be scanned pages from a book]
Concept =
• Vision-based classifier
• Camera
• Processor
• GPS and SATCOM links
• No impact on tank’s systems
• Mounted somewhere inside cabin
Collecting Training Data =
• Raspberry Pi 2B (900 MHz)
• 1 GB RAM
• RPi camera board v2
  – 8 MP = 3280x2464
• 5V USB battery pack (12 hours)
• Python script to take and write images to SD card as quickly as possible (~1 Hz)
[Image source: adafruit.com]
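The capture script itself is not shown in the talk. A minimal sketch of such a loop, assuming a hypothetical `capture(path)` callable that wraps the camera library, might look like the following. The names `capture_loop` and `fake_capture` are illustrative only, and the stand-in capture function exists so the sketch runs off the Pi.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, List


def capture_loop(capture: Callable[[Path], None], out_dir: Path,
                 n_frames: int, min_period_s: float = 0.0) -> List[Path]:
    """Call capture(path) n_frames times, one timestamped file per frame."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(n_frames):
        start = time.monotonic()
        path = out_dir / f"frame_{i:06d}_{time.time():.3f}.jpg"
        capture(path)
        written.append(path)
        # Sleep only if the capture finished faster than the requested period;
        # with min_period_s = 0 the loop runs as fast as the camera and SD card allow.
        remaining = min_period_s - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
    return written


def fake_capture(path: Path) -> None:
    """Hypothetical stand-in for a real camera call."""
    path.write_bytes(b"\xff\xd8\xff\xd9")  # JPEG start and end markers only


# On the Pi, the callable would wrap the camera library instead, for example
# (assuming the picamera package): lambda p: camera.capture(str(p))
frames = capture_loop(fake_capture, Path(tempfile.mkdtemp()), n_frames=3)
print(len(frames), "frames written")
```

Putting the capture time in the file name keeps each frame matchable to other sensors (for example the GoPro video) during post-processing.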
Collecting Data - RPi =
Collecting Data – static photos =
• Compact Nikon digital camera
• Resolution 4610 x 3460
• Slightly over 1000 photos per class
• Wide range of background scenes
Collecting Data =
• Day 2: added GoPro to tank commander’s GPS extension eyepiece
• HD video can be matched to RPi quality in post-processing
Early network =
• Initial comparison runs of Caffe and TensorFlow on stock GoogLeNet (Inception v1)
  – Caffe trained using DIGITS software; TF trained using Python
  – Remainder of this talk will only discuss TF
• Initially treated as Image Classification
  – 4 classes
  – No need to label bounding boxes
  – Runs faster than object detection
  – We never have more than one object in the scene
• Trained on a DevBox-1 (4x TITAN X)
Why use old version of GoogLeNet? =

Network        MAC (million)   Parameters (million)
Inception v1            1550                    6.8
Inception v2            3800                    (?)
Inception v3            5000                     23
VGG 16                 15300                    138
ResNet-50               3900                   25.5
AlexNet                  720                     60
Early results (sanity check) =
• Model was confidently wrong
• Averaged results of 25% mini-batches:

                        TRUTH
PREDICTED   M829A1   M830   M830A1   M1028   TOTAL   ACC %
M829A1         270      0        0       0     270   “100%”
M830           265      0        4       0     269   0%
M830A1         267      0        3       0     270   1%
M1028            0      0        0     270     270   100%
TOTAL          802      0        7     270    1079
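The ACC % column can be reproduced from the counts as the diagonal entry divided by the row total. The sketch below only redoes that arithmetic on the numbers from the slide; the variable names are illustrative, and the overall accuracy is derived here rather than quoted from the talk.

```python
classes = ["M829A1", "M830", "M830A1", "M1028"]

# Counts copied from the slide, one row per class in the order above.
counts = [
    [270, 0, 0, 0],
    [265, 0, 4, 0],
    [267, 0, 3, 0],
    [0, 0, 0, 270],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
correct = [counts[i][i] for i in range(len(classes))]

# Per-row accuracy: diagonal count over row total.
row_acc = [c / t for c, t in zip(correct, row_totals)]
overall_acc = sum(correct) / sum(row_totals)

for name, acc in zip(classes, row_acc):
    print(f"{name:8s} {acc:6.1%}")
print(f"overall  {overall_acc:6.1%}")
```

The column totals make the failure mode visible: 802 of the 1079 counts fall in the M829A1 column and none in the M830 column, so two classes score 100% while the other two are almost never right.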
Augmented training data =
• CATALYST tool
  – Noise background
  – Transparent on top of “tank scene” background
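The internals of the CATALYST tool are not described in the talk. As a rough illustration of the general idea only (a transparent cut-out pasted over a noise or scene background), a minimal alpha-compositing sketch in NumPy could look like this. The function names and the tiny 4x4 images are assumptions made for the example, not part of CATALYST.

```python
import numpy as np


def noise_background(height: int, width: int, seed: int = 0) -> np.ndarray:
    """Uniform random RGB noise, shape (height, width, 3), dtype uint8."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)


def composite(foreground_rgba: np.ndarray, background_rgb: np.ndarray) -> np.ndarray:
    """Alpha-blend an RGBA cut-out over an RGB background of the same size."""
    fg = foreground_rgba[..., :3].astype(np.float32)
    alpha = foreground_rgba[..., 3:4].astype(np.float32) / 255.0
    bg = background_rgb.astype(np.float32)
    out = alpha * fg + (1.0 - alpha) * bg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Toy foreground: an opaque 2x2 block in the middle of a transparent 4x4 image.
fg = np.zeros((4, 4, 4), dtype=np.uint8)
fg[1:3, 1:3] = (200, 50, 50, 255)

bg = noise_background(4, 4)
augmented = composite(fg, bg)
```

Swapping `noise_background` for a photo of the tank interior gives the second variant listed above; in both cases the class label carries over unchanged from the cut-out.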
Re-training baseline model =
• Still treating as image classification
• ~10,000 images per class
• Switched from DIGITS to manual
Misclassified images =
• No longer deciding that everything is an M829A1
• Mistakes now due to orientation, possibly also due to shadowing
Better results =
• 99% accuracy on synthetic imagery, 76% on “action shots”
  – Need to incorporate real imagery in next model
• Good enough to switch focus to deployment on Raspberry Pi
• To build TF on RPi, relied heavily on the excellent guide at: https://github.com/samjabrahams/tensorflow-on-raspberry-pi/blob/master/GUIDE.md
• Makefile needed for RPi can be found at: https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/contrib/makefile/tf_op_files.txt
RPi struggled to keep up =
• Need to catch a specific 3 s critical window over many hours of movement in scene
• Evaluated several approaches
  – Frame grabs
    • High accuracy, low false positives, but too slow (1/4 fps)
  – Darknet/YOLO video
    • Could not run it usefully on RPi
  – Possibility of hardware trigger from cabinet door opening: discarded due to complexity
  – Just sending imagery to server for processing there
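To see why 1/4 fps is too slow for a 3 s window, it helps to count how many frames can land inside it. The sketch below assumes evenly spaced frames and a window that starts at an arbitrary moment; `frames_in_window` is an illustrative helper, not something from the talk.

```python
import math


def frames_in_window(window_s: float, fps: float):
    """Return (average, worst-case) frame counts for evenly spaced frames."""
    average = window_s * fps
    # In the worst alignment, only the whole number of frame periods
    # that fit inside the window is guaranteed.
    worst_case = math.floor(average)
    return average, worst_case


for fps in (0.25, 0.5, 1.0):
    avg, worst = frames_in_window(3.0, fps)
    print(f"{fps:.2f} fps: {avg:.2f} frames on average, {worst} guaranteed")
```

At 1/4 fps the window sees 0.75 frames on average and none in the worst case, so some loading events would be missed entirely regardless of classifier accuracy.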
TF model_pruning =
• Attempted to simplify network down to an RPi level
• Exploit sparsity of large model
• TensorFlow model_pruning
  – Threshold & mask
  – Prune, train(100), repeat
• pb reduced from 87.4 MB to 22.4 MB
• Sacrifice ~3% model accuracy for ~60% speedup
• Still only getting ~1/2 fps on RPi
https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/model_pruning/Pruning
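The "threshold & mask" and "prune, train, repeat" steps can be illustrated without TensorFlow. The following is a plain-Python sketch of magnitude pruning on a flat weight list, not the `tf.contrib.model_pruning` API; `magnitude_mask`, `prune_iteratively`, and the no-op `fake_train` are hypothetical names, and real retraining (for example 100 steps per round) would replace `fake_train`.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes roughly the `sparsity` fraction of
    smallest-magnitude weights (ties at the threshold are also pruned)."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return [1] * len(weights)
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0 if abs(w) <= threshold else 1 for w in weights]


def prune_iteratively(weights, sparsity_steps, train_fn):
    """Alternate pruning and retraining over an increasing sparsity schedule."""
    mask = [1] * len(weights)
    for sparsity in sparsity_steps:
        mask = magnitude_mask(weights, sparsity)
        weights = [w * m for w, m in zip(weights, mask)]
        # Retraining lets the surviving weights compensate for the pruned ones.
        weights = train_fn(weights, mask)
    return weights, mask


def fake_train(weights, mask):
    """Placeholder for real training: keeps pruned weights at zero."""
    return [w * m for w, m in zip(weights, mask)]


w0 = [0.05, -0.8, 0.3, -0.02, 0.6, 0.1, -0.4, 0.01]
pruned, final_mask = prune_iteratively(w0, [0.25, 0.5], fake_train)
print(pruned)
print(final_mask)
```

Raising sparsity in stages rather than in one cut is what the "repeat" step refers to: each round removes only the weights that are small after the network has re-adapted.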
MobileNets =
• Very different approach
• Small-dense models vs large-sparse [pruned] model (same number of calcs)
• Depthwise-separable convolutions: a depthwise convolution followed by a 1x1 pointwise convolution
• ≈ 1/8 the MAC of a regular convolution (for 3x3 kernels)
• Depending on settings for W (width multiplier) and resolution, pb size ranged from 16.7 MB down to 1.9 MB (!)
• Peak accuracy was still around 75%
https://arxiv.org/pdf/1704.04861.pdf
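The roughly 1/8 figure follows from counting multiply-accumulates per layer. A short sketch using the cost formulas from the MobileNets paper is shown below; the layer shape (3x3 kernel, 256 input and output channels, 14x14 feature map) is an arbitrary example chosen here, not a value from the talk.

```python
def standard_conv_macs(k, c_in, c_out, h, w):
    """MACs for a k x k standard convolution on an h x w output map."""
    return k * k * c_in * c_out * h * w


def separable_conv_macs(k, c_in, c_out, h, w):
    """MACs for a k x k depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise


k, c_in, c_out, h, w = 3, 256, 256, 14, 14
std = standard_conv_macs(k, c_in, c_out, h, w)
sep = separable_conv_macs(k, c_in, c_out, h, w)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, ratio: {ratio:.3f}")
```

Because the ratio is 1/c_out + 1/k², it approaches 1/9 for 3x3 kernels as the channel count grows, which is where the "about 1/8" rule of thumb comes from.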
MobileNets tradeoff space =
[Plot: size on disk (MB) vs. resolution and W]
• Resolution only affected MAC, not parameter count; the width multiplier W changes both
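How the two MobileNets knobs act on a single depthwise-separable layer can be sketched as follows, using the paper's definitions: the width multiplier scales channel counts and the resolution multiplier scales the feature-map side. The layer shape is the same arbitrary example as before, biases and batch-norm parameters are ignored, and `separable_layer_cost` is an illustrative helper.

```python
def separable_layer_cost(k, c_in, c_out, size, alpha=1.0, rho=1.0):
    """Return (parameters, MACs) for one depthwise-separable layer.

    alpha: width multiplier (scales channel counts)
    rho:   resolution multiplier (scales the feature-map side length)
    """
    ci, co = int(alpha * c_in), int(alpha * c_out)
    side = int(rho * size)
    params = k * k * ci + ci * co   # depthwise + pointwise weights
    macs = params * side * side     # each weight is used once per output position
    return params, macs


base = separable_layer_cost(3, 256, 256, 14)
half_res = separable_layer_cost(3, 256, 256, 14, rho=0.5)
half_width = separable_layer_cost(3, 256, 256, 14, alpha=0.5)

print("baseline        :", base)
print("half resolution :", half_res)
print("half width      :", half_width)
```

Since the stored model holds weights only, its size on disk tracks the parameter count, which in this sketch moves with the width multiplier and stays fixed when only the resolution changes.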