Deep Learning in the Connected Kitchen
or “Launching a Computer Vision program in a new vertical”
Hristo Bojinov, CTO
Company Vision
The Problem
❖ Food ↔ People disconnect
❖ Not-so-smart “smart kitchen”
❖ Food info not available, not actionable
What We Do
Food, personalization, technology
“Give food a voice” (⇒ Computer Vision is essential)
Icons made by Madebyoliver, Popcorn Arts, Freepik from www.flaticon.com are licensed by CC 3.0 BY
Computer Vision at Innit
Helps us understand users
❖ Inventory, behaviors, multi-sensor fusion, market analytics
❖ And, build a delightful user experience
Applications in storage and processing
❖ Recognize and act on food state
❖ Visible light, depth, IR
Program Logistics
Multi-site program (HQ, academia)
Food Recognition service (AWS)
❖ G2 instance backend (blend of CPU and GPU workload)
❖ Frontend orchestrates auto and manual processing
❖ Service API for 3rd party use
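The slide says the frontend orchestrates automatic and manual processing but does not say how the choice is made. One plausible reading, sketched below purely as an assumption, is a confidence gate: accept the model's answer when it is confident, otherwise hand the image to a person. The names `Prediction`, `route`, and `AUTO_ACCEPT_THRESHOLD`, and the 0.8 cut-off, are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    label: str
    confidence: float  # model score in [0, 1]


# Hypothetical cut-off; the slides do not state how the frontend decides.
AUTO_ACCEPT_THRESHOLD = 0.8


def route(predictions: List[Prediction],
          threshold: float = AUTO_ACCEPT_THRESHOLD) -> str:
    """Return "auto" if the best prediction is confident enough, else "manual"."""
    if not predictions:
        # Nothing recognized: a person has to look at the image.
        return "manual"
    best = max(predictions, key=lambda p: p.confidence)
    return "auto" if best.confidence >= threshold else "manual"
```

A gate like this would also let manually reviewed images flow back into the training set, which fits the deck's later "Data is King!" slide, though the talk does not confirm that loop.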
CV Tech: Food Recognition System
CV Tech: Food Recognition System
Data is King!
CV Tech: Object Detection Stage
CV Tech: Object Detection Stage
DetectNet
➔ Easy setup and initial training
➔ Python layers, “low resolution”
Faster-RCNN
➔ Multi-phase training/tuning
➔ High resolution & recall 😁
DeepMask & SharpMask
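The comparison above cites recall as a strength of Faster-RCNN. As a reminder of what that metric measures for a detector, here is a minimal sketch that counts a ground-truth box as found when some unused detection overlaps it with intersection-over-union (IoU) at or above a threshold. The 0.5 threshold, the greedy one-to-one matching, and the function names are illustrative assumptions, not the evaluation protocol used in the talk.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth: List[Box], detections: List[Box],
           iou_threshold: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by a distinct detection.

    Matching is greedy (first come, best overlap), which is simple but
    not guaranteed to be the optimal assignment.
    """
    if not ground_truth:
        return 0.0  # recall is undefined here; 0.0 is this sketch's convention
    unused = list(range(len(detections)))
    found = 0
    for gt in ground_truth:
        best_idx, best_iou = -1, 0.0
        for idx in unused:
            overlap = iou(gt, detections[idx])
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx >= 0 and best_iou >= iou_threshold:
            found += 1
            unused.remove(best_idx)  # each detection may match only once
    return found / len(ground_truth)
```

Small or tightly packed food items are the cases where a low-resolution detector tends to miss boxes, which is one way to read the "low resolution" versus "high resolution & recall" contrast on the slide.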
CV Tech: Classification Stage
CV Tech: Classification Stage
Controlled scene layout ⇒ precision
In-house data collection and tools
Command-line → DIGITS
AlexNet → VGG
CV Tech: Product DB Image Retrieval
CV Tech: Product DB Image Retrieval
❖ Exact product (or attribute) matching
❖ KAZE descriptors (GPU acceleration WIP)
➢ Current need to balance CPU/GPU
➢ Order-of-magnitude acceleration
❖ Hierarchical analysis in the pipeline
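To make the retrieval step concrete, the sketch below shows one common way to match local float descriptors such as KAZE: nearest-neighbour search under L2 distance with a ratio test, then picking the product with the most surviving matches. It assumes descriptors were already extracted elsewhere (for example with OpenCV's `cv2.KAZE_create()`). The scoring rule, the 0.75 ratio, and all function names are assumptions for illustration, not the pipeline described in the talk.

```python
import math
from typing import Dict, List, Sequence

Descriptor = Sequence[float]


def l2(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_good_matches(query: List[Descriptor],
                       candidate: List[Descriptor],
                       ratio: float = 0.75) -> int:
    """Count query descriptors whose nearest neighbour passes the ratio test."""
    good = 0
    for q in query:
        dists = sorted(l2(q, c) for c in candidate)
        # Keep a match only if the best distance is clearly smaller
        # than the second best, i.e. the match is unambiguous.
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            good += 1
    return good


def best_product(query: List[Descriptor],
                 database: Dict[str, List[Descriptor]],
                 ratio: float = 0.75) -> str:
    """Return the product whose stored descriptors best match the query.

    Raises ValueError if the database is empty.
    """
    return max(database,
               key=lambda name: count_good_matches(query, database[name], ratio))
```

The brute-force loop is quadratic in the number of descriptors and runs entirely on the CPU, which hints at why the slide lists GPU acceleration for this stage as work in progress.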
CV Research: Training on Synthetic Sets
CV Research: Text Extraction
In a nutshell...
❖ Focus on differentiated capabilities, in the food space
❖ Tie in with all stages of human ↔ food interaction
❖ Fusion of images & other “sensors”
❖ GPU tech a strong enabler
Takeaways
❖ Objectives → domain constraints (good!)
❖ Sources of initial training+test data; build tools
❖ Hardware (local experiments OK, cloud for serving)
❖ Software (don’t get tied to a framework; abstract away)
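The last takeaway, abstracting away the framework, can be illustrated with a thin interface that the rest of the pipeline depends on, so that swapping the underlying deep learning framework only means writing a new adapter. This is a minimal sketch under that assumption; `FoodClassifier`, `FixedScoreClassifier`, and `top_label` are hypothetical names, and the stub backend stands in for a real framework-specific adapter.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Scores = List[Tuple[str, float]]  # (label, score) pairs


class FoodClassifier(ABC):
    """Framework-neutral interface that pipeline code is written against."""

    @abstractmethod
    def predict(self, image_bytes: bytes) -> Scores:
        """Return (label, score) pairs for one encoded image."""


class FixedScoreClassifier(FoodClassifier):
    """Stand-in backend returning canned scores.

    A real adapter would wrap a model from whichever framework is in use.
    """

    def __init__(self, scores: Dict[str, float]) -> None:
        self._scores = dict(scores)

    def predict(self, image_bytes: bytes) -> Scores:
        return sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)


def top_label(classifier: FoodClassifier, image_bytes: bytes) -> str:
    """Pipeline code sees only the interface, never the framework."""
    scores = classifier.predict(image_bytes)
    if not scores:
        raise ValueError("classifier returned no scores")
    return max(scores, key=lambda pair: pair[1])[0]
```

With this shape, a move such as the AlexNet → VGG or command-line → DIGITS transitions mentioned earlier would touch only the adapter class, while callers of `top_label` stay unchanged.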
We are hiring! 🚁 hristo@innit.com
About Innit
❖ Inform and elevate the interaction between people and food
❖ 4+ years in the making, substantial funding, IP & tech
❖ Pirch SOHO, ShopWell
About the Speaker
❖ Embedded & Security
❖ Android, Computer Vision
❖ Computer technology at Innit